matt.boats

Backups

If it's worth saving, it's worth over saving

Over the past year, I started self-hosting backups of my photos, music, computers, phones, and projects. This data, spanning over five years of my life, comprises a meager ~500GB.

My motivations for self hosting are:

Here is (maybe part 1 of) a high-level overview of how all of this works.

3-2-1 backup rule

The general rule is to keep 3 copies of your data, on 2 different types of storage media, with 1 copy offsite.

I've accomplished this with:

Setup

MacBook host

I use an old MacBook Pro laptop as a host for everything. This makes phone and laptop backups easy since all of my devices are made by Apple.

Storage Drives

I have 3 drives connected to the MacBook host:

These were just hard drives I had lying around (free!), and I started using BTRFS since it looked slick and supported copy-on-write snapshotting.
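For illustration, here is a minimal sketch of how a read-only, timestamped BTRFS snapshot could be scripted. The post doesn't say how snapshots are actually taken, so the function names and the paths in the usage note are hypothetical; the only real interface used is the `btrfs subvolume snapshot -r` command.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import PurePosixPath


def snapshot_command(subvolume: str, snapshot_dir: str, now: datetime) -> list[str]:
    """Build the command for a read-only snapshot named after a timestamp."""
    name = now.strftime("%Y-%m-%dT%H-%M-%S")
    target = PurePosixPath(snapshot_dir) / name
    # -r makes the snapshot read-only, so it can't be modified by accident.
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, str(target)]


def take_snapshot(subvolume: str, snapshot_dir: str) -> None:
    """Run the snapshot command (needs root and a mounted BTRFS filesystem)."""
    cmd = snapshot_command(subvolume, snapshot_dir, datetime.now(timezone.utc))
    subprocess.run(cmd, check=True)
```

Something like `take_snapshot("/mnt/main/photos", "/mnt/main/.snapshots")` would then create a new snapshot. Because BTRFS is copy-on-write, a snapshot initially shares all of its blocks with the source and only grows as the source changes.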

Storage VM

This is the main part of the server and is running as a VM on the host machine.

I was hoping to use a Docker container for this, but I ran into issues with raw access to the drives: they had to be mounted by the host OS first, which isn't possible for BTRFS on a Mac.

Services

I use Tailscale as a VPN so I always have access to the data.

There are a couple of git servers hosted here too for various local projects.

For status updates, I generate a page every few minutes which contains usage information, drive health, and CPU/memory usage.
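A status page like this can be produced with a small script run on a timer. The sketch below is an assumption about the general shape, not the actual generator: `format_bytes` and `render_status` are made-up names, it covers only disk usage and load average, and drive health (for example SMART data) and memory usage are left out because they need platform-specific tools.

```python
import os
import shutil
from datetime import datetime, timezone
from html import escape


def format_bytes(n: float) -> str:
    """Render a byte count with binary units, e.g. 1536 -> '1.5 KiB'."""
    units = ("B", "KiB", "MiB", "GiB", "TiB")
    for unit in units[:-1]:
        if n < 1024:
            return f"{n:.1f} {unit}"
        n /= 1024
    return f"{n:.1f} {units[-1]}"


def render_status(mount_points: list[str]) -> str:
    """Return a small HTML page with disk usage per mount and the load average."""
    rows = []
    for mount in mount_points:
        usage = shutil.disk_usage(mount)
        percent = 100 * usage.used / usage.total if usage.total else 0.0
        rows.append(
            f"<tr><td>{escape(mount)}</td><td>{format_bytes(usage.used)}</td>"
            f"<td>{format_bytes(usage.total)}</td><td>{percent:.0f}%</td></tr>"
        )
    load1, load5, load15 = os.getloadavg()  # available on Unix-like systems only
    generated = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return (
        "<html><body><h1>Storage status</h1>"
        f"<p>Generated {generated}</p>"
        f"<p>Load average: {load1:.2f} {load5:.2f} {load15:.2f}</p>"
        "<table><tr><th>Mount</th><th>Used</th><th>Total</th><th>%</th></tr>"
        + "".join(rows)
        + "</table></body></html>"
    )


if __name__ == "__main__":
    # "/" is a placeholder; a real setup would list the storage mount points.
    print(render_status(["/"]))
```

Running this from cron or a systemd timer every few minutes and redirecting the output to a file served over the tailnet would give a page that refreshes on its own schedule.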

Immich

Immich is an open source image hosting service. This was super easy to set up and has been working very well. It provides features on par with Google Photos and Apple Photos (face/object detection, live photos, map information).

This runs with docker compose inside of the Storage VM and I'm able to connect to the open port inside my tailnet.

Make sure to keep snapshots or other backups, since the Immich project itself says:

"Do not use it as the only way to store your photos and videos"

Navidrome

Navidrome is a good open source option for music streaming. It uses the Subsonic protocol for querying album/music data, so it works with a good number of existing front ends for the web and iOS.

This also runs as a set of docker containers in the VM.

Time Machine

I use the host MacBook to do Time Machine backups onto an external drive.

I tried Netatalk as an open source Time Machine destination inside the VM, but ran into pretty consistent issues.

Instead, I periodically (manually) run rsync to get backups from the Time Machine disk to the main drive. This copy should only transfer what is new, so I don't have huge duplicates stored there.
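As a sketch of that step, the helper below builds an rsync invocation and optionally runs it as a dry run first. The function names and the paths in the usage note are hypothetical, and the exact flags used in this setup aren't stated. By default, rsync skips files whose size and modification time already match at the destination, which is why repeated runs only transfer what is new.

```python
import subprocess


def rsync_command(source: str, destination: str, dry_run: bool = False) -> list[str]:
    """Build an rsync command that mirrors the contents of source into destination."""
    # --archive recurses and preserves permissions, timestamps, and symlinks.
    cmd = ["rsync", "--archive"]
    if dry_run:
        # --dry-run walks the trees without copying anything.
        cmd.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def sync(source: str, destination: str, dry_run: bool = False) -> None:
    """Run rsync and raise if it exits with an error."""
    subprocess.run(rsync_command(source, destination, dry_run), check=True)
```

Calling `sync("/Volumes/TimeMachine", "/mnt/main/time-machine", dry_run=True)` first is a cheap way to check that the source and destination are the right way around before copying for real.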

One day I will ditch Time Machine altogether and just run rsync from the other Macs I have.

Phone

This is similar to the Time Machine backup: I manually run backup copies of my phone using Finder and then rsync that data over to the main drives.

Encrypted Offsite Backups

Keeping data offsite requires encryption. I use LUKS with a detached header, so the drive itself just looks like random bytes of data. The keys and headers are stored on a USB drive with a hardware-based password.
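A rough sketch of what this could look like with cryptsetup follows. The function names are made up, and the device, header, and key file paths passed in are whatever the setup uses; nothing here is taken from the actual scripts. The relevant option is `--header`, which tells cryptsetup to keep the LUKS header in a separate file instead of at the start of the drive. The functions only build the commands, since `luksFormat` destroys whatever is on the target device.

```python
def luks_format_command(device: str, header_path: str, key_file: str) -> list[str]:
    """Build the command that initializes a device with a detached LUKS2 header."""
    return [
        "cryptsetup", "luksFormat",
        "--type", "luks2",
        "--header", header_path,   # header is written here, not to the device
        "--key-file", key_file,
        device,
    ]


def luks_open_command(device: str, header_path: str, key_file: str, name: str) -> list[str]:
    """Build the command that unlocks the device as /dev/mapper/<name>."""
    return [
        "cryptsetup", "open",
        "--type", "luks",
        "--header", header_path,   # the same detached header is needed to unlock
        "--key-file", key_file,
        device,
        name,
    ]
```

Without the header file, the drive carries no LUKS metadata at all, which also means that losing the header makes the data unrecoverable, so the header needs backups of its own.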

In addition to a drive that I keep offsite, I keep a copy in AWS. Glacier Deep Archive is the cheapest AWS storage class: storage itself costs very little, and most of the cost comes from retrieval and egress. Restores are also slow, taking anywhere from hours to a couple of days. I have yet to need a full restore from Glacier, but I've successfully performed a few test restores.
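For reference, an upload and a test restore might look roughly like the sketch below. It assumes the `s3_client` argument is a boto3 S3 client (for example `boto3.client("s3")`), and the function names, bucket, and key are hypothetical; the post doesn't say which tooling is actually used.

```python
def upload_to_deep_archive(s3_client, archive_path: str, bucket: str, key: str) -> None:
    """Upload a (pre-encrypted) archive straight into the Deep Archive storage class."""
    s3_client.upload_file(
        archive_path,
        bucket,
        key,
        ExtraArgs={"StorageClass": "DEEP_ARCHIVE"},
    )


def request_restore(s3_client, bucket: str, key: str, days: int = 7, tier: str = "Bulk") -> None:
    """Ask S3 to make an archived object temporarily readable.

    The object only becomes downloadable once the restore job finishes.
    Deep Archive accepts the "Standard" and "Bulk" tiers; Bulk is the slower
    and cheaper of the two.
    """
    s3_client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest={"Days": days, "GlacierJobParameters": {"Tier": tier}},
    )
```

Taking the client as a parameter keeps the functions easy to exercise against a stub, which is handy for rehearsing the restore path without paying for a real retrieval.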

I'll do this process once every few months: