Cloud Backup for NAS Shootout, Part I

If further proof were needed, last month’s story about the companies that lost irreplaceable data in a literal towering inferno shows once again that everyone with important files needs offsite backups.

I don’t mean to make fun of the hosting provider or the companies who got caught without backups in multiple locations (OK, maybe them, a little bit). Keeping your data safe from a real-world catastrophe—whether you’re managing business infrastructure or New Folders (1) through (23) on your desktop—is hard!

This is the first in a series of posts documenting a project I did to start backing up my personal network-attached storage (NAS) server to cloud storage. This post outlines my requirements for an effective cloud backup strategy and some challenges I’ve run into. Future posts will evaluate different backup applications based on my testing and describe the technical implementation in more depth.

One view I’d like to promote is that everyone has important data, and it’s worth the effort to plan for disaster recovery. In my opinion, the considerations and risks are the same whether you’re an individual or a large organization. I hope these posts can be a resource for anyone looking for a reliable backup solution.

For detailed steps, scripts, and results, you can follow along with the GitHub repo that accompanies this series of articles here: neapsix/cloudbackup-benchmarks.

It’s a Series of Tubes

My biggest challenge in backing up data over the internet is uploading a lot of data with only a little upload bandwidth. My cable internet connection from Charter is rated for 300 Mbit/s down and 30 Mbit/s up. In practice, I usually see a little less. The only other option in my neighborhood is a DSL connection, so at least I don’t have a data cap.

Backing up a terabyte of data on my connection would take three days using 100% of my upload bandwidth the entire time. With incremental backups, I don’t need to upload that much very often, but because the connection is not a big truck, the incremental process also needs to be efficient to avoid interfering with other traffic on the network.
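To sanity-check that estimate, here’s the back-of-the-envelope arithmetic as a quick script. It assumes 1 TB means 10^12 bytes and the full rated 30 Mbit/s uplink with no overhead, so it’s a best case:

```shell
#!/bin/sh
# Estimate how long a 1 TB upload takes at 30 Mbit/s.
bits=$((8 * 10**12))        # 1 TB expressed in bits
rate=$((30 * 10**6))        # 30 Mbit/s in bits per second
seconds=$((bits / rate))
hours=$((seconds / 3600))
echo "${hours} hours"       # roughly three days of saturated upstream
```

Real-world throughput is lower once you account for protocol overhead and the “a little less” that the connection actually delivers, so three days is the floor, not the ceiling.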

The other challenge for me personally is puzzling out what backup client best meets my needs while being stable, trusted, and reasonably convenient to use and administer. There are a lot of options, and each has benefits and drawbacks.

Requirements and Considerations

I identified the following requirements for a good cloud backup process:

The following criteria are nice-to-haves:

Of course, the backup plan must have well-documented procedures and defined recovery point and recovery time objectives (RPO and RTO), and those procedures should be tested regularly.

Note that I see cloud backups as a supplement to other methods. For example, a cloud backup might be the third copy in a 3-2-1 strategy (three copies, two local, and one offsite). I already run local backups by copying my data to a pair of USB hard drives using ZFS replication. Keeping one of the two drives at my office gets me an offsite backup, but the RPO with that strategy depends on my remembering to swap the drives often.
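For readers unfamiliar with ZFS replication, the local-backup step above amounts to a snapshot-and-send cycle. This is an illustrative sketch only: the pool and dataset names (`tank/data`, `usbbackup`) and snapshot names are assumptions, not my actual layout.

```shell
# Take a snapshot of the dataset to back up.
zfs snapshot tank/data@2024-06-01

# First run: full replication to the pool on the USB drive.
zfs send tank/data@2024-06-01 | zfs receive usbbackup/data

# Later runs: send only the changes between two snapshots.
zfs snapshot tank/data@2024-06-08
zfs send -i tank/data@2024-06-01 tank/data@2024-06-08 \
    | zfs receive usbbackup/data
```

The incremental `zfs send -i` is what keeps the drive-swap routine fast: only blocks changed since the previous snapshot cross the USB cable.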

Next Steps

With my use case and end-state goal (the why and the what) defined, the next task was to figure out how best to implement them. Part II will list the options I considered and the steps I took to test them.