Why are secure backups so hard?

Note: Be sure to check out the sequel to this post, about the program that will supersede the one described here and be compatible with all backup utilities.

Backing things up is important, and, luckily, there are many high-quality services geared to everyday people that are very easy to use and cheap. Unfortunately, I am not everyday people, as I am very paranoid and insist that absolutely nobody be able to see my photos of my dog and lawn. It’s a matter of privacy.

To that end, I’ve long been looking for a secure, encrypted backup service, but I haven’t managed to find a single service or tool that fulfils all of my requirements:

  • Cheap to store data on (~$30 per year for 50 GB). My dog/lawn photos aren’t that important, and I already back them up automatically to three places at home.
  • Encrypted on the computer that holds the original files. The originals are stored in plaintext, but nothing leaves that computer unencrypted.
  • Open-source. If I can’t check what it’s encrypting, when and how, it might as well not be encrypting anything.
  • Transfer-efficient. I have a slow connection, so only changed data should be transferred. Creating a full backup every so often is unacceptable, as it will take a whole day to upload.
  • Reasonably fast.
  • My files shouldn’t need to be stored encrypted. I just want to back them up.

These requirements pretty much rule out anything that currently exists. duplicity needs huge full backups from time to time, and frequently disconnected/failed when I tried to make them, so I was unable to back my files up even once. SpiderOak is close enough, but it’s not open-source. EncFS in reverse mode with rdiff-backup would pretty much be ideal, but EncFS currently has two bugs that prevent this from working. tarsnap is too expensive. Obnam is pretty damn slow.

Eventually, I got fed up with the situation, and went looking for any existing solution I could bolt encryption onto. I came across bup, which provides fast, deduplicated backups, and which sounded like an ideal target for adding encryption to.

Looking at it a bit further, I saw that it accepts a file on stdin (which can be a tarball), deduplicates it, and transmits it to the remote server for storage. Multiple users can use this to store their files without knowing anything about one another, and deduplication will still work between them.
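
To make the deduplication idea concrete, here is a minimal sketch of a content-addressed chunk store. It is not how bup is implemented: bup picks chunk boundaries with a rolling checksum, whereas this toy uses fixed-size chunks, and the names ChunkStore and CHUNK_SIZE are made up for illustration.

```python
import hashlib

# Hypothetical fixed chunk size. bup itself chooses chunk boundaries with a
# rolling checksum, which copes with insertions far better than this does.
CHUNK_SIZE = 4096


class ChunkStore:
    """Toy content-addressed store: every unique chunk is kept only once."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def put_stream(self, data):
        """Split data into chunks, store unseen ones, return the digest list."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # setdefault only stores the chunk if the digest is new.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get_stream(self, recipe):
        """Rebuild the original stream from its list of digests."""
        return b"".join(self.chunks[digest] for digest in recipe)
```

Two users who upload the same chunk end up pointing at the same stored entry, which is why deduplication can work across users who know nothing about each other.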

Adding encryption

Adding encryption to bup turned out to be fairly straightforward, since it expects a big blob. I quickly wrote a small script that accepts a list of files and outputs one big ball of those files, except they’re encrypted. There are a few constraints: because the script can’t very easily store state, ciphertexts are only unique per key. A given file will always encrypt to the same ciphertext under a given key, which is what makes per-file deduplication possible.
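
The property described above (same file plus same key always gives the same ciphertext, with no stored state) can be sketched as follows. This is not encbup's actual scheme, just a toy illustration using only the Python standard library: the nonce is derived from the key and the plaintext, and the keystream is HMAC-SHA256 in counter mode. The function names are hypothetical, and real code should use a vetted cryptography library rather than this.

```python
import hashlib
import hmac


def _subkey(key, label):
    # Derive independent subkeys so the nonce and the keystream never share a key.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key, nonce, length):
    """Toy keystream: HMAC-SHA256 over nonce + counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hmac.new(key, nonce + counter.to_bytes(8, "big"),
                            hashlib.sha256).digest())
        counter += 1
    return bytes(out[:length])


def encrypt_deterministic(key, plaintext):
    # The nonce depends only on the key and the plaintext, so nothing has to
    # be remembered between runs, and equal inputs give equal outputs.
    nonce = hmac.new(_subkey(key, b"nonce"), plaintext, hashlib.sha256).digest()[:16]
    stream = _keystream(_subkey(key, b"stream"), nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt_deterministic(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(_subkey(key, b"stream"), nonce, len(body))
    plaintext = bytes(c ^ s for c, s in zip(body, stream))
    # Recompute the nonce from the recovered plaintext; a mismatch means the
    # blob was modified or the key is wrong.
    expected = hmac.new(_subkey(key, b"nonce"), plaintext, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(nonce, expected):
        raise ValueError("ciphertext failed verification")
    return plaintext
```

The trade-off is that anyone who can see the stored ciphertexts can tell when two files encrypted under the same key are identical, which is exactly the leak that deduplication relies on.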

Thus, encbup was born:

https://github.com/skorokithakis/encbup

It’s still a very early prototype, but it can encrypt and decrypt files, hopefully correctly. I have to clean the code up a bit, because it’s very messy right now, and I also want to add an index to the end of the file, so you don’t have to download the entire thing to restore just a few files.
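
As a rough idea of what a trailing index could look like, here is a sketch under my own assumptions, not encbup's actual file format: the blobs are written back to back, followed by a JSON index of offsets and lengths, followed by the index size as an 8-byte big-endian integer. The names write_archive and read_one are made up for illustration.

```python
import io
import json
import struct


def write_archive(out, files):
    """Write blobs, then a JSON index, then the index length (8 bytes).

    `files` maps a name to its bytes (already-encrypted blobs in practice).
    """
    index = {}
    for name, blob in files.items():
        index[name] = (out.tell(), len(blob))  # where the blob starts, and its size
        out.write(blob)
    encoded = json.dumps(index).encode("utf-8")
    out.write(encoded)
    out.write(struct.pack(">Q", len(encoded)))


def read_one(archive, name):
    """Fetch a single blob by reading only the tail of the file and one range."""
    archive.seek(-8, io.SEEK_END)
    (index_len,) = struct.unpack(">Q", archive.read(8))
    archive.seek(-8 - index_len, io.SEEK_END)
    index = json.loads(archive.read(index_len))
    offset, length = index[name]
    archive.seek(offset)
    return archive.read(length)
```

With a layout like this, restoring one file needs only the last few bytes, the index, and the byte range of that file. In a real tool the index would have to be encrypted as well, since file names and sizes are sensitive too.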

For the time being, though, it has a simple test suite that’s passing, and works quite well with bup. I’d appreciate any feedback or help with it, if you’re at all interested in secure backups.
