In my last blog post I said that I had moved my backups from an external disk to Rackspace Cloud Files and promised I’d explain how.
Ok, so why bother? I had about 100 GB of data that was being backed up. I didn’t want to upload 99% of that, have my wifi go bonkers, and then have to start over (because Duplicity apparently isn’t very good at resuming). So, instead I wanted to make the initial backup to an external drive (the backup wouldn’t fit on my laptop’s hard drive) and defer copying it to Rackspace as time and connectivity permitted.
That was simple enough.
Once the first, full backup was made, I wanted incremental backups to go directly to Cloud Files, so I needed to get Deja-Dup to realise that there was already a backup on there.
This was the trickier bit.
When you ask Duplicity to interact with a particular backup location, it calculates a hash of the URI of it and looks that up in its cache to see if it knows about it already. If you’ve made a backup with deja-dup, you can go and look in $HOME/.cache/deja-dup. This is what I had:
soren@lenny:~$ ls -l $HOME/.cache/deja-dup/
drwxr-xr-x 2 soren soren 4096 2011-01-14 18:09 4e33cf513fa4772471272dbd07fca5be
You see a directory named after the hash of the uri of the backup location I used, namely “file:///media/backup” (the MD5 sum of which is 4e33cf513fa4772471272dbd07fca5be).
Inside this directory, we find:
soren@lenny:~$ ls -l /home/soren/.cache/deja-dup/4e33cf513fa4772471272dbd07fca5be/
-rw------- 1 soren soren 750938885 Jan 14 15:47 duplicity-full-signatures.20110113T170937Z.sigtar.gz
-rw------- 1 soren soren 653487 Jan 14 15:47 duplicity-full.20110113T170937Z.manifest
It contains a manifest and a signature file. These files in there have no record of the backup location. That information exists only in the name of the directory. Essentially, all I needed to do was to rename the directory to match the Cloud Files location. Being a bit cautious, I decided to copy it instead. The URI for a container on Cloud Files looks like “cf+http://containername”. Knowing this, it was as simple as:
soren@lenny:~$ echo -n 'cf+http://lenny' | md5sum
soren@lenny:~$ cd $HOME/.cache/deja-dup
soren@lenny:~/.cache/deja-dup$ cp -r 4e33cf513fa4772471272dbd07fca5be 2f66137249874ed1fdc952e9349912d4
The -n option to echo is essential. Without it, I’d have been calculating the MD5 sum of the URI with a trailing newline.
Before I ran deja-dup again, I made sure the two files above were copied to Cloud Files. If I hadn’t, the first time duplicity would talk to Cloud Files, it would realise that these files don’t exist on the expected backup location, hence the local cache of them must be invalid, so it would delete them. This happened to me the first time, so making a copy rather than just renaming the directory turned out to be a good idea.
All that was left to do now was to change my backup location in Deja-Dup. This should be simple enough, so I won’t go into detail about that.
The best part about this, I think, is that wasn’t until 5-6 days later, that my upload of the initial full backup finished. However, in the mean time, I was able to do incremental backups just fine, because all it needs to do that is the signature files from the previous runs.
Oh, and to actually upload the files, I used the “st” tool from Swift. Something like this:
soren@lenny:~$ cd /media/backup
soren@lenny:/media/backup$ st -A https://auth.api.rackspacecloud.com/v1.0 -U soren -K 6e6f742061206368616e636521212121 upload lenny *