howto: rdiff-backup with launchd

I started writing this article a few months ago during a lull. It was nearly complete then but needed a few finishing touches. So, without further ado, I present an rdiff-backup with launchd solution for you to enjoy/endure. If you’re not a Mac user, the same approach should work on Linux with cron in place of launchd, and on Windows with cron (from Cygwin) or the Windows “at” command. It’s a pretty slick and entirely free solution. Naturally, you will want to test this thoroughly with your setup, as backups are something you shouldn’t mess around with. It works for me as described, but it might not work for you. Consider yourself warned.

Lastly, this is a bit long. If rampant nerdery offends you, please turn back now!

HOWTO: rdiff-backup with launchd and lingon

I saw a great article a while back on Daniel Jalkut’s Red Sweater Blog titled “Taming Launchd”. It got me thinking about my first tentative explorations of Tiger and the loss of my then just-discovered backup software, rdiff-backup.

Flash-forward a year and a few months. I’m relatively familiar with launchd and use the oddly-named Lingon to add jobs to it on a regular basis. A word of warning: Lingon’s interface is a bit… unusual, much like the berry for which it is named.

It is worrisome that my backups have been managed with Apple’s very own Backup utility, provided as an add-on to .Mac. I have mixed feelings about .Mac and Backup. I like the syncing feature, which, I think, is what most hold-outs claim is the real reason they continue to pay for the service. But Backup is something that should be provided with the operating system. That it checks for a valid .Mac license at startup is insulting. Does it mean that if I let my subscription lapse and I have a disk failure, my backups are gone? (I think not, actually, as the files are stored in packages containing “sparse images”, though piecing them together without the software would no doubt be somewhat difficult.)

So, as a public service, I’m going to provide a walkthrough of setting up rdiff-backup with launchd as the scheduler. Note that this will likely be obsolete by the time Leopard is released. Nevertheless, I feel this is the time.

Installation

First, you’ll need DarwinPorts or Fink. My personal preference in this instance is DarwinPorts. The Fink Project offers essentially the same service, but I switched from Fink early on in Tiger’s life and haven’t looked back.

Next, grab the current stable release of rdiff-backup. Using DarwinPorts, the command would be:

$ sudo port install rdiff-backup

(This assumes you have sudo enabled on your system; if not, you’ll have to do it as root.)

Your package manager will conveniently install librsync and other prerequisites for you. If you’re on a base system, you’ll likely get a copy of Python 2.4 along with it. Don’t be alarmed, this is Good… unless you’re an early adopter and have already moved to 2.5, then it’s slightly annoying.

Once that’s done, you should be informed that rdiff-backup has been installed and activated.

Backing stuff up

Let’s try it out on a small directory.

e.g.,

$ rdiff-backup --override-chars-to-quote '' /Users/boolean/Temps /Volumes/TRex/boolean/Temps

The --override-chars-to-quote option is needed to skip escaping upper-case characters on OS X; see the rdiff-backup wiki for details.

When that’s done, you should see a copy of the directory in the specified location with a subfolder called “rdiff-backup-data”. This is where rdiff-backup will store all the diffs for future incremental backups. Make some changes to the directory and run the above command again. Look in the rdiff-backup-data directory for a file called session_statistics… with the timestamp for the last run command. Looking at its contents, it gives a breakdown of time to run and number of affected files and sizes. Once you’re satisfied with the results, take a peek in the “increments” subdirectory. You’ll see a list of files corresponding to the changes you made to the directory.
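
If you’d rather poke at the results from the terminal than from the Finder, something like the following sketch should do it. It assumes the example destination used above, and latest_stats_file is a helper name I made up for illustration. The --list-increments option is rdiff-backup’s own way of listing the increments held in a destination.

```shell
# Destination from the example above; change it to match your setup.
BACKUP_DIR="/Volumes/TRex/boolean/Temps"

# Hypothetical helper: print the newest session_statistics file in a destination.
latest_stats_file() {
    ls -t "$1"/rdiff-backup-data/session_statistics.* 2>/dev/null | head -n 1
}

if command -v rdiff-backup >/dev/null 2>&1 && [ -d "$BACKUP_DIR/rdiff-backup-data" ]; then
    # Ask rdiff-backup which increments it holds for this destination.
    rdiff-backup --list-increments "$BACKUP_DIR"
    # Show the statistics from the most recent session, if there is one.
    stats=$(latest_stats_file "$BACKUP_DIR")
    if [ -n "$stats" ]; then
        cat "$stats"
    fi
else
    echo "No rdiff-backup destination found at $BACKUP_DIR"
fi
```

The guard at the top is only there so the snippet fails politely when the volume isn’t mounted or rdiff-backup isn’t on your PATH.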

And that’s pretty much what rdiff-backup does! It keeps a top-level snapshot as a backup of the target directory, with a list of incremental changes in the data subdirectory. Restoring is similarly easy:

$ rdiff-backup --restore-as-of 3W4D10h /Volumes/TRex/boolean/Temps /Users/boolean/Temps

This will restore a backup from 3 weeks, 4 days and 10 hours ago. You can also specify the time as a full datetime or as YYYY-MM-DD. Check out the Time Formats section of the man page for a better breakdown.
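
Here is a small sketch of the other time formats in use. The restore_as_of wrapper, the scratch directory and the calendar date are all made up for illustration; the time strings themselves (now, a YYYY-MM-DD date, an interval) are the ones described in the man page. I restore into a scratch directory here because, as far as I know, rdiff-backup refuses to overwrite an existing target unless you pass --force.

```shell
# Backup destination from the example above, and a hypothetical scratch target.
SRC_BACKUP="/Volumes/TRex/boolean/Temps"
RESTORE_TO="/tmp/Temps-restored"

# Hypothetical wrapper: restore the backup as it was at the given time.
restore_as_of() {  # usage: restore_as_of TIME
    rdiff-backup --restore-as-of "$1" "$SRC_BACKUP" "$RESTORE_TO"
}

# Pick one; RESTORE_TO should not exist yet.
# restore_as_of now          # the most recent backup
# restore_as_of 2006-11-20   # the state as of a calendar date
# restore_as_of 3W4D10h      # an interval, as in the command above
```

Once you’re happy with what landed in the scratch directory, you can copy files back into place by hand.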

Now for the scheduling bit. Fire up Lingon (if it’s installed and you’re using Quicksilver, this’ll be easy, otherwise hunt it down in your dock or your Finder). Click on the User Agents “tab” and authenticate with your password. This is required.

Next, click the “New” button in the toolbar. I know it looks like it’s greyed-out and inactive, but this is the way it’s supposed to look. Label it something suitable, such as com.n3wb.homedirbackup (use your own domain if you have one; if not, don’t fret about it too much and feel free to use n3wb). If you’d prefer to start with a smaller directory for test purposes, go that way and name it accordingly.

Next, enter the rdiff-backup line to perform the backup, separating out each argument to the program on a different line (execlp calling convention?). In my case, and likely yours if you installed with DarwinPorts, you’ll want to put /opt/local/bin/rdiff-backup as your first line, --override-chars-to-quote as the second, '' (an empty, single-quoted string) as the third, and the directory to back up and the destination as the fourth and fifth lines, respectively. When that’s all done, give it a descriptive name in “Service Description”.

A note about Lingon’s interface: adding lines to this section of the dialog is slightly confusing, because you have to keep the cursor active in the field when hitting the ‘+’ button to add another line.

Make sure the destination directory you would like to use already exists. For example, if you’re backing up ~/Library to /Volumes/Data/backups, make sure you have an accessible directory called “backups” on your Data volume. rdiff-backup will create the directory you’re backing up as long as the destination exists. In this case, Library will get created in /Volumes/Data/backups.
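
A quick way to check the destination before trusting a scheduled job to it is sketched below. The function name check_backup_parent is my own invention, and the path is just the example from the paragraph above.

```shell
# Hypothetical helper: confirm the backup destination exists and is writable.
check_backup_parent() {  # usage: check_backup_parent /Volumes/Data/backups
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "ok: $1 exists and is writable"
    else
        echo "missing or not writable: $1" >&2
        return 1
    fi
}

# Example call, using the path from the text:
# check_backup_parent /Volumes/Data/backups
```

This catches the common case of an external volume that isn’t mounted, which would otherwise only show up as a backup that silently never ran.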

If you want to try it out right away, and I know you do, click the OnDemand checkbox and then hit Save & Reload. You’ll be prompted for your password again, which you can enter and after a moment, you’ll be returned to the main screen. Highlight the service you just created in User Programs and in the Action menu, select Start. If all goes well, you’ll hear your disks spinning and your destination directory should include an updated rdiff-backuped directory.
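
For the curious, here is roughly what the job looks like without Lingon in the picture. This is a sketch under assumptions: I haven’t checked it against the exact file Lingon writes, write_backup_plist is a made-up helper, and the label and paths are the examples from this article. Each ProgramArguments entry is one line from the dialog, and since launchd doesn’t go through a shell, the empty argument is written as an empty string rather than as quotes.

```shell
# Hypothetical label; per-user launchd jobs live in ~/Library/LaunchAgents.
LABEL="com.n3wb.homedirbackup"
PLIST_DIR="${PLIST_DIR:-$HOME/Library/LaunchAgents}"

# Hypothetical helper: write a minimal launchd job for one backup.
# Paths containing XML special characters (&, <, >) would need escaping.
write_backup_plist() {  # usage: write_backup_plist SOURCE DEST
    mkdir -p "$PLIST_DIR"
    cat > "$PLIST_DIR/$LABEL.plist" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>$LABEL</string>
    <key>ProgramArguments</key>
    <array>
        <string>/opt/local/bin/rdiff-backup</string>
        <string>--override-chars-to-quote</string>
        <string></string>
        <string>$1</string>
        <string>$2</string>
    </array>
</dict>
</plist>
EOF
}

# write_backup_plist /Users/boolean/Temps /Volumes/TRex/boolean/Temps
# launchctl load "$PLIST_DIR/$LABEL.plist"   # register the job with launchd
# launchctl start "$LABEL"                   # run it once, on demand
```

The last two commented lines are the command-line counterparts of Save & Reload and Start.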

When you’re ready to set a schedule for this, double-click on the service entry and click the Miscellaneous tab. Here you can enter the time to run the script. Depending on the seriousness of the data you’re backing up, you’ll want to do it daily, weekly or monthly. Enter some values and hit Save & Reload. You should be all set.
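
Under the hood, a launchd schedule is expressed with the StartCalendarInterval key. The sketch below just prints the fragment that would sit inside the job’s top-level dict; calendar_interval is a made-up helper, and I’m assuming rather than asserting that this is the key Lingon writes for you.

```shell
# Hypothetical helper: print a StartCalendarInterval fragment for a daily run.
calendar_interval() {  # usage: calendar_interval HOUR MINUTE
    cat <<EOF
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>$1</integer>
        <key>Minute</key>
        <integer>$2</integer>
    </dict>
EOF
}

calendar_interval 2 30   # every day at 02:30
```

Adding a Weekday key (0 is Sunday) narrows this to a weekly run, and a Day key (day of the month) makes it monthly.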

Now you have a template for creating other entries in Lingon. Feel free to set up multiple directories to be backed up, or create one master backup entry for your home directory.

I hope this is useful and saves someone from spending $100 for a backup solution that really should have been free.


PS, I hate this style-sheet