Easy backups with Obnam

Haloocyn, freeimages.com

Lucky Strike

The Obnam command-line tool allows backups and restores, even when the X server is on strike. Its many options will easily meet the needs of a SOHO environment.

Hard drives know only three basic states: empty, full, and broken. In the first two states, the devices store data. However, hardware can change to the third state more often than you might like. So, unless you're following the Torvalds principle of "real men don't do backups," you will sooner or later need a good backup strategy and the right software to go with it.

The number of programs available for backing up data is astounding [1]. The Obnam tool, however, stands out from the crowd thanks to its many options that, in most cases, help ensure that your data is securely backed up. The excellent online tutorial [2] assists in special cases.

Obnam was written by the Finnish Debian developer and Linux old-timer Lars Wirzenius. He has been working since 2006 on this Python software, which stores backups on local disks, on NFS or SMB shares, or on remote servers over the SFTP protocol.

Playing It Safe

Most backup tools are based on the rsync [4] algorithms, which provide the most important feature in backup software: the incremental backup. This approach involves backing up just the deltas since the last full or incremental backup.

This method does, however, require a complete backup from time to time, which can consume both time and bandwidth if transferred over the Internet. Restoring is also labor-intensive because the data usually has to be reassembled from multiple incremental backups. The alternative of a differential backup, which always saves the changes since the last complete backup, requires more storage.

When Lars Wirzenius created an online backup service in 2006, these approaches didn't appeal to him, as he announced in the release notes for version 1.0 of Obnam in 2012 [5]. Instead, Wirzenius implemented the copy-on-write function, COW [6] for short, which is also used in Btrfs for creating snapshots.

The block-based approach, which allows the snapshot function in Obnam, is distinctly different from many other backup systems and is much closer to professional high-end products with "near-continuous data protection" (Near CDP) [7].

The software reuses already existing identical blocks, even when they're in different files or an older backup. The process is called deduplication [8]. Thus, each Obnam backup is a full backup, even when in theory it's an incremental backup.
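The idea behind deduplication can be sketched in a few lines of shell. The example below is illustrative only and is not Obnam's actual storage format: it assumes fixed 4KB chunks, uses the SHA-256 hash of each chunk as its file name in a hypothetical store directory, and relies on GNU coreutils (split, sha256sum, head -c). The function names store_file and chunk are made up for this sketch.

```shell
# Illustrative only: fixed-size chunks stored under their SHA-256 hash.
# This is not Obnam's on-disk format.
work=$(mktemp -d)
store="$work/store"
mkdir -p "$store"

# store_file FILE: split FILE into 4 KiB chunks and keep each distinct chunk once
store_file() {
    chunks=$(mktemp -d)
    split -b 4096 "$1" "$chunks/chunk."
    for c in "$chunks"/chunk.*; do
        sum=$(sha256sum "$c" | cut -d' ' -f1)
        # An identical chunk is already in the store: nothing new to save
        [ -e "$store/$sum" ] || cp "$c" "$store/$sum"
    done
    rm -r "$chunks"
}

# chunk LETTER: emit exactly 4096 bytes of recognizable filler
chunk() { yes "$1" | head -c 4096; }

# Two files that share their first two chunks and differ in the third
{ chunk a; chunk b; chunk c; } > "$work/first"
{ chunk a; chunk b; chunk d; } > "$work/second"

store_file "$work/first"
store_file "$work/second"

unique=$(ls "$store" | wc -l | tr -d ' ')
echo "6 chunks referenced, $unique chunks stored"
```

Both files can still be reassembled completely from the store, although the two shared chunks exist only once. That is the sense in which every backup is "full" while costing only as much space as an incremental one.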

Got Your Back

What sounds like a complicated process is quite simple in practice with Obnam – so much so that the tool doesn't even need a graphical interface. If it did, the many options would either overburden it or render it inadequate. (See the "Lucky Backup" box for an alternative tool.)

Lucky Backup

An alternative to Obnam with a graphical interface for home use is the Qt framework-based Lucky Backup (Figure 1) [14]. This application does its job in the background with rsync, so the backups fail somewhat more frequently than they do with Obnam. Beginners might find it a bit easier to use, however.

Figure 1: Lucky Backup is an alternative for users who don't feel quite comfortable with command lines.

Thanks to the comprehensive documentation, the program can serve your everyday needs. You can generally configure the software in less than half an hour. After that, it is set up to do its work on demand completely automatically. Additionally, Obnam gives you the option of initiating backup and restore operations from the command line.

The installation on Ubuntu can be carried out with the following simple command:

$ sudo apt-get install obnam

The installation copies less than 5MB of data onto the hard drive. For Debian, Gentoo, and openSUSE, packages are ready for download along with the source code [9]. For most scenarios, a configuration file makes the most sense. You create an empty text file for it as a regular user with the command touch ~/.obnam.conf and then fill it in with an editor. Listing 1 shows sample content.

Listing 1

Configuration File

[config]
# Backup storage
repository=/media/Backup
# Log file storage
log = /home/Username/obnam.log
# Log level
log-level = info
# Maximum log file size
log-max = 100 mb
# Backup exclusions as regular expressions (file types, downloads folder)
exclude = \.mp3$, \.mp4$, \.part$, \.rar$, \.nfo$, /Downloads$
# Excluding caches
exclude-caches = yes
# Excluding external filesystems (/proc, NFS, etc.)
one-file-system = yes
# Keeps a daily backup over the last 14 days, etc.
keep = 14d,10w,12m
# Encrypts using GnuPG
encrypt-with = "My ID"

To back up the root filesystem or parts of it, I recommend keeping a separate configuration file that includes a line such as root = /etc, /var, which names the directories to be backed up.
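A separate configuration for system directories might look like the following sketch. The file name ~/.obnam-system.conf is a hypothetical choice, and the repository path is simply reused from Listing 1. The sketch assumes that Obnam accepts an alternative configuration file through its --config option and that the backup runs with root privileges, because /etc and /var contain files a normal user cannot read.

```shell
# Hypothetical file name; pick any location you like
conf="$HOME/.obnam-system.conf"

# Write a minimal configuration that names the system directories to back up
cat > "$conf" <<'EOF'
[config]
repository = /media/Backup
root = /etc, /var
one-file-system = yes
EOF

# Show the command to run; it is only printed here, not executed
echo "sudo obnam --config=$conf backup"
```

Because the directories are listed under root in the file, the backup command itself needs no path arguments.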

First Backup

You can start the first manual backup of the entire home directory with the following command:

$ obnam backup $HOME

This step assumes that you've already mounted the backup target and that you're running the console from that directory. Alternatively, specify the target of the backup with the -r option or set it up in the configuration.

You trigger the next backup in the same way. This step backs up all new and modified files. With larger backups, such as the initial one, you risk losing the connection – especially if you're going over a wireless LAN. To address this, the program adds a marker every 100MB and resumes its work when the connection comes back up.

Obnam is quite fast at doing its work. In my test, the 61GB home directory backup took 37 minutes with a USB3 connection. After completion, the software reported the message shown in the first line of Listing 2. A second run a few days later returned the message in the second line; the program was a good half hour faster the second time around.

Listing 2

Backup Report

Backed up 98627 files (of 98628 found), uploaded 61.0 GiB in 37m18s at 28.1 MiB/s average speed
Backed up 4633 files (of 101010 found), uploaded 3.0 GiB in 3m24s at 15.1 MiB/s average speed

To check how many backup generations you've created or retained, use the obnam generations command. Listing 3 shows the results for the backups in Listing 2.

Listing 3

Backup Results

5543    2014-04-27 19:52:12 .. 2014-04-27 19:59:35 (98628 files, 69491282768 bytes)
6751    2014-05-01 09:42:39 .. 2014-05-01 09:43:20 (101010 files, 71257775259 bytes)

Divide and Conquer

Only rarely will you want to back up the entire home directory. Whether you select the files to include or the files to exclude depends on which is faster to define. If you want to back up only five directories, it makes sense to name them in the backup command or in the configuration.

It's more likely that you'll want to exclude individual directories or files from the backup. You can do this with the --exclude parameter, which you can use on the command line or, better yet, write into the configuration. It's also possible to exclude cache directories with exclude-caches = yes, or externally attached filesystems and the virtual proc directory with one-file-system = yes.
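Exclude patterns are regular expressions matched against path names, so small details matter. The following sketch uses grep -E as a stand-in to show the difference between an unescaped and an escaped dot; Obnam itself evaluates the patterns with Python's regular expression engine, so treat this as an approximation. The sample paths are invented for the illustration.

```shell
# grep -E stands in for Obnam's matcher; the paths are made-up examples
paths='/home/user/Music/song.mp3
/home/user/notes/list-of-mp3
/home/user/Downloads
/home/user/Downloads-archive/file.txt'

# Unescaped dot: "." matches any character, so "list-of-mp3" is caught too
loose=$(printf '%s\n' "$paths" | grep -E -c '.mp3$')

# Escaped dot: only names that really end in ".mp3" match
strict=$(printf '%s\n' "$paths" | grep -E -c '\.mp3$')

# Anchoring with "$" keeps /Downloads-archive out of the match
downloads=$(printf '%s\n' "$paths" | grep -E -c '/Downloads$')

echo "loose=$loose strict=$strict downloads=$downloads"
```

Testing a pattern this way against a list of real path names, for example the output of find, is a cheap check before a pattern goes into the configuration.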

When backing up to the Internet, you can use the option to encrypt with GnuPG, which assumes that you've installed the GnuPG agent and generated a key pair. You can generate one as a regular user with the gpg --gen-key command. The GnuPG home page provides instructions on how to set up your keys [10].

Each key has an identifier, which the software assigns when creating the key pair. The identifier appears in the list when you call up the information on the stored keys with gpg --list-keys. You enter the identifier in the configuration with the following line:

encrypt-with = "<identifier>"

That's all it takes to set up encryption.

Emergency Exit

Obnam provides two ways to restore lost data. The first takes advantage of the FUSE filesystem [11], which is standard on most current Linux systems. The second works without FUSE but is less convenient.

In the mounted backup, each subdirectory represents one backup generation, clearly identified by its number. You can thus look at the backup that you want to recover fully or partially. Change to the appropriate directory and check its contents. To restore the latest backup or parts of it, use the latest entry instead of a number.

To restore a single file, first mount the backups at any directory, then use the cp command to copy the file out of it. You can then use the diff tool [12] to compare the file content. After a successful restore, simply unmount the directory again (Listing 4, line 7).

Listing 4

Restoring Data with FUSE

01 $ mkdir ~/backups
02 $ obnam mount --to ~/backups
03 $ ls -l ~/backups
04 drwxr-xr-x 25 root root 4096 Apr 27 19:59 5543
05 drwxr-xr-x 25 root root 4096 May  1 09:43 6751
06 lrwxr-xr-x 25 root root 4096 May  1 09:43 latest -> 6751
07 $ fusermount -u ~/backups

If FUSE or the obnam mount command is not at hand, search for what you want to restore with the obnam generations and obnam ls commands. The command on the first line of Listing 5 restores from a repository that you name explicitly. You can restore the complete last generation with the command on line 2, and an older backup by specifying its number (line 3).

Listing 5

Obnam Restore

01 $ obnam restore --repository repository/path --to path
02 $ obnam restore --to path
03 $ obnam restore --to path --generation=5543

FUSE is a userspace filesystem. With it, Obnam presents the backups as normal directories. The first line in Listing 4 creates a new subdirectory in your home directory, line 2 mounts the backups there, and line 3 lists the contents.

You can automate backups with a cron job that runs at specified intervals. For a graphical interface on Gnome, install the gnome-schedule package. KDE has a similar tool under System settings | Task scheduler. On the console, you can edit the schedule with crontab -e.
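A crontab entry for a nightly backup could look like the sketch below. The time of 02:30 and the log file name obnam-cron.log are arbitrary assumptions, and the entry presumes that the repository is set in ~/.obnam.conf and that the backup medium is reachable at that hour. The snippet only prints the line so that you can review it before pasting it into the editor opened by crontab -e.

```shell
# Hypothetical schedule: every night at 02:30, appending all output to a log file
# Fields: minute hour day-of-month month day-of-week command
cron_line='30 2 * * * obnam backup "$HOME" >> "$HOME/obnam-cron.log" 2>&1'

# Print the entry for review; nothing is installed by this snippet
printf '%s\n' "$cron_line"
```

Redirecting the output to a file means a failed run leaves a trace that you can read the next morning.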

It's advisable to start off with a small backup and test a complete or partial restore. This practice run should give you some confidence when it comes to restoring data that you really have lost, whether because of a bad hard drive or an accidentally deleted directory. You can count on one or the other occurring at some point.
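A trial restore is only meaningful if you compare the result with the original. The following sketch wraps diff -r in a small helper for that purpose; the function name trees_match and all paths are hypothetical, and the Obnam commands appear only as comments because the exact location of the restored files below the target directory is an assumption you should check on your own system.

```shell
# trees_match DIR1 DIR2: succeed only if both trees have identical content
trees_match() {
    diff -r "$1" "$2" > /dev/null
}

# Hypothetical trial run (placeholder paths, adjust to your setup):
#   obnam backup ~/testdata
#   obnam restore --to /tmp/restore-test ~/testdata
#   trees_match ~/testdata /tmp/restore-test/home/username/testdata

# Self-contained demonstration with two throwaway directories
demo=$(mktemp -d)
mkdir "$demo/original" "$demo/restored"
echo "important" > "$demo/original/file.txt"
cp "$demo/original/file.txt" "$demo/restored/file.txt"

if trees_match "$demo/original" "$demo/restored"; then
    result=identical
else
    result=different
fi
echo "$result"
```

If the helper reports a difference, running diff -r without the redirection shows which files deviate.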

Conclusion

Obnam lets you easily create space-saving backups locally or remotely and restores reliably in an emergency. The program doesn't provide a graphical interface but is still simple to operate. Besides, what's the use of the nicest graphical interface if you can't get to it after a system crash?

Obnam is client-based, which allows for flexibility that will meet the requirements of small and even medium-sized companies. How elaborate these backups need to be depends on each case. Obnam doesn't impose any limits and can safeguard the data of several customers in a repository while taking deduplication into account.