rsnapshot-based previous versions

The other day, a friend of mine who runs a small business asked me how to avoid the risk of losing files when hardware dies. The obvious choice is storing important data on a little NAS system that tolerates disk failure, but on top of that, I wanted to add snapshot-based backups for a little extra safety, integrated with Samba's shadow_copy2 VFS module so that the "Restore Previous Versions" feature found in recent Windows versions works. But naturally, there are unexpected caveats.

shadow_copy2

First of all, let's take a quick look at the way shadow_copy2 works. Basically, it's a little Samba VFS module that looks for snapshots of given files or directories. Clients running Windows 7 onward display a context menu entry named "Restore Previous Versions" that offers to open, copy or restore a previous version of the selected file.

In order to enable this module, you need to add a few options to smb.conf:

[global]
vfs objects = shadow_copy2

[share]
path           = /media/share
shadow:snapdir = .snapshots
shadow:format  = %Y-%m-%d-%H-%M-%S

This configuration will share the volume mounted at /media/share, and export all the snapshots under /media/share/.snapshots as previous versions. The "date modified" column in the GUI is populated by parsing the snapshot directory names, so snapshots need to be named accordingly, for example "2015-05-30-21-45-35".
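To check ahead of time whether a directory name fits this format, a rough sketch like the following can help. The function name is_snapshot_name is made up for this example, and it assumes a date command that understands -d (GNU date does). It only mirrors the %Y-%m-%d-%H-%M-%S format from above and is not how Samba itself does the parsing.

```shell
#!/bin/bash
# Hypothetical helper (not part of Samba): succeeds if the given name
# looks like %Y-%m-%d-%H-%M-%S and describes a real date and time.
is_snapshot_name() {
    # First check the overall shape: digits separated by dashes.
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) return 1 ;;
    esac
    # Rewrite "YYYY-mm-dd-HH-MM-SS" as "YYYY-mm-dd HH:MM:SS" ...
    stamp=$(printf '%s\n' "$1" | sed 's/^\(.\{10\}\)-\(..\)-\(..\)-\(..\)$/\1 \2:\3:\4/')
    # ... and let date reject impossible values such as month 13.
    date -u -d "$stamp" >/dev/null 2>&1
}

is_snapshot_name "2015-05-30-21-45-35" && echo "2015-05-30-21-45-35: usable"
is_snapshot_name "hourly.0" || echo "hourly.0: not usable"
```

Running something like this over the snapshot directory before pointing Samba at it shows quickly which entries have no chance of appearing as previous versions.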

Which snapshot technology to use?

Btrfs and LVM come to mind. I'd really love to use Btrfs, mainly because LVM is not aware of what's happening in the file system, so I'd have to preallocate a fixed size for each snapshot volume. Problem is, I have only a vague idea how much data we're talking about, and nobody knows how much of it will change over what period of time, so preallocating sounds wasteful and error-prone, and hence, LVM sucks.

But the plain truth is, I don't trust Btrfs enough just yet. The whole point of this setup is data safety, and I won't accept any risks just for my personal amusement. So what else can we do?

rsnapshot

Good old rsnapshot sounds like the perfect solution: Neither does it require preallocation, nor does it waste too much space because it uses hardlinks in order to avoid storing duplicates. And it is easily configured, too: Just apt-get install rsnapshot, add a backup line to /etc/rsnapshot.conf, and you're done — rsnapshot will now create snapshots at /var/cache/rsnapshot.
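The hardlink trick is easy to see in isolation. The following is a small sketch that uses a throwaway directory from mktemp and made-up directory names (snap.0, snap.1), and assumes GNU stat: two directory entries that point to the same inode store their data only once.

```shell
#!/bin/bash
set -eu
# Throwaway playground; the names snap.0 and snap.1 are just for illustration.
demo=$(mktemp -d)
mkdir "$demo/snap.0" "$demo/snap.1"
echo "127.0.0.1 localhost" > "$demo/snap.0/hosts"
# A hardlink is a second name for the same inode; no data is copied.
ln "$demo/snap.0/hosts" "$demo/snap.1/hosts"
inode_a=$(stat -c %i "$demo/snap.0/hosts")
inode_b=$(stat -c %i "$demo/snap.1/hosts")
links=$(stat -c %h "$demo/snap.0/hosts")
echo "inodes: $inode_a and $inode_b, link count: $links"
rm -rf "$demo"
```

Both inode numbers come out identical and the link count is 2, which is why an unchanged file costs next to nothing per additional snapshot.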

Unfortunately, those snapshots are named "hourly.0", "hourly.1", "daily.0", "daily.1", "weekly.0" etc — a naming convention that does not work for shadow_copy2. At all.

So, rsnapshot doesn't work either. But well, rsnapshot just uses rsync, so how hard can it be to build something that also uses hardlinks and names its directories in a way that works for shadow_copy2?

bash to the rescue

Luckily, someone posted an answer to a completely different problem that can be abused to do what we need. By adapting Benjamin's script just a tiny little bit, we should be able to get our snapshots. To try it out, we simply snapshot and share /etc. So, let's put this modified version into /usr/local/bin/snapshot:

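#!/bin/bash
# Abort on errors and on unset variables; print each command as it runs.
set -e
set -u
set -x
BACKUPDIR="/var/spool/snapshots"
# UTC timestamp, used as the name of the snapshot directory.
today=$(date --utc "+%Y-%m-%d-%H-%M-%S")
# Files unchanged since "latest" become hardlinks instead of new copies.
rsync -a --link-dest="$BACKUPDIR/latest/" /etc/ "$BACKUPDIR/${today}"
cd "$BACKUPDIR"
# Point "latest" at the snapshot we just made.
rm -f latest
ln -s "$today" latest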

Now, let's create the backup directory and pre-populate it with an empty "latest" backup, so rsync has something to diff against:

mkdir -p /var/spool/snapshots
cd /var/spool/snapshots
mkdir first
ln -s first latest

Create the first snapshot:

root@damien:~$ snapshot
+ BACKUPDIR=/var/spool/snapshots
++ date +%Y-%m-%d-%H-%M-%S
+ today=2015-05-30-21-45-11
+ rsync -a --link-dest=/var/spool/snapshots/latest/ /etc/ /var/spool/snapshots/2015-05-30-21-45-11
+ cd /var/spool/snapshots
+ rm -f latest
+ ln -s 2015-05-30-21-45-11 latest

Now the directory should look something like this:

root@damien:~$ la /var/spool/snapshots/
total 80
drwxr-xr-x   8 root root  4096 May 30 21:55 .
drwxr-xr-x  10 root root  4096 May 30 21:40 ..
drwxr-xr-x   2 root root  4096 May 30 21:40 first
drwxr-xr-x 188 root root 12288 May 30 21:41 2015-05-30-21-45-11
lrwxrwxrwx   1 root root    19 May 30 21:55 latest -> 2015-05-30-21-45-11

Nice, we've made our first snapshot, and it even has a name that shadow_copy2 can parse the date from!

Configuring the Samba Share

So now all that's left is configuring Samba to correctly export the snapshots as "previous versions". First of all, we configure a share for /etc:

[etc]
path       = /etc
read only  = yes
writeable  = no
guest ok   = no
force user = root

After verifying this works, we add the options needed for shadow_copy2:

shadow:snapdir = /var/spool/snapshots
shadow:format  = %Y-%m-%d-%H-%M-%S

Now we right-click a file in the share, pick the "previous versions" entry from the context menu, and will most likely find the list to be empty.

First things first: we haven't changed anything since we made our snapshot, so the file has no reason to show up in "previous versions". To get the file to show up, we add a couple of commented lines to /etc/hosts and create new snapshots in between:

root@damien:~$ la /var/spool/snapshots/
total 84
drwxr-xr-x   9 root root  4096 May 30 22:53 .
drwxr-xr-x  10 root root  4096 May 30 21:41 ..
drwxr-xr-x 188 root root 12288 May 30 21:41 2015-05-30-21-45-11
drwxr-xr-x 188 root root 12288 May 30 21:41 2015-05-30-21-45-26
drwxr-xr-x 188 root root 12288 May 30 21:41 2015-05-30-21-45-35
drwxr-xr-x 188 root root 12288 May 30 21:54 2015-05-30-21-54-56
drwxr-xr-x 188 root root 12288 May 30 21:55 2015-05-30-21-55-02
drwxr-xr-x 188 root root 12288 May 30 21:55 2015-05-30-21-55-08
lrwxrwxrwx   1 root root    19 May 30 21:55 latest -> 2015-05-30-21-55-08

However, the "previous versions" tab is still empty. Thing is, we put our snapshots directory outside the share path, so we need to put in a little extra effort: To access files outside the share path, we need to enable wide links and disable unix extensions; and because there's no etc subdirectory in the snapshot directories, we also need to specify the shadow:basedir option. So, our final configuration looks like this:

[global]
workgroup = LOCALLAN
netbios name = DAMIEN
vfs objects = shadow_copy2 acl_xattr
unix extensions = no
# "allow insecure wide links" is a global parameter, so it belongs here
# and not in the share section
allow insecure wide links = yes

[etc]
path       = /etc
read only  = yes
writeable  = no
guest ok   = no
force user = root

shadow:snapdir = /var/spool/snapshots
shadow:format  = %Y-%m-%d-%H-%M-%S
shadow:basedir = /etc

follow symlinks = yes
wide links      = yes

To be honest, I'm not quite sure what those "wide links" options are doing, and on the production system I'm definitely going to just put the snapshots directory inside the share path to avoid all that stuff. (I'm also not going to share /etc.) But, lo and behold, the previous versions of our /etc/hosts file now look like this!

[Screenshot: previous versions of /etc/hosts, /files/prev_versions.png]

So now, all that's left is building a little cleanup script that removes old snapshots after a certain time period, and we'll have a pretty neato snapshot system.
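Such a cleanup script could look roughly like the sketch below. It is only an illustration: the function name prune_snapshots and the 30-day default are made up here, it assumes a date command that understands -d (GNU date does), and it assumes the snapshot names are UTC timestamps in the format used above. Anything that doesn't match that format, such as first or latest, is left alone, and so is the snapshot that latest points to, since rsync still needs it for --link-dest.

```shell
#!/bin/bash
# Hypothetical cleanup helper: prune_snapshots DIR [MAX_AGE_DAYS]
prune_snapshots() {
    dir=$1
    max_age_days=${2:-30}
    # Everything taken before this epoch timestamp is considered too old.
    cutoff=$(( $(date -u +%s) - max_age_days * 86400 ))
    for path in "$dir"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        [ -d "$path" ] || continue      # also covers a glob that matched nothing
        [ -L "$path" ] && continue      # never touch symlinks
        name=${path##*/}
        # Turn "YYYY-mm-dd-HH-MM-SS" into "YYYY-mm-dd HH:MM:SS" for date -d.
        stamp=$(printf '%s\n' "$name" | sed 's/^\(.\{10\}\)-\(..\)-\(..\)-\(..\)$/\1 \2:\3:\4/')
        taken=$(date -u -d "$stamp" +%s 2>/dev/null) || continue
        [ "$taken" -lt "$cutoff" ] || continue
        # Keep whatever "latest" points to, however old it is.
        [ "$(readlink "$dir/latest" 2>/dev/null)" = "$name" ] && continue
        rm -rf -- "$path"
    done
}
```

Called as prune_snapshots /var/spool/snapshots 30 right after each snapshot run, this would keep roughly a month of history. Because of the hardlinks, deleting an old snapshot only frees the data that no newer snapshot still references.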

That is, we would, if just using a QNAP didn't make far more sense in almost every aspect...

Update: A friend of mine stumbled across dirvish today, which seems to do pretty much exactly what I'd like to build.

Update: Turns out timestamps need to be in UTC, because that is what shadow_copy2 expects unless shadow:localtime = yes is set. Using local time has the funny effect that you can restore a previous version dated two hours in the future, if you look at the list right after a snapshot has been created and you're living in GMT+2.
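The effect is easy to reproduce. The sketch below formats one fixed instant (epoch 1433022311, picked for this example) once in UTC and once in a zone two hours ahead of it. The zone is written as the POSIX TZ string CEST-2 so no timezone database is needed, and the -d @epoch syntax assumes GNU date.

```shell
#!/bin/bash
fmt='+%Y-%m-%d-%H-%M-%S'
epoch=1433022311   # example instant: 2015-05-30 21:45:11 UTC
# The name the snapshot script produces with --utc.
utc_name=$(date --utc -d "@$epoch" "$fmt")
# The name it would produce with local time in a UTC+2 zone.
local_name=$(TZ='CEST-2' date -d "@$epoch" "$fmt")
echo "UTC name:   $utc_name"
echo "local name: $local_name"
```

The first name is 2015-05-30-21-45-11, the second 2015-05-30-23-45-11. If the second one is read as UTC, the snapshot appears to have been taken two hours after it really was.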