The other day, a friend of mine who runs a small business asked me how to avoid losing files when hardware dies. The obvious choice is to store important data on a little NAS that tolerates disk failure, but on top of that, I wanted to add snapshot-based backups for a little extra safety, integrated with Samba's shadow_copy2 VFS module so that the Restore Previous Versions feature found in recent Windows versions works. But naturally, there are unexpected caveats.
First of all, let's take a quick look at the way shadow_copy2 works. Basically, it's a little Samba VFS module that looks for snapshots of given files or directories. Clients running Windows 7 onward display a context menu entry named "Restore Previous Versions" that offers to open, copy or restore a previous version of the selected file.
In order to enable this module, you need to add a few options to smb.conf:
[global]
vfs objects = shadow_copy2

[share]
path = /media/share
shadow:snapdir = .snapshots
shadow:format = %Y-%m-%d-%H-%M-%S
This configuration will share the volume mounted at /media/share, and export all the snapshots under /media/share/.snapshots as previous versions. The "date modified" column in the GUI is populated by parsing the snapshot directory names, so snapshots need to be named accordingly, for example "2015-05-30-21-45-35".
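As a quick sanity check, a name in that format can be produced with date(1), and a small helper can tell whether a directory name has the right shape. This is only a sketch: the function is_snapshot_name is made up for illustration and is not part of Samba, and the pattern checks only the shape of the name (digits and dashes), not whether it is a valid date or whether shadow_copy2 will accept it.

```shell
#!/bin/bash
# Hypothetical helper (not part of Samba): succeeds if the argument has the
# shape produced by the format string %Y-%m-%d-%H-%M-%S.
is_snapshot_name() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{4}(-[0-9]{2}){5}$'
}

# Generate a name for "now" with the same format string as shadow:format.
name=$(date +%Y-%m-%d-%H-%M-%S)

# The last candidate is an arbitrary example of a name that does not fit.
for candidate in "$name" "2015-05-30-21-45-35" "backup-from-friday"; do
    if is_snapshot_name "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: does not match"
    fi
done
```

The first two candidates should be reported as ok and the last one as not matching.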
Which snapshot technology to use?
Btrfs and LVM come to mind. I'd really love to use Btrfs, mainly because LVM is not aware of what's happening in the file system, so I'd have to allocate a fixed size for each snapshot volume up front. Problem is, I have only a vague idea how much data we're talking about, and nobody knows how much of it will change over what period of time, so preallocating sounds wasteful and error-prone and hence, LVM sucks.
But the plain truth is, I don't trust Btrfs enough just yet. The whole point of this setup is data safety, and I won't accept any risks just for my personal amusement. So what else can we do?
Good old rsnapshot sounds like the perfect solution: it requires no preallocation, and it doesn't waste much space because it uses hardlinks to avoid storing duplicates of unchanged files. And it is easily configured, too: just apt-get install rsnapshot, add a backup line to /etc/rsnapshot.conf, and you're done. rsnapshot will now create snapshots at /var/cache/rsnapshot.
Unfortunately, those snapshots are named "hourly.0", "hourly.1", "daily.0", "daily.1", "weekly.0" etc., a naming convention that does not work for shadow_copy2. At all.
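To see why hardlinks save space, here is a tiny experiment in a throwaway directory. It is only an illustration of the hardlink idea, not the way rsnapshot is actually invoked; the directory names snap.0 and snap.1 are made up, and it assumes GNU coreutils (for cp -al and stat -c).

```shell
#!/bin/bash
# Demonstrates hardlink-based deduplication in a temporary directory.
demo=$(mktemp -d)
mkdir "$demo/snap.0"
echo "important data" > "$demo/snap.0/file.txt"

# cp -al (GNU coreutils) recreates the directory tree but hardlinks the
# files instead of copying their contents.
cp -al "$demo/snap.0" "$demo/snap.1"

# Both names refer to the same inode, so the data is stored only once.
inode0=$(stat -c %i "$demo/snap.0/file.txt")
inode1=$(stat -c %i "$demo/snap.1/file.txt")
links=$(stat -c %h "$demo/snap.0/file.txt")
echo "inodes: $inode0 $inode1, link count: $links"

# Clean up the throwaway directory.
rm -rf "$demo"
```

On a file system that supports hardlinks, the two inode numbers should be identical and the link count should be 2: two directory entries, one copy of the data.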
So, rsnapshot doesn't work either. But well, rsnapshot just uses rsync, so how hard can it be to build something that also uses hardlinks and names its directories in a way that works for shadow_copy2?
bash to the rescue
Luckily, someone posted an answer to a completely different problem that can be abused to do what we need. By adapting Benjamin's script just a tiny little bit, we should be able to get our snapshots. To try it out, we simply snapshot and share /etc. So, let's put this modified version into /usr/local/bin/snapshot:
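A minimal version of that script might look like this. It is a sketch pieced together from the commands visible in the trace of the first run further down; the shebang, the set -x line, the quoting and the error check on cd are assumptions, and the original adaptation may differ in detail.

```shell
#!/bin/bash
# Sketch of /usr/local/bin/snapshot, reconstructed from the trace of the
# first run. set -x is assumed because that trace echoes every command.
set -x

BACKUPDIR=/var/spool/snapshots

# Directory name in the same format as shadow:format in smb.conf.
today=$(date +%Y-%m-%d-%H-%M-%S)

# Copy /etc into a new dated directory; files unchanged since the previous
# snapshot are hardlinked against "latest" instead of being stored again.
rsync -a --link-dest="$BACKUPDIR/latest/" /etc/ "$BACKUPDIR/$today"

# Point the "latest" symlink at the snapshot we just made.
cd "$BACKUPDIR" || exit 1
rm -f latest
ln -s "$today" latest
```

After saving it, the file needs to be made executable, for example with chmod +x /usr/local/bin/snapshot.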
Now, let's create the backup directory and pre-populate it with an empty "latest" backup, so rsync has something to diff against:
mkdir -p /var/spool/snapshots
cd /var/spool/snapshots
mkdir first
ln -s first latest
Create the first snapshot:
root@damien:~$ snapshot
+ BACKUPDIR=/var/spool/snapshots
++ date +%Y-%m-%d-%H-%M-%S
+ today=2015-05-30-21-45-11
+ rsync -a --link-dest=/var/spool/snapshots/latest/ /etc/ /var/spool/snapshots/2015-05-30-21-45-11
+ cd /var/spool/snapshots
+ rm -f latest
+ ln -s 2015-05-30-21-45-11 latest
Now the directory should look something like this:
root@damien:~$ la /var/spool/snapshots/
insgesamt 80
drwxr-xr-x   8 root root  4096 Mai 30 21:55 .
drwxr-xr-x  10 root root  4096 Mai 30 21:40 ..
drwxr-xr-x   2 root root  4096 Mai 30 21:40 first
drwxr-xr-x 188 root root 12288 Mai 30 21:41 2015-05-30-21-45-11
lrwxrwxrwx   1 root root    19 Mai 30 21:55 latest -> 2015-05-30-21-45-11
Nice, we've made our first snapshot, and it even has a name that shadow_copy2 can parse the date from!