Incremental daily backup with little space penalty – Time Machine on Linux

I have been a fan of Apple’s Time Machine backup system since its introduction, and I’ve always wanted to implement something similar on my server. Now that I have a nice, spacious NAS disk, space has stopped being a problem (at least for a little while). A little googling shows that rsync has a special option, --link-dest, that lets me implement Time Machine’s method of incremental backups. With it, rsync uses an existing backup as an additional reference when comparing files: if a file has not changed since the last backup, rsync creates a hard link to the existing copy instead of copying the file again. This way I should end up with daily directories containing all the files, but only new or changed files will eat up disk space.

Let’s give it a try. My backup script looks like this:

#!/bin/bash
date=$(date "+%Y-%m-%d")

# MySQL dump: one gzipped file per database directory
for i in /var/lib/mysql/*/; do
    dbname=$(basename "$i")
    /usr/bin/mysqldump -u root -pyourpasshere "$dbname" | gzip -c > "/home/mysql/$dbname-$date.sql.gz"
done

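# Rsync (the relative source ./mysql/ assumes the script is run from /home)
# A relative --link-dest path is resolved against the destination directory.
rsync -arpvogDtSWz \
    -e "ssh" \
    --delete \
    --link-dest=../mysql-current \
    ./mysql/ \
    root@nas:/path/to/backups/mysql-$date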

# Move the mysql-current link up one day
ssh root@nas \
    "cd /path/to/backups/ && rm -f mysql-current && ln -s mysql-$date mysql-current"

The script dumps all databases to a directory on the local hard drive, then rsyncs that directory to a dated folder on the NAS, using mysql-current (a symlink to the latest daily backup) as an additional source of files (the --link-dest parameter).

I ran the script and then checked the results:

root@nas# du -sh mysql-*
92M mysql-2011-05-16
19M mysql-2011-05-17
512 mysql-current

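It looks like the disk usage of the newer directory does indeed reflect only today’s copies of the databases. (du counts a hard-linked file only once per run, so files shared with the older directory are charged to whichever directory it visits first.) To confirm that the hard links work as they should, I deleted one file in the oldest directory. And now: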

root@nas# du -sh mysql-*
86M mysql-2011-05-16
25M mysql-2011-05-17
512 mysql-current

The older directory’s size dropped but the newer directory’s size went up: the same file was hard linked there too, and that directory now holds the only remaining link to the file’s data on disk.
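
The same behaviour can be reproduced with a single hand-made hard link. This is just an illustrative sketch; the directory names old and new and the file db.sql.gz are invented for the example.

```bash
#!/bin/bash
# Two directory entries, one copy of the data on disk.
tmp=$(mktemp -d)
mkdir "$tmp/old" "$tmp/new"
echo "dump contents" > "$tmp/old/db.sql.gz"

# ln without -s creates a hard link: a second name for the same inode.
ln "$tmp/old/db.sql.gz" "$tmp/new/db.sql.gz"
ls -li "$tmp"/*/db.sql.gz    # both lines show the same inode number

# Removing one name does not remove the data while another link exists.
rm "$tmp/old/db.sql.gz"
cat "$tmp/new/db.sql.gz"     # the contents are still readable
ls -li "$tmp/new/db.sql.gz"  # the link count is now back to 1
```

Disk space is released only when the last name pointing at the data is removed.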

That’s exactly what I wanted to achieve.
Now, to fully implement Time Machine’s mechanism, I need to write a script that calculates the amount of free space needed on the backup drive and deletes the oldest backups until enough space is available to move on with the transfer (as you can see above, simply deleting one file may not free up any disk space).
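
A first rough sketch of such a pruning step might look like the following. It is untested against my real backups and rests on several assumptions: it runs on the NAS itself, the backups follow the mysql-YYYY-MM-DD naming used above, and the caller already knows how much space the next transfer needs. The function names free_kb and prune_backups are hypothetical.

```bash
#!/bin/bash
# Sketch only: free_kb and prune_backups are hypothetical helper names.

# Print the free space, in KiB, of the filesystem that holds directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# prune_backups DIR NEEDED_KB
# Remove mysql-YYYY-MM-DD directories in DIR, oldest first, until at least
# NEEDED_KB KiB are free. The newest backup is never removed because the next
# run needs it as the --link-dest reference.
prune_backups() {
    local dir=$1 needed_kb=$2 i=0
    # ISO dates sort chronologically, so the glob result is oldest first.
    local backups=("$dir"/mysql-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
    local last=$(( ${#backups[@]} - 1 ))

    while [ "$(free_kb "$dir")" -lt "$needed_kb" ] && [ "$i" -lt "$last" ]; do
        # Hard-linked files free no space until their last link is gone,
        # so free space is measured again after every removal.
        echo "Removing ${backups[$i]}"
        rm -rf -- "${backups[$i]}"
        i=$((i + 1))
    done

    # Succeed only if the target was actually reached.
    [ "$(free_kb "$dir")" -ge "$needed_kb" ]
}

# Example (hypothetical numbers): make sure 1 GiB is free before the transfer.
# prune_backups /path/to/backups 1048576 || echo "still not enough space" >&2
```

Measuring free space after each deletion, rather than adding up directory sizes beforehand, avoids having to work out which files are shared between backups.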
