Implementing rotational backups

Backups seem to be the main theme I’m posting about at the moment, and next on my agenda was getting proper rotational backups implemented.

“Proper” meaning incremental (the only data exchanged between client and server is that which has changed) and hard-linked (only data that has changed between backups uses additional storage). For example, if a 1GB folder was backed up 10 times over a year and nothing in it changed, only 1GB of disk space would be used to store all 10 backups (excluding link overhead), and only 1GB of bandwidth would be used transferring files.
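
To see the hard-link saving for yourself, here is a minimal sketch. It assumes a Linux box with GNU cp (for the -l option); the scratch directory and file names are made up purely for the demonstration.

```shell
#!/bin/sh
# Demonstration only: works in a scratch directory with made-up names.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/backup1"

# A 1 MiB file standing in for real data
dd if=/dev/zero of="$demo_dir/backup1/data.bin" bs=1024 count=1024 2>/dev/null

# "Copy" the backup with hard links (GNU cp: -a = archive, -l = link instead of copying)
cp -al "$demo_dir/backup1" "$demo_dir/backup2"

# Both names refer to the same inode, so the data exists on disk once
inode1=$(ls -i "$demo_dir/backup1/data.bin" | awk '{print $1}')
inode2=$(ls -i "$demo_dir/backup2/data.bin" | awk '{print $1}')
echo "backup1 inode: $inode1"
echo "backup2 inode: $inode2"

# du counts each inode once, so the total stays near 1 MiB rather than 2 MiB
total_kb=$(du -sk "$demo_dir" | awk '{print $1}')
echo "total size: ${total_kb} KB"

rm -rf "$demo_dir"
```

This only stays safe because rsync, by default, writes a changed file to a temporary file and renames it into place, so the updated file gets a new inode and the older backups keep the old content. That would not hold if rsync were run with --inplace.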

The main driver for this was the WAN backup I have implemented from my parents’ machine. I wanted to be able to easily recover data that had been deleted up to 3 months ago.

To do this, I decided to write a shell script on my Linux HTPC server. It’s not particularly elegant, but it gets the job done quite nicely. The logic is as follows:

  • First call the rsyncRotate script over SSH from the client, passing the intended destination backup path and, optionally, the number of rotations to keep
  • It will locate the newest sub-folder in the intended path and recursively copy it to a new folder with the current date-time as its name. The copy uses hard-links so that minimal disk space is taken.
  • Delete the oldest folders that are over the specified number of rotations to keep (defaults to 10)
  • Re-link .latest to the new folder (so we can easily find the most recent backup)
  • Return the new target backup path for use by rsync over SSH
  • Use the returned string as the destination for the rsync command on the client.
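
My own client is a Windows machine, but the same steps can be sketched for a Linux client roughly like this. Treat it as an illustration only: the function name rotate_and_backup and the SSH_CMD / RSYNC_CMD variables are my own inventions for the example, and the host and paths are whatever you pass in. I use rsync's -s (--protect-args) option here so the remote path needs no extra quoting.

```shell
#!/bin/sh
# Sketch of the client side on a Linux machine.
# SSH_CMD and RSYNC_CMD can be overridden, e.g. SSH_CMD="ssh -i /path/to/key".
SSH_CMD=${SSH_CMD:-ssh}
RSYNC_CMD=${RSYNC_CMD:-rsync}

# usage: rotate_and_backup host remote_path source_dir [rotations]
rotate_and_backup() {
	host=$1
	remote_path=$2
	source_dir=$3
	rotations=${4:-10}

	# Ask the server to rotate; it prints the new target folder
	target=$($SSH_CMD "$host" "rsyncRotate \"$remote_path\" $rotations") || return 1
	if [ -z "$target" ]; then
		echo "Server returned no target path" >&2
		return 1
	fi

	# Sync into the folder the server just prepared
	$RSYNC_CMD -az --delete -s -e "$SSH_CMD" "$source_dir/" "$host:$target"
}
```

It would be called as something like rotate_and_backup user@server /path/to/remote/backup /home/me/documents 16, where all four values are placeholders.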

It works great, and it is pretty easy to navigate to a point in time to see the state of the backup, as you can see below:

ls -al
drwxr-xr-x 12 owner owner 4096 2012-04-08 00:00 .
drwxr-xr-x 4 owner owner 4096 2012-01-06 18:18 ..
drwxr-xr-x 7 owner owner 20480 2012-01-25 21:12 2012-03-18_00.00.02
drwxr-xr-x 7 owner owner 20480 2012-01-25 21:12 2012-03-25_00.00.04
drwxr-xr-x 6 owner owner 20480 2012-03-26 20:41 2012-04-01_00.00.04
drwxr-xr-x 6 owner owner 20480 2012-03-26 20:41 2012-04-08_00.00.03
lrwxrwxrwx 1 root root 46 2012-04-08 00:00 .latest -> /path/to/backup/2012-04-08_00.00.03
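
With a layout like the listing above, hunting for a deleted file is just a matter of checking each dated folder. A rough sketch, where find_in_snapshots is a helper name I made up for this example and the paths in the usage note are placeholders:

```shell
#!/bin/sh
# Print every snapshot copy of rel_path under backup_root, oldest first.
# The folder names start with YYYY-MM-DD, so the glob's alphabetical order
# is also chronological. The hidden .latest link is not matched by the glob.
find_in_snapshots() {
	backup_root=$1
	rel_path=$2
	for snapshot in "$backup_root"/*/; do
		if [ -e "$snapshot$rel_path" ]; then
			echo "$snapshot$rel_path"
		fi
	done
}
```

For example, find_in_snapshots /path/to/backup Documents/report.doc | tail -1 would print the most recent backup that still contains the file, ready to be copied back.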

Here’s the Linux script

#!/bin/bash
if [ -z "$1" ]; then
	echo "Missing parameter 1: usage rsyncRotate path [numRotations]"
	echo "	path: the path to a folder that should be rotated"
	echo "	numRotations: Optional parameter detailing how many rotations should be maintained - defaults to 10"
	exit 1
fi
# Number of rotations to keep (defaults to 10)
if [ -z "$2" ]; then
	rotations=10
else
	rotations=$2
fi
# Newest existing backup; it becomes the base for the new hard-linked copy
base_folder=$(ls -1rt "$1" | tail -1)
# Remove old backups until there is room for the new one
while [ "${rotations}" -le "$(ls -l "$1" | grep -c ^d)" ]
do
	# Find the oldest file in the directory and remove it
	oldest_dir=$(ls -1t "$1" | tail -1)
	if ! rm -Rf "$1/${oldest_dir}" > /dev/null 2>&1
	then
		echo "${base_folder}"
		exit 1
	fi
done
# Create the new folder, named after the current date-time
new_folder=$(date +%Y-%m-%d_%H.%M.%S)
new_path=$(readlink -mn "$1/${new_folder}")
mkdir "${new_path}"
# Hard-link the previous backup into it (cp -al recurses, so -maxdepth 1 is enough)
if [ -n "${base_folder}" ]; then
	find "$1/${base_folder}" -mindepth 1 -maxdepth 1 -exec cp -al "{}" "${new_path}/" \;
fi
# Relink the latest folder
latest_folder="$1/.latest"
if [ -L "$latest_folder" ]; then
	rm "$latest_folder"
fi
ln -s "${new_path}" "$latest_folder"
echo "${new_path}"

Here’s the Windows backup script

It’s using cwrsync so I can use rsync on Windows. The areas you need to change are the ones marked ** CUSTOMIZE ** and the upper-case placeholder paths.

@ECHO OFF
REM Make environment variable changes local to this batch file
SETLOCAL

REM ** CUSTOMIZE ** Specify where to find rsync and related files (C:\CWRSYNC)
SET CWRSYNCHOME=C:\CWRSYNC

REM Set CYGWIN variable to 'nontsec'. That makes sure that permissions
REM on your windows machine are not updated as a side effect of cygwin
REM operations.
SET CYGWIN=nontsec

REM Set HOME variable to your windows home directory. That makes sure
REM that ssh command creates known_hosts in a directory you have access.
SET HOME=%HOMEDRIVE%%HOMEPATH%

REM Make cwRsync home as a part of system PATH to find required DLLs
SET PATH=%CWRSYNCHOME%\BIN;%PATH%

REM ** CUSTOMIZE ** Remote login (USER@HOSTNAME is a placeholder) and SSH key
SET HOST=USER@HOSTNAME
SET KEY=/cygdrive/c/PATH/TO/SSH/KEY/key.dsa

REM Ask the server to rotate; it prints the new target folder
FOR /F "tokens=1 delims=" %%A in ('ssh -i "%KEY%" %HOST% "rsyncRotate \"/path/to/remote/backup\" 16"') do SET target=%%A
REM Sync the source folder into the new target folder
rsync -az --delete -e "ssh -i \"%KEY%\"" "/cygdrive/c/PATH/TO/BACKUP/SOURCE/" "%HOST%:\"%target%\""
echo Done: %target%

rsyncRotate is the name of the Linux script; the first parameter is the remote backup path and the second is the number of rotations to keep. Make sure the script is on the user’s path, or use its absolute path.
