Posted 26 April 2013 by Brendan Kidwell

Versioned Backup of a Shared Web Hosting Account Using Rsync and Rdiff-Backup

My wife and I have a number of WordPress sites (including this one) hosted by DreamHost's inexpensive shared web hosting service. Two of those sites are newly launched this month and could be vulnerable to all kinds of operator errors due to changing wishes and requirements.

Clearly, I need a daily backup of both sites. I don't want to rely on the vendor's backup and technical support procedures if my wife and I break something.

This tutorial will show you how to deploy a versioned backup solution for web sites running in a shared web hosting environment such as DreamHost.

Update (11 July 2016): The instructions on this page show you how to use rdiff-backup without having it installed on the server that has the data you want backed up. Today I created a script to help you install rdiff-backup on the server (if it's running Ubuntu) even if you don't have root access.

In the hosting environment at your vendor, you need:

  • Any Unix-like operating system
  • rsync
  • ssh server
  • mysqldump (or other tool for making an SQL dump of your database if not using MySQL)

You need another machine on the Internet to perform the backup. It could be in your home or your office.

  • Any Unix-like operating system (or Windows + Cygwin)
  • rsync
  • ssh client
  • cron
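  • rdiff-backup
  • Enough storage space for roughly 2x the size of the data being backed up (one copy for the rsync mirror and one for rdiff-backup's current snapshot), plus the size of all diffs going from the present time back to the first backup run.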
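
To make the storage arithmetic concrete, here is a rough sketch. The function name and the numbers are hypothetical, and it assumes the history grows by about the same amount every day, which your sites may not do.

```shell
#!/bin/bash
# Rough estimate of the disk space needed on the client, in megabytes.
# Assumption: the history grows by roughly the same amount every day.
estimate_backup_mb() {
    local data_mb=$1         # size of the site files plus the SQL dump
    local daily_diff_mb=$2   # guessed size of one day's changes
    local days=$3            # how many days of history you expect to keep
    # mirror copy + current snapshot in history + accumulated diffs
    echo $(( 2 * data_mb + daily_diff_mb * days ))
}

estimate_backup_mb 500 5 365   # prints 2825
```

With a 500 MB site, a guessed 5 MB of changes per day, and a year of history, that comes to a little under 3 GB.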

rdiff-backup performs incremental backups of one folder into another. It keeps a full copy of the most recent snapshot, plus an incremental history working backward from that snapshot. Older versions are stored only as diffs against the current version, and those diffs are stored compactly so that unchanged blocks of data are not saved again.

rdiff-backup is available for most operating systems, but it typically doesn't come already installed on a shared web hosting account; DreamHost is no exception. My solution runs rdiff-backup on my personal server that does the backup work, but it needs a fresh copy of the data to work on, so we use rsync running on the client and the server over ssh to make a mirror copy.

The scripts later on in this article are more readable if you put the ssh parameters into ssh's per-user config file ahead of time. Here's what my config file looks like on the client (the personal server):

~/.ssh/config
Host dh-SITE1-backup
HostName DOMAIN1.TLD
User USER1
 
Host dh-SITE2-backup
HostName DOMAIN2.TLD
User USER2
 
# etc.

Now, when your script makes an ssh connection to dh-SITE1-backup, it doesn't have to specify the hostname, username, or any other parameters in the ssh command line invocation.
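
If you want to confirm that an alias resolves the way you expect before using it in a script, something like the following sketch may help. It assumes an OpenSSH client recent enough to support the -G option (6.8 or later); the function name and the SSH override variable are my own inventions for illustration.

```shell
#!/bin/bash
# Print the host name and user that ssh resolves for a config alias.
# SSH can be overridden for testing; it defaults to the real ssh client.
show_ssh_alias() {
    local alias_name=$1
    # "ssh -G" prints the fully evaluated configuration without connecting
    "${SSH:-ssh}" -G "$alias_name" | awk '$1 == "hostname" || $1 == "user"'
}
```

Running show_ssh_alias dh-SITE1-backup should show the user and hostname lines that ssh will use for that alias.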

In order to run rsync over ssh without being prompted for a password, you need to create a public/private key pair and install the public key on the server, in each account on the server where there is a site you're backing up.

First, on the client (your personal server), run this command if you haven't already created a key pair:

client$ ssh-keygen

Accept all the defaults. Don't set a passphrase, or else your backup script won't be able to run unattended.

Now copy the public key to all the shell accounts on servers you are backing up:

client$ ssh-copy-id dh-SITE1-backup
USER1@DOMAIN1.TLD's password:
Now try logging into the machine, with "ssh 'dh-SITE1-backup'", and check in:

  ~/.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

client$ ssh-copy-id dh-SITE2-backup
USER2@DOMAIN2.TLD's password:
Now try logging into the machine, with "ssh 'dh-SITE2-backup'", and check in:

  ~/.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.
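
Since cron can't type a password, it's worth checking that the key really works without a prompt. This is a minimal sketch, assuming the aliases defined earlier; the function name and the SSH override variable are hypothetical.

```shell
#!/bin/bash
# Return success only if key-based login works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_key_login() {
    local alias_name=$1
    if "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "$alias_name" true; then
        echo "OK: $alias_name accepts the key"
    else
        echo "FAILED: $alias_name still wants a password (or is unreachable)" >&2
        return 1
    fi
}
```

Call it once per alias, for example check_key_login dh-SITE1-backup, before moving on.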

Next, log in to the server and create a script there that dumps the database to a file:

client$ ssh dh-SITE1-backup
server$ mkdir ~/db_backup
server$ touch ~/db_backup/db-dump.sh
server$ chmod +x ~/db_backup/db-dump.sh
server$ nano ~/db_backup/db-dump.sh #Edit the script in nano -- CTRL-X to save and quit
~/db_backup/db-dump.sh
#!/bin/bash
 
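# Absolute path of the folder containing this script
SCRIPT_PATH=$(dirname "$0")
DUMP_PATH="$(readlink -f "$SCRIPT_PATH")/database.sql"
 
# Write the SQL dump next to this script
mysqldump --host=MYSQL_HOSTNAME --user=MYSQL_USERNAME --password="MYSQL_PASSWORD" MYSQL_DB_NAME >"$DUMP_PATH"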

Make sure you fill in the all-caps meta variables above starting with MYSQL_.

If you are using a database engine other than MySQL, look up how to dump the database to a file and change the script accordingly.

Run ~/db_backup/db-dump.sh on the server and make sure you get an SQL file in that folder.

Repeat this entire step for any additional sites you have if you are backing up more than one site.
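
One possible refinement, not part of my original script: if the dump command fails partway through, the redirect leaves a truncated database.sql behind, and the next sync will happily back it up. The sketch below writes to a temporary file and only replaces the real dump on success. The function name dump_atomically is hypothetical.

```shell
#!/bin/bash
# Run a dump command and replace the target file only if the dump succeeded.
# Usage: dump_atomically TARGET_FILE COMMAND [ARGS...]
dump_atomically() {
    local target=$1
    shift
    local tmp="$target.tmp"
    if "$@" >"$tmp"; then
        mv "$tmp" "$target"    # a rename within one filesystem is atomic
    else
        rm -f "$tmp"
        echo "Dump failed; keeping the previous $target" >&2
        return 1
    fi
}
```

In db-dump.sh you would then call dump_atomically "$DUMP_PATH" followed by the same mysqldump command and arguments, without the output redirect.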

On the client, create a folder structure to store your backup. Then create a script to use rsync to sync the remote files and database to a local mirror and then rdiff-backup to make a versioned backup of the mirror.

client$ mkdir -p ~/Backup/DreamHost/SITE1/mirror/db #Make the folder and any missing parent folders
client$ mkdir ~/Backup/DreamHost/SITE1/mirror/www
client$ mkdir ~/Backup/DreamHost/SITE1/history
client$ touch ~/Backup/DreamHost/SITE1/pull.sh
client$ chmod +x ~/Backup/DreamHost/SITE1/pull.sh
client$ nano ~/Backup/DreamHost/SITE1/pull.sh
~/Backup/DreamHost/SITE1/pull.sh
#!/bin/bash
 
function announce {
   echo " "
   echo " "
   echo ------------------------------------------------------------
   echo "$*"
   date
   echo ------------------------------------------------------------
   echo " "
}
 
cd "$(dirname "$0")"
 
announce "Starting up"
echo Working path:
pwd
 
announce "Writing remote database backup"
ssh dh-SITE1-backup db_backup/db-dump.sh
announce "Syncing files"
rsync --progress --archive --rsh=ssh dh-SITE1-backup:"~/SITE1_WEB_FOLDER" mirror/www
announce "Syncing database"
rsync --progress --archive --rsh=ssh dh-SITE1-backup:"~/db_backup/database.sql" mirror/db/database.sql
 
announce "Performing versioned backup"
rdiff-backup mirror history
 
announce "Done"

Make sure you fill in the values for SITE1 and SITE1_WEB_FOLDER according to your site's name and its path on the server.
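
The script above carries on even if a step fails, so a broken rsync run could still be recorded as a new version. If you'd rather stop at the first failure, a small helper along these lines could wrap each step. This is a sketch, not what my own script does, and run_step is a made-up name.

```shell
#!/bin/bash
# Run one backup step; abort the whole script if the step fails, so that
# rdiff-backup never records a half-synced mirror as a new version.
# Usage: run_step LABEL COMMAND [ARGS...]
run_step() {
    local label=$1
    shift
    echo "== $label: $(date)"
    if ! "$@"; then
        echo "!! $label failed; aborting" >&2
        exit 1
    fi
}
```

Inside pull.sh, each command would be prefixed with a label, for example run_step "Performing versioned backup" rdiff-backup mirror history.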

Run pull.sh and make sure you get a full backup in mirror and a clone in history. (Future runs of the script will refresh the clone in history while saving a chain of reverse diffs in the same folder, working back from the present time.)
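
A backup is only useful if you can get files back out of it. Here is a minimal sketch of listing and restoring versions, assuming the classic rdiff-backup command-line options --list-increments and --restore-as-of. The wrapper names and the RDIFF_BACKUP override variable are hypothetical.

```shell
#!/bin/bash
# Thin wrappers around two rdiff-backup operations.
# RDIFF_BACKUP can be overridden for testing; it defaults to rdiff-backup.

# List the points in time that can be restored from a history folder.
list_versions() {
    "${RDIFF_BACKUP:-rdiff-backup}" --list-increments "$1"
}

# Restore one file or folder as it was at a given time, e.g. "3D" = 3 days ago.
# Usage: restore_as_of TIME PATH_INSIDE_HISTORY DESTINATION
restore_as_of() {
    "${RDIFF_BACKUP:-rdiff-backup}" --restore-as-of "$1" "$2" "$3"
}
```

For example, restore_as_of 3D history/db/database.sql ~/old-database.sql should pull out the SQL dump from three days ago. Pick a destination that doesn't exist yet, so nothing gets overwritten.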

Since this script will be running from a cron task, it's good to create a wrapper script that redirects all output to a log file, so you can see what went wrong if it fails.

client$ touch ~/Backup/DreamHost/SITE1/run.sh
client$ chmod +x ~/Backup/DreamHost/SITE1/run.sh
client$ nano ~/Backup/DreamHost/SITE1/run.sh
~/Backup/DreamHost/SITE1/run.sh
#!/bin/bash
 
cd "$(dirname "$0")"
./pull.sh &>backup.log

Run that file to make sure you get a log file.

Repeat this entire step for any additional sites you have if you are backing up more than one site.
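
The wrapper above overwrites backup.log on every run, so you only ever see the latest night. If you'd like to keep a few weeks of logs, something like this sketch could work. It assumes a find that supports -mtime and -delete (GNU and BSD find both do); the function names and the logs folder are hypothetical.

```shell
#!/bin/bash
# Build a dated log file name, e.g. logs/backup-2013-04-26.log
dated_log_name() {
    local log_dir=$1
    echo "$log_dir/backup-$(date +%Y-%m-%d).log"
}

# Delete dated logs older than a given number of days.
prune_old_logs() {
    local log_dir=$1
    local keep_days=$2
    find "$log_dir" -name 'backup-*.log' -type f -mtime +"$keep_days" -delete
}
```

In run.sh you would create a logs folder, redirect the output of pull.sh to "$(dated_log_name logs)", and finish with prune_old_logs logs 30.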

The last step is to create a wrapper of the wrapper scripts to run all of them from a single scheduled cron task.

client$ touch ~/Backup/DreamHost/run-all.sh
client$ chmod +x ~/Backup/DreamHost/run-all.sh
client$ nano ~/Backup/DreamHost/run-all.sh
~/Backup/DreamHost/run-all.sh
#!/bin/bash
 
"$(dirname "$0")"/SITE1/run.sh
"$(dirname "$0")"/SITE2/run.sh
# etc.
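
If you add sites often, listing each one by hand gets tedious. As an alternative sketch, the loop below runs every run.sh it finds one level below a base folder. It assumes every site folder follows the SITE/run.sh layout used in this article; run_all_sites is a made-up name.

```shell
#!/bin/bash
# Run every SITE/run.sh found directly under a base folder, one after another.
run_all_sites() {
    local base=$1
    local script
    for script in "$base"/*/run.sh; do
        [ -x "$script" ] || continue    # skip folders without a runnable wrapper
        "$script" || echo "Backup failed: $script" >&2
    done
}
```

In run-all.sh the two explicit lines would be replaced by run_all_sites "$(dirname "$0")". A failure in one site is reported but doesn't stop the others.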

Now pick a random time early in the morning – it's probably better not to pick an exact '00' or '30' minutes past an hour. I chose 03:13 in the client's local time. Create a cron task:

client$ crontab -e
CRONTAB FILE
13 3 * * * /home/MY_USERNAME/Backup/DreamHost/run-all.sh

Make sure you fill in MY_USERNAME. For the exact syntax of the time fields in the crontab file, see the crontab(5) man page (man 5 crontab).

Now you should be all set. Final test: wait until tomorrow and check to make sure it ran correctly on schedule.
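
Rather than remembering to look every morning, you could check the age of the log file. This sketch assumes a find that supports -mmin (GNU and BSD find both do) and uses a hypothetical function name; 1500 minutes is 25 hours, which leaves some slack for a daily job.

```shell
#!/bin/bash
# Succeed if the given log file was modified within the last N minutes.
backup_is_fresh() {
    local log_file=$1
    local max_age_minutes=$2
    [ -f "$log_file" ] || return 1
    # find prints the path only when the file is newer than the limit
    [ -n "$(find "$log_file" -mmin -"$max_age_minutes")" ]
}
```

For example, backup_is_fresh ~/Backup/DreamHost/SITE1/backup.log 1500 || echo "SITE1 backup is stale" prints a warning when last night's run didn't happen.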
