$DragonFly: src/bin/cpdup/BACKUPS,v 1.3 2007/05/13 22:25:41 swildner Exp $

INCREMENTAL BACKUP HOWTO

This document describes one of several ways to set up a LAN backup and
an off-site WAN backup system using cpdup's hardlinking capabilities.

The features described in this document are also encapsulated in scripts
which can be found in the scripts/ directory. These scripts can be used
to automate all backup steps except for the initial preparation of the
directory topology on the backup and off-site machines. Operation of
these scripts is described in the last section of this document.


PART 1 - PREPARE THE LAN BACKUP BOX

The easiest way to create a LAN backup box is to NFS mount all your
backup clients onto the backup box. It is also possible to use cpdup's
remote host feature to access your client boxes, but that requires root
access to the client boxes and is not described here.

Create a directory on the backup machine called /nfs, a subdirectory
for each remote client, and subdirectories for each partition on each
client. Remember that cpdup does not cross mount points, so you will
need a mount for each partition you wish to back up.
For example:

    [ ON LAN BACKUP BOX ]

    mkdir /nfs
    mkdir /nfs/box1
    mkdir /nfs/box1/home
    mkdir /nfs/box1/var

Before you actually do the NFS mount, create a dummy file for each
mount point that can be used by scripts to detect when an NFS mount
has not been done. Once the mount is in place it hides the dummy file,
so a visible NOT_MOUNTED file means the mount is missing. Scripts can
thus avoid a common failure scenario and not accidentally cpdup an empty
mount point to the backup partition (destroying that day's backup in
the process).

    touch /nfs/box1/home/NOT_MOUNTED
    touch /nfs/box1/var/NOT_MOUNTED

Once the directory structure has been set up, do your NFS mounts and
also add them to your fstab. Since you will probably wind up with a
lot of mounts it is a good idea to use 'ro,bg' (readonly, background
mount) in the fstab entries.

    mount box1:/home /nfs/box1/home
    mount box1:/var /nfs/box1/var

You should create a huge /backup partition on your backup machine which
is capable of holding all your mirrors. Create a subdirectory called
/backup/mirrors in your huge backup partition.

    mount <huge_disk> /backup
    mkdir /backup/mirrors


PART 2 - DOING A LEVEL 0 BACKUP

(If you use the supplied scripts, a level 0 backup can be accomplished
simply by running the 'do_mirror' script with an argument of 0.)

Create a level 0 backup using a standard cpdup with no special arguments
other than -i0 -s0 (tell it not to ask questions and turn off the
file-overwrite-with-directory safety feature). Name the mirror with
the date in a string-sortable format.

    set date = `date "+%Y%m%d"`
    mkdir /backup/mirrors/box1.${date}
    cpdup -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home
    cpdup -i0 -s0 /nfs/box1/var /backup/mirrors/box1.${date}/var

Create a softlink to the most recently completed backup, which is your
level 0 backup. Note that using 'ln -sf' will create a link in the
subdirectory pointed to by the current link, not replace the current
link. 'ln -shf' can be used to replace the link but is not portable.
'mv -f' has the same problem as 'ln -sf'. Removing the old link first
and then creating the new one avoids all of this.

    sync
    rm -f /backup/mirrors/box1
    ln -s /backup/mirrors/box1.${date} /backup/mirrors/box1


PART 3 - DO AN INCREMENTAL BACKUP

An incremental backup is exactly the same as a level 0 backup EXCEPT
you use the -H option to specify the location of the most recent
completed backup. We simply maintain the handy softlink pointing at
the most recent completed backup and the cpdup required to do this
becomes trivial.

Each day's incremental backup will reproduce the ENTIRE directory topology
for the client, but cpdup will hardlink files from the most recent backup
instead of copying them and this is what saves you all the disk space.

    set date = `date "+%Y%m%d"`
    # The softlink may hold a full path; :t keeps only the last component.
    set current = `readlink /backup/mirrors/box1`
    if ( "${current:t}" == "box1.${date}" ) then
        echo "silly boy, an incremental already exists for today"
        exit 1
    endif
    mkdir /backup/mirrors/box1.${date}
    cpdup -H /backup/mirrors/box1 \
        -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home

Be sure to update your 'most recent backup' softlink, but only do it
if the cpdup runs for all the partitions for that client have succeeded.
That way the next incremental backup will always be based on the most
recent complete one.

    rm -f /backup/mirrors/box1
    ln -s /backup/mirrors/box1.${date} /backup/mirrors/box1

Since these backups are mirrors, locating a backup is as simple
as CDing into the appropriate directory. If your filesystem has a
hardlink limit and cpdup hits it, cpdup will 'break' the hardlink
and copy the file instead. Generally speaking only a few special cases
will hit the hardlink limit for a filesystem. For example, the
CVS/Root file in a checked out cvs repository is often hardlinked, and
the sheer number of hardlinked 'Root' files multiplied by the number
of backups can often hit the filesystem hardlink limit.


PART 4 - DO AN INCREMENTAL VERIFIED BACKUP

Since your incremental backups use hardlinks heavily, the actual file
might exist on the physical /backup disk in only one place even though
it may be present in dozens of daily mirrors. To ensure that the
file being hardlinked has not been corrupted, cpdup's -f option can be
used in conjunction with -H to force cpdup to validate the contents
of the file, even if all the stat info looks identical.

    cpdup -f -H /backup/mirrors/box1 ...

You can create completely redundant (non-hardlink-dependent) backups
by doing the equivalent of your level 0, i.e. not using -H.
However, I do NOT recommend that you do this at all, or at least not
very often (maybe once every 6 months at the most), because each mirror
created this way will have a distinct copy of all the file data and you
will quickly run out of space in your /backup partition.


MAINTENANCE OF THE "/backup" DIRECTORY

Now, clearly you are going to run out of space in /backup if you keep
doing this, but you may be surprised at just how many daily incrementals
you can create before you fill up your /backup partition.

If /backup becomes full, simply start rm -rf'ing older mirror directories
until enough space is freed up. Removing a mirror only frees the data
of files that no other mirror still hardlinks. You do not have to remove
the oldest directory first. In fact, you might want to keep it around
and remove a day's backup here, a day's backup there, etc, until you
free up enough space.


OFF-SITE BACKUPS

Making an off-site backup involves similar methodology, but you use
cpdup's remote host capability to generate the backup. To avoid
complications it is usually best to take a mirror already generated on
your LAN backup box and copy that to the remote box.

The remote backup box does not use NFS, so setup is trivial. Just
create your super-large /backup partition and mkdir /backup/mirrors.
Your LAN backup box will need root access via ssh to your remote backup
box.

You can use the handy softlink to get the latest 'box1.date' mirror
directory and since the mirror is all in one partition you can just
cpdup the entire machine in one command. Use the same dated directory
name on the remote box, so:

    # latest will wind up something like 'box1.20060915'
    # (:t strips the directory part if the softlink holds a full path)
    set latest = `readlink /backup/mirrors/box1`
    set latest = ${latest:t}
    cpdup -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest

As with your LAN backup, create a softlink on the remote backup box
denoting the latest mirror for any given site. Do this only if the
cpdup succeeded.

    if ( $status == 0 ) then
        ssh remote.box -n \
            "rm -f /backup/mirrors/box1; ln -s /backup/mirrors/$latest /backup/mirrors/box1"
    endif

Incremental backups can be accomplished using the same cpdup command,
but adding the -H option pointing at the latest backup on the remote
box. Note that the -H path is relative to the remote box, not the LAN
backup box you are running the command from.

    # :t strips the directory part if a softlink holds a full path
    set latest = `readlink /backup/mirrors/box1`
    set latest = ${latest:t}
    set remotelatest = `ssh remote.box -n "readlink /backup/mirrors/box1"`
    set remotelatest = ${remotelatest:t}
    if ( "$latest" == "$remotelatest" ) then
        echo "silly boy, you already made a remote incremental backup today"
        exit 1
    endif
    cpdup -H /backup/mirrors/$remotelatest \
        -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest
    if ( $status == 0 ) then
        ssh remote.box -n \
            "rm -f /backup/mirrors/box1; ln -s /backup/mirrors/$latest /backup/mirrors/box1"
    endif

Cleaning out the remote directory works the same as cleaning out the LAN
backup directory.


RESTORING FROM BACKUPS

Each backup is a full filesystem mirror, and depending on how much space
you have you should be able to restore it simply by cd'ing into the
appropriate backup directory and using 'cpdup blah box1:blah' (assuming
root access), or you can export the backup directory via NFS to your
client boxes and use cpdup locally on the client to extract the backup.
Using NFS is probably the most efficient solution.


PUTTING IT ALL TOGETHER - SOME SCRIPTS

Please refer to the scripts in the scripts/ subdirectory. These scripts
are EXAMPLES ONLY.
If you want to use them, put them in your ~root/adm
directory on your backup box and set up a root crontab.

First follow the preparation rules in PART 1 above. The scripts do not
do this automatically. Edit the 'params' file that the scripts use
to set default paths and such.

    ** FOLLOW DIRECTIONS IN PART 1 ABOVE TO SET UP THE LAN BACKUP BOX **

Copy the scripts to ~/adm. Do NOT install a crontab yet (but an example
can be found in scripts/crontab).

Do a manual level 0 LAN BACKUP using the do_mirror script.

    cd ~/adm
    ./do_mirror 0

Once done you can do incremental backups using './do_mirror 1' to do a
verified incremental, or './do_mirror 2' to do a stat-optimized
incremental. You can enable the cron jobs that run do_mirror and
do_cleanup now.

--

Setting up an off-site backup box is trivial. The off-site backup box
needs to allow root ssh logins from the LAN backup box (at least for
now, sorry!). Set up the off-site backup directory, typically
/backup/mirrors. Then do a level 0 backup from your LAN backup box
to the off-site box using the do_remote script.

    cd ~/adm
    ./do_remote 0

Once done you can do incremental backups using './do_remote 1' to do a
verified incremental, or './do_remote 2' to do a stat-optimized
incremental. You can enable the cron jobs that run do_remote now.

NOTE! It is NOT recommended that you use verified-incremental backups
over a WAN, as all related data must be copied over the wire every single
day. Instead, I recommend sticking with stat-optimized backups
(./do_remote 2).

You will also need to set up a daily cleaning script on the off-site
backup box.

SCRIPT TODOS - the ./do_cleanup script is not very smart. We really
should do a tower-of-hanoi removal.
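
If you write your own scripts instead, the NOT_MOUNTED check described
in PART 1 is the piece most worth getting right. The following is a
minimal sketch of it, not taken from the supplied scripts. It is written
in POSIX sh rather than the csh used in the examples above, and the
function names is_mounted and backup_partition are made up for
illustration.

```shell
#!/bin/sh
# Sketch only: is_mounted and backup_partition are hypothetical helper
# names, not part of cpdup or of the supplied scripts.

# Succeed only if $1 is a directory and the NOT_MOUNTED marker is not
# visible in it. A mounted filesystem hides the marker that was
# touch'ed on the bare mount point.
is_mounted() {
    [ -d "$1" ] && [ ! -e "$1/NOT_MOUNTED" ]
}

# Mirror one partition ($1) to its destination ($2), refusing to run
# cpdup against a bare mount point.
backup_partition() {
    if ! is_mounted "$1"; then
        echo "skipping $1: NFS mount is missing" >&2
        return 1
    fi
    cpdup -i0 -s0 "$1" "$2"
}
```

A script would call backup_partition once per partition, for example
with /nfs/box1/home and /backup/mirrors/box1.${date}/home, and update
the softlink only if every call returned 0.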