$DragonFly: src/bin/cpdup/BACKUPS,v 1.3 2007/05/13 22:25:41 swildner Exp $

			    INCREMENTAL BACKUP HOWTO

    This document describes one of several ways to set up a LAN backup and
    an off-site WAN backup system using cpdup's hardlinking capabilities.

    The features described in this document are also encapsulated in scripts
    which can be found in the scripts/ directory.  These scripts can be used
    to automate all backup steps except for the initial preparation of the
    backup and off-site machines' directory topology.  Operation of these
    scripts is described in the last section of this document.


		    PART 1 - PREPARE THE LAN BACKUP BOX

    The easiest way to create a LAN backup box is to NFS mount all your
    backup clients onto the backup box.  It is also possible to use cpdup's
    remote host feature to access your client boxes, but that requires root
    access to the client boxes and is not described here.

    Create a directory on the backup machine called /nfs, a subdirectory
    for each remote client, and subdirectories for each partition on each
    client.  Remember that cpdup does not cross mount points, so you will
    need a mount for each partition you wish to back up.  For example:

	[ ON LAN BACKUP BOX ]

	mkdir /nfs
	mkdir /nfs/box1
	mkdir /nfs/box1/home
	mkdir /nfs/box1/var

    Before you actually do the NFS mount, create a dummy file for each
    mount point that scripts can use to detect when an NFS mount has not
    been done.  The file is visible only while nothing is mounted over the
    directory, so scripts can avoid a common failure scenario and not
    accidentally cpdup an empty mount point to the backup partition
    (destroying that day's backup in the process).

	touch /nfs/box1/home/NOT_MOUNTED
	touch /nfs/box1/var/NOT_MOUNTED
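
    A minimal sketch of how a script might test for the dummy file before
    mirroring.  It is written for POSIX sh rather than csh, and the function
    names nfs_is_mounted and mounted_partitions are made up for this
    illustration; they are not part of cpdup or of the supplied scripts.

```sh
#!/bin/sh
# Return 0 if a filesystem appears to be mounted on "$1", i.e. the
# NOT_MOUNTED dummy file created beforehand is hidden by the mount.
nfs_is_mounted() {
    [ -d "$1" ] && [ ! -e "$1/NOT_MOUNTED" ]
}

# Print every partition directory under a client directory that is safe
# to mirror, and complain about the others on stderr.
mounted_partitions() {
    for fs in "$1"/*; do
        [ -d "$fs" ] || continue
        if nfs_is_mounted "$fs"; then
            echo "$fs"
        else
            echo "skipping $fs: not mounted" >&2
        fi
    done
}
```

    For example, 'mounted_partitions /nfs/box1' would list only those
    partitions of box1 whose NFS mounts are actually in place.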

    Once the directory structure has been set up, do your NFS mounts and
    also add them to your fstab.  Since you will probably wind up with a
    lot of mounts it is a good idea to use 'ro,bg' (readonly, background
    mount) in the fstab entries.

	mount box1:/home /nfs/box1/home
	mount box1:/var /nfs/box1/var

    You should create a huge /backup partition on your backup machine which
    is capable of holding all your mirrors.  Create a subdirectory called
    /backup/mirrors in your huge backup partition.

	mount <huge_disk> /backup
	mkdir /backup/mirrors


			PART 2 - DOING A LEVEL 0 BACKUP

    (If you use the supplied scripts, a level 0 backup can be accomplished
    simply by running the 'do_mirror' script with an argument of 0.)

    Create a level 0 backup using a standard cpdup with no special arguments
    other than -i0 -s0 (tell it not to ask questions and turn off the
    file-overwrite-with-directory safety feature).  Name the mirror with
    the date in a string-sortable format.

	set date = `date "+%Y%m%d"`
	mkdir /backup/mirrors/box1.${date}
	cpdup -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home
	cpdup -i0 -s0 /nfs/box1/var /backup/mirrors/box1.${date}/var

    Create a softlink to the most recently completed backup, which is your
    level 0 backup.  Make the link target relative (just the dated directory
    name) so that readlink returns 'box1.<date>', which is the form the
    later steps compare against and append to /backup/mirrors.  Note that
    using 'ln -sf' will create a link in the subdirectory pointed to by the
    current link, not replace the current link.  'ln -shf' can be used to
    replace the link but is not portable.  'mv -f' has the same problem.

	sync
	rm -f /backup/mirrors/box1
	ln -s box1.${date} /backup/mirrors/box1
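
    The link replacement can be wrapped in a small helper.  This is only a
    sketch in POSIX sh; update_latest_link is a hypothetical name, and the
    check that the dated directory exists is an extra precaution that the
    commands above do not perform.

```sh
#!/bin/sh
# Point <mirrors>/<client> at <client>.<date> using a relative target.
# The old link is removed first because 'ln -sf' would follow it and
# create the new link inside the directory it points to.
# usage: update_latest_link <mirrors> <client> <date>
update_latest_link() {
    mirrors=$1
    client=$2
    date=$3
    # Refuse to move the link to a mirror that does not exist.
    [ -d "$mirrors/$client.$date" ] || return 1
    rm -f "$mirrors/$client"
    ln -s "$client.$date" "$mirrors/$client"
}
```

    With this helper the last two commands above would become
    'update_latest_link /backup/mirrors box1 <date>'.
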

			PART 3 - DO AN INCREMENTAL BACKUP

    An incremental backup is exactly the same as a level 0 backup EXCEPT
    you use the -H option to specify the location of the most recent
    completed backup.  We simply maintain the handy softlink pointing at
    the most recent completed backup and the cpdup required to do this
    becomes trivial.

    Each day's incremental backup will reproduce the ENTIRE directory topology
    for the client, but cpdup will hardlink files from the most recent backup
    instead of copying them and this is what saves you all the disk space.

	set date = `date "+%Y%m%d"`
	if ( "`readlink /backup/mirrors/box1`" == "box1.${date}" ) then
	    echo "silly boy, an incremental already exists for today"
	    exit 1
	endif
	mkdir /backup/mirrors/box1.${date}
	cpdup -H /backup/mirrors/box1 \
	      -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home

    Be sure to update your 'most recent backup' softlink, but only do it
    if the cpdup's for all the partitions for that client have succeeded.
    That way the next incremental backup will always be based on the most
    recent complete one.

	rm -f /backup/mirrors/box1
	ln -s box1.${date} /backup/mirrors/box1
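
    The rule "only move the softlink if every partition succeeded" might be
    scripted roughly as follows.  This is a POSIX sh sketch, not the
    supplied do_mirror script.  The function name mirror_client and the
    COPY_CMD variable are assumptions made for the example; COPY_CMD holds
    the copy program and its options so that further options can be added
    for a given run.

```sh
#!/bin/sh
# COPY_CMD is a hypothetical hook naming the copy program and its
# options; it is split on whitespace when used.
: "${COPY_CMD:=cpdup -i0 -s0}"

# Mirror the listed partitions of a client into <mirrors>/<client>.<date>
# and move the 'latest' softlink only if every copy succeeded.
# usage: mirror_client <nfs_root> <mirrors> <client> <date> <partition>...
mirror_client() {
    nfs=$1
    mirrors=$2
    client=$3
    date=$4
    shift 4
    dest=$mirrors/$client.$date
    mkdir -p "$dest" || return 1
    failed=0
    for part in "$@"; do
        # Unquoted on purpose so the options in COPY_CMD become arguments.
        $COPY_CMD "$nfs/$client/$part" "$dest/$part" || failed=1
    done
    if [ "$failed" -ne 0 ]; then
        echo "$client: copy failed, leaving softlink untouched" >&2
        return 1
    fi
    rm -f "$mirrors/$client"
    ln -s "$client.$date" "$mirrors/$client"
}
```

    A failed run leaves its partial dated directory behind, but the softlink
    still names the last mirror that completed.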

    Since these backups are mirrors, locating a backup is as simple
    as cd'ing into the appropriate directory.  If your filesystem has a
    hardlink limit and cpdup hits it, cpdup will 'break' the hardlink
    and copy the file instead.  Generally speaking only a few special cases
    will hit the hardlink limit for a filesystem.  For example, the
    CVS/Root file in a checked out cvs repository is often hardlinked, and
    the sheer number of hardlinked 'Root' files multiplied by the number
    of backups can often hit the filesystem hardlink limit.
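
    To see whether a file in two daily mirrors is one hardlinked copy or two
    separate copies, compare inode numbers.  A small POSIX sh sketch, with
    inode_of and same_file as made-up helper names:

```sh
#!/bin/sh
# Print the inode number of a file.
inode_of() {
    ls -di "$1" | awk '{ print $1 }'
}

# Return 0 if both paths exist and name the same file on disk (same
# inode), which is what an unchanged, hardlinked file looks like in two
# daily mirrors.  Assumes both paths are on the same filesystem, as the
# mirrors under /backup are.
same_file() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(inode_of "$1")" = "$(inode_of "$2")" ]
}
```

    The link count shown by 'ls -l' gives the same information from the
    other direction: it counts how many names share the file.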

		    PART 4 - DO AN INCREMENTAL VERIFIED BACKUP

    Since your incremental backups use hardlinks heavily, the actual file
    might exist on the physical /backup disk in only one place even though
    it may be present in dozens of daily mirrors.  To ensure that the
    file being hardlinked has not become corrupted, cpdup's -f option can be
    used in conjunction with -H to force cpdup to validate the contents
    of the file, even if all the stat info looks identical.

	cpdup -f -H /backup/mirrors/box1 ...

    You can create completely redundant (non-hardlink-dependent) backups
    by doing the equivalent of your level 0, i.e. not using -H.  However, I
    recommend that you do this rarely, if at all (maybe once every 6 months
    at the most), because each mirror created this way will have a distinct
    copy of all the file data and you will quickly run out of space in your
    /backup partition.

		    MAINTENANCE OF THE "/backup" DIRECTORY

    Now, clearly you are going to run out of space in /backup if you keep
    doing this, but you may be surprised at just how many daily incrementals
    you can create before you fill up your /backup partition.

    If /backup becomes full, simply start rm -rf'ing older mirror directories
    until enough space is freed up.  You do not have to remove the oldest
    directory first, because every mirror holds its own hardlink to each
    file and the data is only freed when the last link goes away.  In fact,
    you might want to keep the oldest one around and remove a day's backup
    here, a day's backup there, etc., until you free up enough space.
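
    Because the dated names sort as strings, listing the mirrors that could
    be removed is straightforward.  The following POSIX sh sketch only
    prints candidates and leaves the choice and the rm -rf to you.  The name
    removal_candidates is hypothetical, and this is not how the supplied
    do_cleanup script works.

```sh
#!/bin/sh
# Print the dated mirror directories for one client, oldest first,
# leaving out the one the 'latest' softlink points to.  The YYYYMMDD
# suffix makes a plain string sort chronological.
# usage: removal_candidates <mirrors> <client>
removal_candidates() {
    mirrors=$1
    client=$2
    latest=$(readlink "$mirrors/$client")
    latest=${latest##*/}        # tolerate an absolute link target
    for d in "$mirrors/$client".[0-9]*; do
        [ -d "$d" ] || continue
        name=${d##*/}
        [ "$name" = "$latest" ] || echo "$name"
    done | sort
}
```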

				OFF-SITE BACKUPS

    Making an off-site backup involves similar methodology, but you use
    cpdup's remote host capability to generate the backup.  To avoid
    complications it is usually best to take a mirror already generated on
    your LAN backup box and copy that to the remote box.

    The remote backup box does not use NFS, so setup is trivial.  Just
    create your super-large /backup partition and mkdir /backup/mirrors.
    Your LAN backup box will need root access via ssh to your remote backup
    box.

    You can use the handy softlink to get the latest 'box1.date' mirror
    directory and since the mirror is all in one partition you can just
    cpdup the entire machine in one command.  Use the same dated directory
    name on the remote box, so:

	# latest will wind up something like 'box1.20060915'
	set latest = `readlink /backup/mirrors/box1`
	cpdup -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest

    As with your LAN backup, create a softlink on the remote box denoting the
    latest mirror for any given site.

	if ( $status == 0 ) then
	    ssh remote.box -n \
		"rm -f /backup/mirrors/box1; ln -s $latest /backup/mirrors/box1"
	endif

    Incremental backups can be accomplished using the same cpdup command,
    but adding the -H option pointing at the latest backup on the remote
    box.  Note that the -H path is a path on the remote box, not on the LAN
    backup box you are running the command from.

	set latest = `readlink /backup/mirrors/box1`
	set remotelatest = `ssh remote.box -n "readlink /backup/mirrors/box1"`
	if ( "$latest" == "$remotelatest" ) then
	    echo "silly boy, you already made a remote incremental backup today"
	    exit 1
	endif
	cpdup -H /backup/mirrors/$remotelatest \
	      -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest
	if ( $status == 0 ) then
	    ssh remote.box -n \
		"rm -f /backup/mirrors/box1; ln -s $latest /backup/mirrors/box1"
	endif
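
    The two off-site procedures differ only in what the remote softlink
    currently says, so a wrapper could choose between them.  A POSIX sh
    sketch follows; remote_plan is a made-up name, and treating an empty
    remote readlink result as "no mirror yet, do a level 0" is an
    assumption of this example rather than something cpdup does for you.

```sh
#!/bin/sh
# Decide what kind of off-site run is needed from the two softlink targets.
#   $1  local latest mirror name, e.g. box1.20060915
#   $2  remote latest mirror name, empty if the remote has no mirror yet
# Prints one of: level0, incremental, skip.  Returns 1 if there is
# nothing local to send.
remote_plan() {
    latest=$1
    remotelatest=$2
    if [ -z "$latest" ]; then
        echo "no local mirror to send" >&2
        return 1
    elif [ -z "$remotelatest" ]; then
        echo level0
    elif [ "$latest" = "$remotelatest" ]; then
        echo skip
    else
        echo incremental
    fi
}
```

    The caller would obtain the two names with readlink, locally and over
    ssh, exactly as in the commands above.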

    Cleaning out the remote directory works the same as cleaning out the LAN
    backup directory.


			    RESTORING FROM BACKUPS

    Each backup is a full filesystem mirror, and depending on how much space
    you have you should be able to restore it simply by cd'ing into the
    appropriate backup directory and using 'cpdup blah box1:blah' (assuming
    root access), or you can export the backup directory via NFS to your
    client boxes and use cpdup locally on the client to extract the backup.
    Using NFS is probably the most efficient solution.


			PUTTING IT ALL TOGETHER - SOME SCRIPTS

    Please refer to the scripts in the scripts/ subdirectory.  These scripts
    are EXAMPLES ONLY.  If you want to use them, put them in your ~root/adm
    directory on your backup box and set up a root crontab.

    First follow the preparation rules in PART 1 above.  The scripts do not
    do this automatically.  Edit the 'params' file that the scripts use
    to set default paths and such.

	** FOLLOW DIRECTIONS IN PART 1 ABOVE TO SET UP THE LAN BACKUP BOX **

    Copy the scripts to ~/adm.  Do NOT install a crontab yet (but an example
    can be found in scripts/crontab).

    Do a manual level 0 LAN BACKUP using the do_mirror script.

	cd ~/adm
	./do_mirror 0

    Once done you can do incremental backups using './do_mirror 1' to do a
    verified incremental, or './do_mirror 2' to do a stat-optimized
    incremental.  You can enable the cron jobs that run do_mirror and
    do_cleanup now.

    --

    Setting up an off-site backup box is trivial.  The off-site backup box
    needs to allow root ssh logins from the LAN backup box (at least for
    now, sorry!).  Set up the off-site backup directory, typically
    /backup/mirrors.  Then do a level 0 backup from your LAN backup box
    to the off-site box using the do_remote script.

	cd ~/adm
	./do_remote 0

    Once done you can do incremental backups using './do_remote 1' to do a
    verified incremental, or './do_remote 2' to do a stat-optimized
    incremental.  You can enable the cron jobs that run do_remote now.

    NOTE!  It is NOT recommended that you use verified-incremental backups
    over a WAN, as all related data must be copied over the wire every single
    day.  Instead, I recommend sticking with stat-optimized backups
    (./do_remote 2).

    You will also need to set up a daily cleaning script on the off-site
    backup box.

    SCRIPT TODOS - the ./do_cleanup script is not very smart.  We really
    should do a tower-of-hanoi removal.