1.\"
2.\" Copyright (c) 2008
3.\"	The DragonFly Project.  All rights reserved.
4.\"
5.\" Redistribution and use in source and binary forms, with or without
6.\" modification, are permitted provided that the following conditions
7.\" are met:
8.\"
9.\" 1. Redistributions of source code must retain the above copyright
10.\"    notice, this list of conditions and the following disclaimer.
11.\" 2. Redistributions in binary form must reproduce the above copyright
12.\"    notice, this list of conditions and the following disclaimer in
13.\"    the documentation and/or other materials provided with the
14.\"    distribution.
15.\" 3. Neither the name of The DragonFly Project nor the names of its
16.\"    contributors may be used to endorse or promote products derived
17.\"    from this software without specific, prior written permission.
18.\"
19.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
20.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
21.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
22.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
23.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
24.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
25.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
26.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
27.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
29.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
30.\" SUCH DAMAGE.
31.\"
32.\" $DragonFly: src/share/man/man5/hammer.5,v 1.15 2008/11/02 18:56:47 swildner Exp $
33.\"
.Dd September 28, 2009
.Dt HAMMER 5
.Os
.Sh NAME
.Nm HAMMER
.Nd HAMMER file system
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd options HAMMER
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
hammer_load="YES"
.Ed
.Pp
To mount via
.Xr fstab 5 :
.Bd -literal -offset indent
/dev/ad0s1d[:/dev/ad1s1d:...]	/mnt hammer rw 2 0
.Ed
.Sh DESCRIPTION
The
.Nm
file system provides facilities to store file system data onto disk devices
and is intended to replace
.Xr ffs 5
as the default file system for
.Dx .
Among its features are instant crash recovery,
large file systems spanning multiple volumes,
data integrity checking,
fine-grained history retention,
mirroring capability, and pseudo file systems.
.Pp
All functions related to managing
.Nm
file systems are provided by the
.Xr newfs_hammer 8 ,
.Xr mount_hammer 8 ,
.Xr hammer 8 ,
.Xr chflags 1 ,
and
.Xr undo 1
utilities.
.Pp
For a more detailed introduction refer to the paper and slides listed in the
.Sx SEE ALSO
section.
For some common usages of
.Nm
see the
.Sx EXAMPLES
section below.
.Ss Instant Crash Recovery
After a non-graceful system shutdown,
.Nm
file systems are brought back into a fully coherent state
when they are mounted, usually within a few seconds.
.Ss Large File Systems & Multi Volume
A
.Nm
file system can be up to 1 Exabyte in size.
It can span up to 256 volumes;
each volume occupies a
.Dx
disk slice or partition, or another special file,
and can be up to 4096 TB in size.
The minimum recommended
.Nm
file system size is 50 GB.
For volumes over 2 TB in size,
.Xr gpt 8
and
.Xr disklabel64 8
normally need to be used.
.Ss Data Integrity Checking
.Nm
puts a high priority on data integrity:
CRC checks are made for all major structures and data.
.Nm
snapshots implement features to make data integrity checking easier:
the atime and mtime fields are locked to the ctime
for files accessed via a snapshot.
The
.Fa st_dev
field is based on the PFS
.Ar shared-uuid
and not on any real device.
This means that archiving the contents of a snapshot with e.g.\&
.Xr tar 1
and piping it to something like
.Xr md5 1
will yield a consistent result.
The consistency is also retained on mirroring targets.
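.Pp
For example, a consistent checksum over a snapshot's contents can be
computed as follows (the snapshot path is illustrative):
.Bd -literal -offset indent
tar -cf - /snaps/snap1 | md5
.Ed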
.Ss Transaction IDs
The
.Nm
file system uses 64-bit, hexadecimal transaction IDs to refer to historical
file or directory data.
An ID has the
.Xr printf 3
format
.Li %#016llx ,
such as
.Li 0x00000001061a8ba6 .
.Pp
Related
.Xr hammer 8
commands:
.Ar snapshot ,
.Ar synctid
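.Pp
For example, the file system's current transaction ID can be obtained
as follows (the output line shown is illustrative):
.Bd -literal -offset indent
hammer synctid /home
0x00000001061a8ba6
.Ed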
.Ss History & Snapshots
History metadata on the media is written with every sync operation, so that
by default the resolution of a file's history is 30-60 seconds until the next
prune operation.
Prior versions of files or directories are generally accessible by appending
.Li @@
and a transaction ID to the name.
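.Pp
For example, a prior version of a file can be read directly
(the path and transaction ID are illustrative):
.Bd -literal -offset indent
cat /home/user/file@@0x00000001061a8ba6
.Ed
.Pp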
The common way of accessing history, however, is by taking snapshots.
.Pp
Snapshots are softlinks to prior versions of directories and their files.
Their data will be retained across prune operations for as long as the
softlink exists.
Removing the softlink enables the file system to reclaim the space
again upon the next prune & reblock operations.
.Pp
Related
.Xr hammer 8
commands:
.Ar cleanup ,
.Ar history ,
.Ar snapshot ;
see also
.Xr undo 1
.Ss Pruning & Reblocking
Pruning is the act of deleting file system history.
By default, only history used by the given snapshots
and history from after the latest snapshot will be retained.
By setting the per-PFS parameter
.Cm prune-min ,
history is guaranteed to be retained for at least this time interval.
All other history is deleted.
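.Pp
For example, to guarantee that at least 30 days of history are retained
on a PFS (the PFS path is illustrative):
.Bd -literal -offset indent
hammer pfs-update /home/pfs/master prune-min=30d
.Ed
.Pp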
Reblocking will reorder all elements and thus defragment the file system and
free space for reuse.
After pruning, a file system must be reblocked to recover all available space.
Reblocking is needed even when using the
.Ar nohistory
.Xr mount_hammer 8
option or
.Xr chflags 1
flag.
.Pp
Related
.Xr hammer 8
commands:
.Ar cleanup ,
.Ar snapshot ,
.Ar prune ,
.Ar prune-everything ,
.Ar rebalance ,
.Ar reblock ,
.Ar reblock-btree ,
.Ar reblock-inodes ,
.Ar reblock-dirs ,
.Ar reblock-data
.Ss Mirroring & Pseudo File Systems
In order to allow inode numbers to be duplicated on the slaves,
.Nm Ap s
mirroring feature uses
.Dq Pseudo File Systems
(PFSs).
A
.Nm
file system supports up to 65535 PFSs.
Multiple slaves per master are supported, but multiple masters per slave
are not.
Slaves are always read-only.
Upgrading slaves to masters and downgrading masters to slaves are supported.
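.Pp
For example, a slave can be promoted to a master as follows
(the PFS path is illustrative):
.Bd -literal -offset indent
hammer pfs-upgrade /home/pfs/slave
.Ed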
.Pp
It is recommended to use a
.Nm null
mount to access a PFS;
this way no tools are confused by the PFS root being a symlink
and inodes not being unique across a
.Nm
file system.
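.Pp
For example (the paths are illustrative):
.Bd -literal -offset indent
mount_null /hammer/pfs/data /hammer/data
.Ed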
.Pp
Related
.Xr hammer 8
commands:
.Ar pfs-master ,
.Ar pfs-slave ,
.Ar pfs-cleanup ,
.Ar pfs-status ,
.Ar pfs-update ,
.Ar pfs-destroy ,
.Ar pfs-upgrade ,
.Ar pfs-downgrade ,
.Ar mirror-copy ,
.Ar mirror-stream ,
.Ar mirror-read ,
.Ar mirror-read-stream ,
.Ar mirror-write ,
.Ar mirror-dump
.Ss NFS Export
.Nm
file systems support NFS export.
NFS export of PFSs is done using
.Nm null
mounts.
For example, to export the PFS
.Pa /hammer/pfs/data ,
create a
.Nm null
mount, e.g.\& to
.Pa /hammer/data ,
and export the latter path.
.Pp
Do not export a directory containing a PFS (e.g.\&
.Pa /hammer/pfs
above).
Only the
.Nm null
mount of a PFS root
(e.g.\&
.Pa /hammer/data
above)
should be exported;
exporting a directory that contains a PFS would allow clients to escape
into the PFS subdirectory.
.Sh EXAMPLES
.Ss Preparing the File System
To create and mount a
.Nm
file system use the
.Xr newfs_hammer 8
and
.Xr mount_hammer 8
commands.
Note that all
.Nm
file systems must have a unique name on a per-machine basis.
.Bd -literal -offset indent
newfs_hammer -L HOME /dev/ad0s1d
mount_hammer /dev/ad0s1d /home
.Ed
.Pp
Similarly, multi-volume file systems can be created and mounted by
specifying additional arguments.
.Bd -literal -offset indent
newfs_hammer -L MULTIHOME /dev/ad0s1d /dev/ad1s1d
mount_hammer /dev/ad0s1d /dev/ad1s1d /home
.Ed
.Pp
Once created and mounted,
.Nm
file systems need periodic cleanup (taking snapshots, pruning, and reblocking)
in order to retain access to history and to keep the file system from
filling up.
For this it is recommended to use the
.Xr hammer 8
.Ar cleanup
metacommand.
.Pp
By default,
.Dx
is set up to run
.Nm hammer Ar cleanup
nightly via
.Xr periodic 8 .
.Pp
It is also possible to perform these operations individually via
.Xr crontab 5 .
For example, to reblock the
.Pa /home
file system every night at 2:15 for up to 5 minutes:
.Bd -literal -offset indent
15 2 * * * hammer -c /var/run/HOME.reblock -t 300 reblock /home \e
	>/dev/null 2>&1
.Ed
.Ss Snapshots
The
.Xr hammer 8
utility's
.Ar snapshot
command provides several ways of taking snapshots.
They all assume a directory where snapshots are kept.
.Bd -literal -offset indent
mkdir /snaps
hammer snapshot /home /snaps/snap1
(...after some changes in /home...)
hammer snapshot /home /snaps/snap2
.Ed
.Pp
The softlinks in
.Pa /snaps
point to the state of the
.Pa /home
directory at the time each snapshot was taken, and could now be used to copy
the data somewhere else for backup purposes.
.Pp
By default,
.Dx
is set up to create nightly snapshots of all
.Nm
file systems via
.Xr periodic 8
and to keep them for 60 days.
.Ss Pruning
A snapshot directory is also the argument to the
.Xr hammer 8 Ap s
.Ar prune
command, which frees historical data from the file system that is not
pointed to by any snapshot link and is not from after the latest snapshot.
.Bd -literal -offset indent
rm /snaps/snap1
hammer prune /snaps
.Ed
.Ss Mirroring
Mirroring can be set up using
.Nm Ap s
pseudo file systems.
To associate the slave with the master, its shared UUID should be set to
the master's shared UUID as output by the
.Nm hammer Ar pfs-master
command.
.Bd -literal -offset indent
hammer pfs-master /home/pfs/master
hammer pfs-slave /home/pfs/slave shared-uuid=<master's shared uuid>
.Ed
.Pp
The
.Pa /home/pfs/slave
link is unusable until an initial mirroring operation has taken place.
.Pp
To mirror the master's data, either pipe a
.Ar mirror-read
command into a
.Ar mirror-write
or, as a shortcut, use the
.Ar mirror-copy
command (which works across an
.Xr ssh 1
connection as well).
The initial mirroring operation has to be done to the PFS path (as
.Xr mount_null 8
cannot access it yet).
.Bd -literal -offset indent
hammer mirror-copy /home/pfs/master /home/pfs/slave
.Ed
.Pp
After this initial step a
.Nm null
mount can be set up for
.Pa /home/pfs/slave .
Further operations can use
.Nm null
mounts.
.Bd -literal -offset indent
mount_null /home/pfs/master /home/master
mount_null /home/pfs/slave /home/slave

hammer mirror-copy /home/master /home/slave
.Ed
.Ss NFS Export
This example exports two directories from the
.Nm
file system
.Pa /hammer
via NFS: the regular directory
.Pa /hammer/non-pfs
(which contains no PFSs) and the PFS
.Pa /hammer/pfs/data .
The latter is first null mounted to
.Pa /hammer/data ,
and that path is exported instead.
.Pp
Add to
.Pa /etc/fstab
(see
.Xr fstab 5 ) :
.Bd -literal -offset indent
/hammer/pfs/data /hammer/data null rw
.Ed
.Pp
Add to
.Pa /etc/exports
(see
.Xr exports 5 ) :
.Bd -literal -offset indent
/hammer/non-pfs
/hammer/data
.Ed
.Sh SEE ALSO
.Xr chflags 1 ,
.Xr md5 1 ,
.Xr tar 1 ,
.Xr undo 1 ,
.Xr exports 5 ,
.Xr ffs 5 ,
.Xr fstab 5 ,
.Xr disklabel64 8 ,
.Xr gpt 8 ,
.Xr hammer 8 ,
.Xr mount_hammer 8 ,
.Xr mount_null 8 ,
.Xr newfs_hammer 8
.Rs
.%A Matthew Dillon
.%D June 2008
.%O http://www.dragonflybsd.org/hammer/hammer.pdf
.%T "The HAMMER Filesystem"
.Re
.Rs
.%A Matthew Dillon
.%D October 2008
.%O http://www.dragonflybsd.org/hammer/nycbsdcon/
.%T "Slideshow from NYCBSDCon 2008"
.Re
.Sh FILESYSTEM PERFORMANCE
The
.Nm
file system has a front-end which processes VNOPS and issues necessary
block reads from disk, and a back-end which handles meta-data updates
on-media and performs all meta-data write operations.
Bulk file write operations are handled by the front-end.
Because
.Nm
defers meta-data updates, virtually no meta-data read operations will be
issued by the front-end while writing large amounts of data to the file
system, or even when creating new files or directories.
Even though the kernel prioritizes reads over writes, the fact that
writes are cached by the drive itself tends to give writes excessive
priority.
.Pp
There are four bioq sysctls, shown below with default values,
which can be adjusted to give reads a higher priority:
.Bd -literal -offset indent
kern.bioq_reorder_minor_bytes: 262144
kern.bioq_reorder_burst_bytes: 3000000
kern.bioq_reorder_minor_interval: 5
kern.bioq_reorder_burst_interval: 60
.Ed
.Pp
If a higher read priority is desired, it is recommended that
.Va kern.bioq_reorder_minor_interval
be increased to 15, 30, or even 60, and
.Va kern.bioq_reorder_burst_bytes
be decreased to 262144 or 524288.
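.Pp
For example, using values from the recommendation above:
.Bd -literal -offset indent
sysctl kern.bioq_reorder_minor_interval=30
sysctl kern.bioq_reorder_burst_bytes=262144
.Ed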
.Sh HISTORY
The
.Nm
file system first appeared in
.Dx 1.11 .
.Sh AUTHORS
.An -nosplit
The
.Nm
file system was designed and implemented by
.An Matthew Dillon Aq dillon@backplane.com .
This manual page was written by
.An Sascha Wildner .