xref: /plan9/sys/doc/fossil.ms (revision 8e42ac6b098d2518418cf14bffe2a43f2379a084)
1426d2b71SDavid du Colombier.HTML "Fossil, an Archival File Server
25e96a66cSDavid du Colombier... .FP times
35e96a66cSDavid du Colombier... .fp 1 R R.nomath
45e96a66cSDavid du Colombier... .fp 5 CW LucidaSansCW83
55e96a66cSDavid du Colombier.TL
65e96a66cSDavid du ColombierFossil, an Archival File Server
75e96a66cSDavid du Colombier.AU
85e96a66cSDavid du ColombierSean Quinlan
95e96a66cSDavid du ColombierJim McKie
105e96a66cSDavid du ColombierRuss Cox
115e96a66cSDavid du Colombierjmk,rsc@plan9.bell-labs.com
125e96a66cSDavid du Colombier.AB
135e96a66cSDavid du ColombierThis paper describes the internals and
145e96a66cSDavid du Colombieroperation of Fossil, an archival file server built for Plan 9.
155e96a66cSDavid du ColombierFossil has not yet replaced the current Plan 9 file server
165e96a66cSDavid du Colombierand
175e96a66cSDavid du Colombier.CW kfs ,
185e96a66cSDavid du Colombierbut that is our eventual intent.
195e96a66cSDavid du ColombierBoth fossil and this documentation are
205e96a66cSDavid du Colombierworks in progress.  Comments on either are most welcome.
215e96a66cSDavid du Colombier.AE
225e96a66cSDavid du Colombier.de HP
235e96a66cSDavid du Colombier.LP
245e96a66cSDavid du Colombier..
255e96a66cSDavid du Colombier.NH 1
265e96a66cSDavid du ColombierIntroduction
275e96a66cSDavid du Colombier.HP
285e96a66cSDavid du ColombierFossil is an archival file server built for Plan 9.
295e96a66cSDavid du ColombierIn a typical configuration, it maintains a traditional file system
305e96a66cSDavid du Colombierin a local disk partition and periodically archives snapshots of the file system
315e96a66cSDavid du Colombierto a Venti server.  These archives are made available through
325e96a66cSDavid du Colombiera file system interface.
335e96a66cSDavid du ColombierFossil can also be run without a Venti server, in which case the
345e96a66cSDavid du Colombiersnapshots (if any) occupy local disk space.
355e96a66cSDavid du Colombier.PP
365e96a66cSDavid du ColombierThe bulk of this paper explains the underlying data structures:
375e96a66cSDavid du ColombierVenti trees, the Venti archival file system format, and finally Fossil's
385e96a66cSDavid du Colombierfile system format.
395e96a66cSDavid du ColombierThe end of the paper discusses the architecture of the Fossil server.
405e96a66cSDavid du Colombier.PP
415e96a66cSDavid du ColombierThe presentation of the data structures is very detailed, perhaps
425e96a66cSDavid du Colombiertoo detailed for most readers.
435e96a66cSDavid du ColombierThe intent is to record all the details necessary to make structural
445e96a66cSDavid du Colombierchanges to the file system format.
455e96a66cSDavid du ColombierFeel free to jump ahead when boredom strikes.
465e96a66cSDavid du Colombier.NH 1
475e96a66cSDavid du ColombierVenti trees and directory hierarchies
485e96a66cSDavid du Colombier.HP
495e96a66cSDavid du ColombierVenti [3] is an archival block storage server.
505e96a66cSDavid du ColombierOnce a block is stored, it can be retrieved by presenting the 20-byte
515e96a66cSDavid du ColombierSHA1 hash of its contents, called a
525e96a66cSDavid du Colombier.I score .
535e96a66cSDavid du ColombierBlocks on Venti have a maximum length of about 56 kilobytes,
545e96a66cSDavid du Colombierthough in practice smaller blocks are used.
555e96a66cSDavid du ColombierTo store a byte stream of arbitrary length, Venti uses a hash tree.
565e96a66cSDavid du ColombierConceptually, the data stream is broken into fixed-size (say,
575e96a66cSDavid du Colombier.I dsize -byte)
585e96a66cSDavid du Colombierchunks, which are stored on the Venti server.
595e96a66cSDavid du ColombierThe resulting scores are concatenated into a new pointer stream, which is
605e96a66cSDavid du Colombierbroken into fixed size (say,
615e96a66cSDavid du Colombier.I psize -byte)
625e96a66cSDavid du Colombierchunks, which are stored on the Venti server.
635e96a66cSDavid du Colombier.I Psize "" (
645e96a66cSDavid du Colombieris different from
655e96a66cSDavid du Colombier.I dsize
665e96a66cSDavid du Colombierso that we can ensure that each pointer block holds an
675e96a66cSDavid du Colombierintegral number of pointers.)
685e96a66cSDavid du ColombierThis yields a new pointer stream, and so on, until there is a single block
695e96a66cSDavid du Colombierand finally a single score describing the entire tree.
705e96a66cSDavid du ColombierThe resulting structure looks like:
715e96a66cSDavid du Colombier.PS
725e96a66cSDavid du Colombier.ps 8
735e96a66cSDavid du Colombier.vs 10
745e96a66cSDavid du Colombierboxht=0.1
755e96a66cSDavid du Colombierboxwid=0.1
765e96a66cSDavid du Colombier
775e96a66cSDavid du ColombierB0: box invis wid 1 "\f(CWVtDataType\fP"
785e96a66cSDavid du Colombiermove right 0.1
795e96a66cSDavid du ColombierL0a: box wid 0.2
805e96a66cSDavid du Colombiermove right 0.1
815e96a66cSDavid du ColombierL0b: box wid 0.2
825e96a66cSDavid du Colombiermove right 0.1
835e96a66cSDavid du ColombierL0c: box invis wid 0.2 "..."
845e96a66cSDavid du Colombiermove right 0.1
855e96a66cSDavid du Colombier
865e96a66cSDavid du ColombierL0d: box wid 0.2
875e96a66cSDavid du Colombiermove right 0.1
885e96a66cSDavid du ColombierL0e: box wid 0.2
895e96a66cSDavid du Colombiermove right 0.2
905e96a66cSDavid du ColombierL0f: box invis wid 0.2 "..."
915e96a66cSDavid du Colombiermove right 0.2
925e96a66cSDavid du Colombier
935e96a66cSDavid du ColombierL0g: box wid 0.2
945e96a66cSDavid du Colombiermove right 0.1
955e96a66cSDavid du ColombierL0h: box wid 0.2
965e96a66cSDavid du Colombiermove right 0.1
975e96a66cSDavid du ColombierL0i: box invis wid 0.2 "..."
985e96a66cSDavid du Colombiermove right 0.1
995e96a66cSDavid du Colombier
1005e96a66cSDavid du ColombierL0j: box wid 0.2
1015e96a66cSDavid du Colombiermove right 0.1
1025e96a66cSDavid du ColombierL0k: box wid 0.2
1035e96a66cSDavid du Colombiermove right 0.1
1045e96a66cSDavid du ColombierL0l: box invis wid 0.2 "..."
1055e96a66cSDavid du Colombiermove right 0.1
1065e96a66cSDavid du ColombierL0m: box wid 0.2
1075e96a66cSDavid du Colombier
1085e96a66cSDavid du Colombierdefine boxppddd {
1095e96a66cSDavid du Colombier	line from 0.2<$1.nw,$1.ne> to 0.2<$1.sw,$1.se>
1105e96a66cSDavid du Colombier	line from 0.4<$1.nw,$1.ne> to 0.4<$1.sw,$1.se>
1115e96a66cSDavid du Colombier	X: box invis at 0.1<$1.nw,$1.ne>
1125e96a66cSDavid du Colombier	Y: box invis at 0.1<$1.sw,$1.se>
1135e96a66cSDavid du Colombier	line -> from 0.5<X,Y> to $2.nw
1145e96a66cSDavid du Colombier	X: box invis at 0.3<$1.nw,$1.ne>
1155e96a66cSDavid du Colombier	Y: box invis at 0.3<$1.sw,$1.se>
1165e96a66cSDavid du Colombier	line -> from 0.5<X,Y> to $3.nw
1175e96a66cSDavid du Colombier	"..." at 0.7<$1.w,$1.e>
1185e96a66cSDavid du Colombier}
1195e96a66cSDavid du Colombier
1205e96a66cSDavid du Colombierdefine boxppdddp {
1215e96a66cSDavid du Colombier	line from 0.2<$1.nw,$1.ne> to 0.2<$1.sw,$1.se>
1225e96a66cSDavid du Colombier	line from 0.4<$1.nw,$1.ne> to 0.4<$1.sw,$1.se>
1235e96a66cSDavid du Colombier	line from 0.8<$1.nw,$1.ne> to 0.8<$1.sw,$1.se>
1245e96a66cSDavid du Colombier	X: box invis at 0.1<$1.nw,$1.ne>
1255e96a66cSDavid du Colombier	Y: box invis at 0.1<$1.sw,$1.se>
1265e96a66cSDavid du Colombier	line -> from 0.5<X,Y> to $2.nw
1275e96a66cSDavid du Colombier	X: box invis at 0.3<$1.nw,$1.ne>
1285e96a66cSDavid du Colombier	Y: box invis at 0.3<$1.sw,$1.se>
1295e96a66cSDavid du Colombier	line -> from 0.5<X,Y> to $3.nw
1305e96a66cSDavid du Colombier	"..." at 0.6<$1.w,$1.e>
1315e96a66cSDavid du Colombier	X: box invis at 0.9<$1.nw,$1.ne>
1325e96a66cSDavid du Colombier	Y: box invis at 0.9<$1.sw,$1.se>
1335e96a66cSDavid du Colombier	line -> from 0.5<X,Y> to $4.nw
1345e96a66cSDavid du Colombier}
1355e96a66cSDavid du Colombier
1365e96a66cSDavid du Colombierdefine boxpdddp {
1375e96a66cSDavid du Colombier	line from 0.2<$1.nw,$1.ne> to 0.2<$1.sw,$1.se>
1385e96a66cSDavid du Colombier	line from 0.8<$1.nw,$1.ne> to 0.8<$1.sw,$1.se>
1395e96a66cSDavid du Colombier	X: box invis at 0.1<$1.nw,$1.ne>
1405e96a66cSDavid du Colombier	Y: box invis at 0.1<$1.sw,$1.se>
1415e96a66cSDavid du Colombier	line -> from 0.5<X,Y> to $2.nw
1425e96a66cSDavid du Colombier	"..." at 0.5<$1.w,$1.e>
1435e96a66cSDavid du Colombier	X: box invis at 0.9<$1.nw,$1.ne>
1445e96a66cSDavid du Colombier	Y: box invis at 0.9<$1.sw,$1.se>
1455e96a66cSDavid du Colombier	line -> from 0.5<X,Y> to $3.nw
1465e96a66cSDavid du Colombier}
1475e96a66cSDavid du Colombier
1485e96a66cSDavid du Colombierbhd=0.4
1495e96a66cSDavid du ColombierL1abc: box wid 0.5 at 0.5<L0a, L0b>+(0,bhd)
1505e96a66cSDavid du Colombierboxppddd(L1abc, L0a, L0b)
1515e96a66cSDavid du ColombierL1def: box wid 0.5 at 0.5<L0d, L0e>+(0,bhd)
1525e96a66cSDavid du Colombierboxppddd(L1def, L0d, L0e)
1535e96a66cSDavid du ColombierL1ghi: box wid 0.5 at 0.5<L0g, L0h>+(0,bhd)
1545e96a66cSDavid du Colombierboxppddd(L1ghi, L0g, L0h)
1555e96a66cSDavid du ColombierL1jklm: box wid 0.5 at 0.5<L0j, L0k>+(0,bhd)
1565e96a66cSDavid du Colombierboxppdddp(L1jklm, L0j, L0k, L0m)
1575e96a66cSDavid du ColombierB1: box invis wid 1 "\f(CWVtPointerType0\fP" at B0+(0,bhd)
1585e96a66cSDavid du Colombier
1595e96a66cSDavid du ColombierL2abcdef: box wid 0.5 at 0.5<L1abc,L1def>+(0,bhd)
1605e96a66cSDavid du Colombierboxppddd(L2abcdef, L1abc, L1def)
1615e96a66cSDavid du ColombierL2ghijklm: box wid 0.5 at 0.5<L1ghi,L1jklm>+(0,bhd)
1625e96a66cSDavid du Colombierboxpdddp(L2ghijklm, L1ghi, L1jklm)
1635e96a66cSDavid du ColombierB2: box invis wid 1 "\f(CWVtPointerType1\fP" at B1+(0,bhd)
1645e96a66cSDavid du Colombier
1655e96a66cSDavid du ColombierL3atom: box wid 0.5 at 0.5<L2abcdef, L2ghijklm>+(0,bhd)
1665e96a66cSDavid du Colombierboxpdddp(L3atom, L2abcdef, L2ghijklm)
1675e96a66cSDavid du ColombierB3: box invis wid 1 "\f(CWVtPointerType2\fP" at B2+(0,bhd)
1685e96a66cSDavid du Colombier.PE
1695e96a66cSDavid du Colombier.LP
1705e96a66cSDavid du ColombierThe leaves are the original data stream.  Those blocks have type
1715e96a66cSDavid du Colombier.CW VtDataType .
1725e96a66cSDavid du ColombierThe first pointer stream has type
1735e96a66cSDavid du Colombier.CW VtPointerType0 ,
1745e96a66cSDavid du Colombierthe next has type
1755e96a66cSDavid du Colombier.CW VtPointerType1 ,
1765e96a66cSDavid du Colombierand so on.
1775e96a66cSDavid du ColombierThe figure ends with a single block of type
1785e96a66cSDavid du Colombier.CW VtPointerType2 ,
1795e96a66cSDavid du Colombierbut in general trees can have height up to
1805e96a66cSDavid du Colombier.CW VtPointerType6 .
1815e96a66cSDavid du ColombierFor a
1825e96a66cSDavid du Colombier.I dsize
1835e96a66cSDavid du Colombierof 8192 bytes
1845e96a66cSDavid du Colombierand
1855e96a66cSDavid du Colombier.I psize
1865e96a66cSDavid du Colombierof 8180 bytes (409 pointers),
1875e96a66cSDavid du Colombierthis gives a maximum stream size of approximately 10 zettabytes
1885e96a66cSDavid du Colombier(2\s-2\u73\d\s+2 or 10\s-2\u22\d\s+2 bytes).
1895e96a66cSDavid du Colombier.PP
19025ea35e3SDavid du ColombierData blocks are truncated to remove trailing runs of zeros before
1915e96a66cSDavid du Colombierstorage to Venti; they are zero-filled back to
1925e96a66cSDavid du Colombier.I dsize
1935e96a66cSDavid du Colombierbytes after retrieval from Venti.
19425ea35e3SDavid du ColombierSimilarly, trailing runs of pointers to zero-length blocks are
1955e96a66cSDavid du Colombierremoved from and added back to pointer blocks.
1965e96a66cSDavid du ColombierThese simple rules happen to make it particularly efficient to store
1975e96a66cSDavid du Colombierlarge runs of zeros, as might occur in a data stream with ``holes:''
1985e96a66cSDavid du Colombierthe zero-length block itself can be interpreted as a tree of any
1995e96a66cSDavid du Colombierdepth encoding an all-zero data stream.
2005e96a66cSDavid du Colombier.PP
2015e96a66cSDavid du ColombierReconstructing the data stream requires the score and type of the
2025e96a66cSDavid du Colombiertopmost block in the tree, the data chunk size, the pointer chunk size,
2035e96a66cSDavid du Colombierand the data stream size.
2045e96a66cSDavid du Colombier(From the data stream size and the chunk sizes we could derive the
2055e96a66cSDavid du Colombierdepth of the tree and thus the type of the topmost block, but it is convenient
2065e96a66cSDavid du Colombierto allow trees that are deeper than necessary.)
2075e96a66cSDavid du ColombierThis information is kept in a 40-byte structure called a
2085e96a66cSDavid du Colombier.CW VtEntry :
2095e96a66cSDavid du Colombier.P1
2105e96a66cSDavid du ColombierVtEntry:
2115e96a66cSDavid du Colombier.ta +\w'    'u +\w'            'u
2125e96a66cSDavid du Colombier	gen[4]	\fRgeneration number\fP
2135e96a66cSDavid du Colombier	psize[2]	\fRsize of pointer blocks\fP
2145e96a66cSDavid du Colombier	dsize[2]	\fRsize of data blocks\fP
2155e96a66cSDavid du Colombier	flags[1]
2165e96a66cSDavid du Colombier	zero[5]
2175e96a66cSDavid du Colombier	size[6]	\fRlength of file\fP
2185e96a66cSDavid du Colombier	score[20]	\fRscore of root block in tree\fP
2195e96a66cSDavid du Colombier.P2
2205e96a66cSDavid du Colombier(In this notation,
2215e96a66cSDavid du Colombier.CW name[sz]
2225e96a66cSDavid du Colombierindicates a
2235e96a66cSDavid du Colombier.CW sz -byte
2245e96a66cSDavid du Colombierfield called
2255e96a66cSDavid du Colombier.CW name .
2265e96a66cSDavid du ColombierIntegers are stored in big-endian order.
2275e96a66cSDavid du Colombier.CW Size
2285e96a66cSDavid du Colombierreally is a 48-bit field.)
2295e96a66cSDavid du Colombier.CW Flags
2305e96a66cSDavid du Colombieris made up of the following bit fields.
2315e96a66cSDavid du Colombier.P1
2325e96a66cSDavid du Colombier.ta +\w'      'u +\w'                      'u
2335e96a66cSDavid du Colombier0x01	VtEntryActive	\fRentry is allocated\fP
2345e96a66cSDavid du Colombier0x02	VtEntryDir	\fRentry describes a Venti directory (q.v.)\fP
2355e96a66cSDavid du Colombier0x1C	VtEntryDepthMask	\fRmask for tree depth\fP
2365e96a66cSDavid du Colombier0x20	VtEntryLocal	\fRreserved (q.v.)\fP
2375e96a66cSDavid du Colombier.P2
2385e96a66cSDavid du Colombier.LP
23939734e7eSDavid du ColombierThe depth of the described tree is stored in the 3 bits indicated:
2405e96a66cSDavid du Colombiera tree with a topmost node of type
2415e96a66cSDavid du Colombier.CW VtPointerType3
2425e96a66cSDavid du Colombierhas depth 4.
2435e96a66cSDavid du Colombier.PP
2445e96a66cSDavid du ColombierWith
2455e96a66cSDavid du Colombier.CW VtEntry
2465e96a66cSDavid du Colombierwe can build more complicated data structures,
2475e96a66cSDavid du Colombierones with multiple or nested data streams.
2485e96a66cSDavid du ColombierA data stream consisting of
2495e96a66cSDavid du Colombier.CW VtEntry
2505e96a66cSDavid du Colombierstructures is called a Venti directory.
2515e96a66cSDavid du ColombierIt is identical in structure to the Venti data stream
2525e96a66cSDavid du Colombierwe described earlier except that the bottom-level type is
2535e96a66cSDavid du Colombier.CW VtDirType ,
2545e96a66cSDavid du Colombierand
2555e96a66cSDavid du Colombierthe
2565e96a66cSDavid du Colombier.CW VtEntry
2575e96a66cSDavid du Colombierdescribing a Venti directory has the
2585e96a66cSDavid du Colombier.CW VtEntryDir
2595e96a66cSDavid du Colombierflag bit set.
2605e96a66cSDavid du ColombierThe
2615e96a66cSDavid du Colombier.I dsize
2625e96a66cSDavid du Colombierfor a Venti directory
2635e96a66cSDavid du Colombieris a multiple of 40 so that each data chunk holds
2645e96a66cSDavid du Colombieran integer number of
2655e96a66cSDavid du Colombier.CW VtEntry
2665e96a66cSDavid du Colombierstructures.
2675e96a66cSDavid du ColombierBy analogy with Venti directories,
2685e96a66cSDavid du Colombierwe call the original data stream a
2695e96a66cSDavid du ColombierVenti file.
2705e96a66cSDavid du ColombierNote that Venti files are assumed
2715e96a66cSDavid du Colombier.I not
2725e96a66cSDavid du Colombierto contain pointers to other Venti blocks.
2735e96a66cSDavid du ColombierThe only pointers to Venti blocks occur in
2745e96a66cSDavid du Colombier.CW VtEntry
2755e96a66cSDavid du Colombierstructures in
2765e96a66cSDavid du ColombierVenti directories
2775e96a66cSDavid du Colombier(and in the internal hash tree structure of the
2785e96a66cSDavid du Colombierindividual files and directories).
2795e96a66cSDavid du ColombierNote also that these directories are nothing more than pointer lists.
2805e96a66cSDavid du ColombierIn particular, there are no names or metadata like in a file system.
2815e96a66cSDavid du Colombier.PP
2825e96a66cSDavid du ColombierTo make it easier to pass hierarchies between applications,
2835e96a66cSDavid du Colombierthe root of a hierarchy is described in a 300-byte structure
2845e96a66cSDavid du Colombiercalled a
2855e96a66cSDavid du Colombier.CW VtRoot :
2865e96a66cSDavid du Colombier.P1
2875e96a66cSDavid du ColombierVtRoot:
2885e96a66cSDavid du Colombier.ta +\w'    'u +\w'                'u
2895e96a66cSDavid du Colombier	version[2]	\f(CW2\fP
2905e96a66cSDavid du Colombier	name[128]	\fRname of structure (just a comment)\fP
2915e96a66cSDavid du Colombier	type[128]	\fRstring describing structure (\f(CWvac\fR)\f(CW
2925e96a66cSDavid du Colombier	score[20]	\fRpointer to \f(CWVtDirType\fP block\f(CW
2935e96a66cSDavid du Colombier	blockSize[2]	\fRmaximum block size in structure\fP
2945e96a66cSDavid du Colombier	prev[20]	\fRprevious \f(CWVtRoot\fP in chain, if any\fP
2955e96a66cSDavid du Colombier.P2
2965e96a66cSDavid du Colombier.LP
2975e96a66cSDavid du ColombierThis structure is stored to Venti and its score is passed
2985e96a66cSDavid du Colombierbetween applications, typically in the form
2995e96a66cSDavid du Colombier``\fItype\f(CW:\fIrootscore\fR,''
3005e96a66cSDavid du Colombierwhere
3015e96a66cSDavid du Colombier.I type
3025e96a66cSDavid du Colombieris the type field from the
3035e96a66cSDavid du Colombier.CW VtRoot
3045e96a66cSDavid du Colombierstructure, and
3055e96a66cSDavid du Colombier.I rootscore
3065e96a66cSDavid du Colombieris the score of the
3075e96a66cSDavid du Colombier.CW VtRoot
3085e96a66cSDavid du Colombierblock.
3095e96a66cSDavid du Colombier.CW VtRoot
3105e96a66cSDavid du Colombierstructures can be chained together using the
3115e96a66cSDavid du Colombier.I prev
3125e96a66cSDavid du Colombierfield to encode an archival history
3135e96a66cSDavid du Colombierof the data structure.
3145e96a66cSDavid du Colombier.PP
3155e96a66cSDavid du ColombierFor example, a small Venti hierarchy might look like:
3165e96a66cSDavid du Colombier.PS
3175e96a66cSDavid du Colombier.ps 8
3185e96a66cSDavid du Colombier.vs 10
3195e96a66cSDavid du Colombierboxwid=0.1
3205e96a66cSDavid du Colombierboxht=0.1
3215e96a66cSDavid du Colombierf=0.9
3225e96a66cSDavid du Colombiermb=0.16
3235e96a66cSDavid du Colombier
3245e96a66cSDavid du ColombierVtRoot: [
3255e96a66cSDavid du Colombier	right
3265e96a66cSDavid du Colombier	B1: box
3275e96a66cSDavid du Colombier	move right 0.1
3285e96a66cSDavid du Colombier	"\f(CWVtRoot\fP" ljust
3295e96a66cSDavid du Colombier]
3305e96a66cSDavid du Colombier
3315e96a66cSDavid du ColombierRoot: [
3325e96a66cSDavid du Colombier	right
3335e96a66cSDavid du Colombier	B1: box fill f
3345e96a66cSDavid du Colombier	B2: box fill f
3355e96a66cSDavid du Colombier	B3: box fill f
3365e96a66cSDavid du Colombier	move right 0.1
3375e96a66cSDavid du Colombier] with .nw at VtRoot.sw+(0.2,-.1)
3385e96a66cSDavid du ColombierLevel1: [
3395e96a66cSDavid du Colombier	RootMeta: [
3405e96a66cSDavid du Colombier		box wid mb
3415e96a66cSDavid du Colombier	]
3425e96a66cSDavid du Colombier	MetaSource: [
3435e96a66cSDavid du Colombier		right
3445e96a66cSDavid du Colombier		B1: box wid 5*mb
3455e96a66cSDavid du Colombier	] with .nw at RootMeta.sw+(0,-.1)
3465e96a66cSDavid du Colombier
3475e96a66cSDavid du Colombier	Source: [
3485e96a66cSDavid du Colombier		right
3495e96a66cSDavid du Colombier		B1: box fill f
3505e96a66cSDavid du Colombier		B2: box fill f
3515e96a66cSDavid du Colombier		B3: box fill f
3525e96a66cSDavid du Colombier		B4: box fill f
3535e96a66cSDavid du Colombier		B5: box fill f
3545e96a66cSDavid du Colombier		B6: box fill f
3555e96a66cSDavid du Colombier		B7: box fill f
3565e96a66cSDavid du Colombier		B8: box fill f
3575e96a66cSDavid du Colombier	] with .nw at MetaSource.sw+(0,-.1)
3585e96a66cSDavid du Colombier	SB1: box invis at Source.B1
3595e96a66cSDavid du Colombier	SB2: box invis at Source.B2
3605e96a66cSDavid du Colombier	SB3: box invis at Source.B3
3615e96a66cSDavid du Colombier] with .nw at Root.sw+(0.4,-.1)
3625e96a66cSDavid du ColombierLevel2: [
3635e96a66cSDavid du Colombier	MetaSource: [
3645e96a66cSDavid du Colombier		right
3655e96a66cSDavid du Colombier		B1: box wid 5*mb
3665e96a66cSDavid du Colombier	]
3675e96a66cSDavid du Colombier	Source: [
3685e96a66cSDavid du Colombier		right
3695e96a66cSDavid du Colombier		B1: box fill f
3705e96a66cSDavid du Colombier		B2: box fill f
3715e96a66cSDavid du Colombier		B3: box fill f
3725e96a66cSDavid du Colombier		B4: box fill f
3735e96a66cSDavid du Colombier		B5: box fill f
3745e96a66cSDavid du Colombier		B6: box fill f
3755e96a66cSDavid du Colombier		B7: box fill f
3765e96a66cSDavid du Colombier		B8: box fill f
3775e96a66cSDavid du Colombier	] with .nw at MetaSource.sw+(0,-.1)
3785e96a66cSDavid du Colombier	File: box wid 0.8 with .nw at Source.sw+(0,-.1)
3795e96a66cSDavid du Colombier] with .nw at Level1.sw+(0.6,-.1)
3805e96a66cSDavid du Colombier
3815e96a66cSDavid du Colombierline -> from VtRoot.B1 down boxwid/2+0.1+boxwid/2 then to Root.w
3825e96a66cSDavid du Colombierline -> from Root.B3 down boxwid/2+0.1+boxwid/2 then to Level1.RootMeta.w
3835e96a66cSDavid du Colombierline -> from Root.B2 down boxwid/2+0.1+boxwid+0.1+boxwid/2 then to Level1.MetaSource.w
3845e96a66cSDavid du Colombierline -> from Root.B1 down boxwid/2+0.1+boxwid+0.1+boxwid+0.1+boxwid/2 then to Level1.Source.w
3855e96a66cSDavid du Colombier
3865e96a66cSDavid du Colombierline -> from Level1.SB3 down boxwid/2+0.1+boxwid/2 then to Level2.MetaSource.w
3875e96a66cSDavid du Colombierline -> from Level1.SB2 down boxwid/2+0.1+boxwid+0.1+boxwid/2 then to Level2.Source.w
3885e96a66cSDavid du Colombierline -> from Level1.SB1 down boxwid/2+0.1+boxwid+0.1+boxwid+0.1+boxwid/2 then to Level2.File.w
3895e96a66cSDavid du Colombier
3905e96a66cSDavid du Colombier[
3915e96a66cSDavid du Colombier	KEY: box wid 1.5 invis "Key"
3925e96a66cSDavid du Colombier	line from KEY.sw to KEY.se
3935e96a66cSDavid du Colombier	k = -.1
3945e96a66cSDavid du Colombier	kk=0.5
3955e96a66cSDavid du Colombier	A: [
3965e96a66cSDavid du Colombier		box wid 4*boxwid
3975e96a66cSDavid du Colombier		"Venti file" ljust with .w at last box .w+(kk,0)
3985e96a66cSDavid du Colombier	] with .nw at KEY.sw+(0,2*k)
3995e96a66cSDavid du Colombier	B: [
4005e96a66cSDavid du Colombier		box fill f
4015e96a66cSDavid du Colombier		"Venti entry (\f(CWVtEntry\fP)" ljust with .w at last box .w+(kk,0)
4025e96a66cSDavid du Colombier	] with .nw at A.sw+(0,k)
4035e96a66cSDavid du Colombier	C: [
4045e96a66cSDavid du Colombier		right
4055e96a66cSDavid du Colombier		CC: box fill f
4065e96a66cSDavid du Colombier		box fill f
4075e96a66cSDavid du Colombier		box fill f
4085e96a66cSDavid du Colombier		box fill f
4095e96a66cSDavid du Colombier		"Venti directory" ljust with .w at CC.w+(kk,0)
4105e96a66cSDavid du Colombier	] with .nw at B.sw+(0,k)
4115e96a66cSDavid du Colombier	D: [
4125e96a66cSDavid du Colombier		line -> right 3*boxwid
4135e96a66cSDavid du Colombier		"Venti pointer (score)" ljust with .w at last line .w+(kk, 0)
4145e96a66cSDavid du Colombier	] with .nw at C.sw+(0,k)
4155e96a66cSDavid du Colombier] with .nw at VtRoot.nw+(3,0)
4165e96a66cSDavid du Colombier.PE
4175e96a66cSDavid du Colombier.LP
4185e96a66cSDavid du ColombierVenti files are shown as white boxes, while directories are shown
4195e96a66cSDavid du Colombieras shaded boxes.  Each shaded square represents a
4205e96a66cSDavid du Colombier.CW VtEntry .
4215e96a66cSDavid du ColombierArrows represent pointers from
4225e96a66cSDavid du Colombier.CW VtEntry
4235e96a66cSDavid du Colombierstructures to other
4245e96a66cSDavid du ColombierVenti files or directories.
4255e96a66cSDavid du Colombier.PP
4265e96a66cSDavid du ColombierThe hierarchical structure provided by Venti files and directories
4275e96a66cSDavid du Colombiercan be used as the base for more complicated data structures.
4285e96a66cSDavid du ColombierBecause this structure captures all the information
4295e96a66cSDavid du Colombierabout pointers to other blocks, tools written to traverse
4305e96a66cSDavid du ColombierVenti hierarchies can traverse the more complicated
4315e96a66cSDavid du Colombierdata structures as well.
4325e96a66cSDavid du ColombierFor example,
4335e96a66cSDavid du Colombier.I venti/copy
4345e96a66cSDavid du Colombier(see
435*8e42ac6bSDavid du Colombier.I venti (1))
4365e96a66cSDavid du Colombiercopies a Venti hierarchy from one Venti server to another,
4375e96a66cSDavid du Colombiergiven the root
4385e96a66cSDavid du Colombier.CW VtEntry .
4395e96a66cSDavid du ColombierBecause the traditional file system described in later sections is
4405e96a66cSDavid du Colombierlayered on a Venti hierarchy,
4415e96a66cSDavid du Colombier.I venti/copy
4425e96a66cSDavid du Colombiercan copy it without fully understanding its structure.
4435e96a66cSDavid du Colombier.NH 1
4445e96a66cSDavid du ColombierVac file system format
4455e96a66cSDavid du Colombier.HP
4465e96a66cSDavid du ColombierThe Venti archive format
4475e96a66cSDavid du Colombier.I vac
4485e96a66cSDavid du Colombierbuilds a traditional file system using a Venti hierarchy.
4495e96a66cSDavid du ColombierEach vac file is implemented as a Venti file;
4505e96a66cSDavid du Colombiereach vac directory is implemented as a Venti
4515e96a66cSDavid du Colombierdirectory and a Venti file to provide traditional file system metadata.
4525e96a66cSDavid du ColombierThe metadata is stored in a structure called a
4535e96a66cSDavid du Colombier.CW DirEntry :
4545e96a66cSDavid du Colombier.P1
4555e96a66cSDavid du ColombierDirEntry:
4565e96a66cSDavid du Colombier.ta +\w'    'u +\w'            'u
4575e96a66cSDavid du Colombier	magic[4]	\f(CW0x1c4d9072\fP (DirMagic)\fP
4585e96a66cSDavid du Colombier	version[2]	\f(CW9\fP
4595e96a66cSDavid du Colombier	elem[s]	\fRname (final path element only)\fP
4605e96a66cSDavid du Colombier	entry[4]	\fRentry number for Venti file or directory\fP
4615e96a66cSDavid du Colombier	gen[4]	\fRgeneration number\fP
4625e96a66cSDavid du Colombier	mentry[4]	\fRentry number for Venti file holding metadata\fP
4635e96a66cSDavid du Colombier	mgen[4]	\fRgeneration number\fP
4645e96a66cSDavid du Colombier	qid[8]	\fRunique file serial number\fP
4655e96a66cSDavid du Colombier	uid[s]	\fRowner\fP
4665e96a66cSDavid du Colombier	gid[s]	\fRgroup\fP
4675e96a66cSDavid du Colombier	mid[s]	\fRlast modified by\fP
4685e96a66cSDavid du Colombier	mtime[4]	\fRlast modification time\fP
4695e96a66cSDavid du Colombier	ctime[4]	\fRcreation time\fP
4705e96a66cSDavid du Colombier	atime[4]	\fRlast access time\fP
4715e96a66cSDavid du Colombier	mode[4]	\fRmode bits\fP
4725e96a66cSDavid du Colombier.P2
4735e96a66cSDavid du ColombierThe notation
4745e96a66cSDavid du Colombier.CW name[s]
4755e96a66cSDavid du Colombierdenotes a string stored as a two-byte length
4765e96a66cSDavid du Colombierand then that many bytes.
4775e96a66cSDavid du ColombierThe above describes Version 9 of the
4785e96a66cSDavid du Colombier.CW DirEntry
4795e96a66cSDavid du Colombierformat.  Versions 7 and 8 are very similar; they can be
4805e96a66cSDavid du Colombierread by the current
4815e96a66cSDavid du Colombier.I vac
4825e96a66cSDavid du Colombiersource code but are not written.
4835e96a66cSDavid du ColombierEarlier versions were not widespread.
4845e96a66cSDavid du ColombierA
4855e96a66cSDavid du Colombier.CW DirEntry
4865e96a66cSDavid du Colombiermay be followed by optional extension sections, though none
4875e96a66cSDavid du Colombierare currently used.
4885e96a66cSDavid du ColombierThe
4895e96a66cSDavid du Colombier.CW mode
4905e96a66cSDavid du Colombierbits include bits commonly used by
4915e96a66cSDavid du ColombierUnix and Windows, in addition to those used by Plan 9.
4925e96a66cSDavid du Colombier.PP
4935e96a66cSDavid du ColombierThe
4945e96a66cSDavid du Colombier.CW entry
4955e96a66cSDavid du Colombierfield is an index into the parallel Venti directory.
4965e96a66cSDavid du ColombierThe
4975e96a66cSDavid du Colombier.CW gen
4985e96a66cSDavid du Colombierfield must match the
4995e96a66cSDavid du Colombier.CW gen
5005e96a66cSDavid du Colombierfield in the corresponding
5015e96a66cSDavid du Colombier.CW VtEntry
5025e96a66cSDavid du Colombierin the directory;
5035e96a66cSDavid du Colombierit is used to detect
5045e96a66cSDavid du Colombierstale indices.
5055e96a66cSDavid du ColombierSimilarly,
5065e96a66cSDavid du Colombier.CW mentry
5075e96a66cSDavid du Colombierand
5085e96a66cSDavid du Colombier.CW mgen
5095e96a66cSDavid du Colombierare the index and generation number
5105e96a66cSDavid du Colombierfor the metadata Venti file,
5115e96a66cSDavid du Colombierif the
5125e96a66cSDavid du Colombier.CW DirEntry
5135e96a66cSDavid du Colombierdescribes a vac directory.
5145e96a66cSDavid du Colombier.PP
5155e96a66cSDavid du ColombierThe relation between Venti files and directories and
5165e96a66cSDavid du Colombiervac files and directories can be seen in this figure:
5175e96a66cSDavid du Colombier.PS
5185e96a66cSDavid du Colombier.ps 8
5195e96a66cSDavid du Colombier.vs 10
5205e96a66cSDavid du Colombierboxwid=0.1
5215e96a66cSDavid du Colombierboxht=0.1
5225e96a66cSDavid du Colombierf=0.9
5235e96a66cSDavid du Colombiermb=0.16
5245e96a66cSDavid du Colombier
5255e96a66cSDavid du ColombierVtRoot: [
5265e96a66cSDavid du Colombier	right
5275e96a66cSDavid du Colombier	B1: box
5285e96a66cSDavid du Colombier	move right 0.1
5295e96a66cSDavid du Colombier	"\f(CWVtRoot\fP" ljust
5305e96a66cSDavid du Colombier]
5315e96a66cSDavid du Colombier
5325e96a66cSDavid du ColombierSuperRoot: [
5335e96a66cSDavid du Colombier	right
5345e96a66cSDavid du Colombier	B1: box fill f
5355e96a66cSDavid du Colombier	move right 0.1
5365e96a66cSDavid du Colombier	"fs root block" ljust
5375e96a66cSDavid du Colombier] with .nw at VtRoot.sw + (0.2, -.2)
5385e96a66cSDavid du ColombierRoot: [
5395e96a66cSDavid du Colombier	right
5405e96a66cSDavid du Colombier	B1: box fill f
5415e96a66cSDavid du Colombier	B2: box fill f
5425e96a66cSDavid du Colombier	B3: box fill f
5435e96a66cSDavid du Colombier	move right 0.1
5445e96a66cSDavid du Colombier	"root directory info block" ljust
5455e96a66cSDavid du Colombier] with .nw at SuperRoot.sw+(0.2, -.2)
5465e96a66cSDavid du ColombierLevel1: [
5475e96a66cSDavid du Colombier	RootMeta: [
5485e96a66cSDavid du Colombier		box wid mb
5495e96a66cSDavid du Colombier		move right 0.1
5505e96a66cSDavid du Colombier		"root metadata" ljust
5515e96a66cSDavid du Colombier	]
5525e96a66cSDavid du Colombier	MetaSource: [
5535e96a66cSDavid du Colombier		right
5545e96a66cSDavid du Colombier		B1: box wid mb
5555e96a66cSDavid du Colombier		B2: box wid mb
5565e96a66cSDavid du Colombier		B3: box wid mb
5575e96a66cSDavid du Colombier		B4: box wid mb
5585e96a66cSDavid du Colombier		B5: box wid mb
5595e96a66cSDavid du Colombier	] with .nw at RootMeta.sw+(0,-.2)
5605e96a66cSDavid du Colombier	MB1: box wid mb invis at MetaSource.B1
5615e96a66cSDavid du Colombier	MB2: box wid mb invis at MetaSource.B2
5625e96a66cSDavid du Colombier	MB3: box wid mb invis at MetaSource.B3
5635e96a66cSDavid du Colombier	MB4: box wid mb invis at MetaSource.B4
5645e96a66cSDavid du Colombier	MB5: box wid mb invis at MetaSource.B5
5655e96a66cSDavid du Colombier
5665e96a66cSDavid du Colombier	Source: [
5675e96a66cSDavid du Colombier		right
5685e96a66cSDavid du Colombier		B1: box fill f
5695e96a66cSDavid du Colombier		B2: box fill f
5705e96a66cSDavid du Colombier		B3: box fill f
5715e96a66cSDavid du Colombier		B4: box fill f
5725e96a66cSDavid du Colombier		B5: box fill f
5735e96a66cSDavid du Colombier		B6: box fill f
5745e96a66cSDavid du Colombier		B7: box fill f
5755e96a66cSDavid du Colombier		B8: box fill f
5765e96a66cSDavid du Colombier	] with .nw at MetaSource.sw+(0,-.1)
5775e96a66cSDavid du Colombier	SB1: box invis at Source.B1
5785e96a66cSDavid du Colombier	SB2: box invis at Source.B2
5795e96a66cSDavid du Colombier	SB3: box invis at Source.B3
5805e96a66cSDavid du Colombier	SB4: box invis at Source.B4
5815e96a66cSDavid du Colombier	SB5: box invis at Source.B5
5825e96a66cSDavid du Colombier	SB6: box invis at Source.B6
5835e96a66cSDavid du Colombier	SB7: box invis at Source.B7
5845e96a66cSDavid du Colombier	SB8: box invis at Source.B8
5855e96a66cSDavid du Colombier] with .nw at Root.sw+(0.4,-.2)
5865e96a66cSDavid du ColombierLevel2: [
5875e96a66cSDavid du Colombier	MetaSource: [
5885e96a66cSDavid du Colombier		right
5895e96a66cSDavid du Colombier		B1: box wid mb
5905e96a66cSDavid du Colombier		B2: box wid mb
5915e96a66cSDavid du Colombier		B3: box wid mb
5925e96a66cSDavid du Colombier		B4: box wid mb
5935e96a66cSDavid du Colombier		B5: box wid mb
5945e96a66cSDavid du Colombier	]
5955e96a66cSDavid du Colombier	Source: [
5965e96a66cSDavid du Colombier		right
5975e96a66cSDavid du Colombier		B1: box fill f
5985e96a66cSDavid du Colombier		B2: box fill f
5995e96a66cSDavid du Colombier		B3: box fill f
6005e96a66cSDavid du Colombier		B4: box fill f
6015e96a66cSDavid du Colombier		B5: box fill f
6025e96a66cSDavid du Colombier		B6: box fill f
6035e96a66cSDavid du Colombier		B7: box fill f
6045e96a66cSDavid du Colombier		B8: box fill f
6055e96a66cSDavid du Colombier	] with .nw at MetaSource.sw+(0,-.1)
6065e96a66cSDavid du Colombier	File: box wid 0.8 with .nw at Source.sw+(0,-.2)
6075e96a66cSDavid du Colombier] with .nw at Level1.sw+(0.6,-.2)
6085e96a66cSDavid du Colombier
6095e96a66cSDavid du Colombierline -> from VtRoot.B1 down boxwid/2+0.2+boxwid/2 then to SuperRoot.w
6105e96a66cSDavid du Colombierline -> from SuperRoot.B1 down boxwid/2+0.2+boxwid/2 then to Root.w
6115e96a66cSDavid du Colombierline -> from Root.B3 down boxwid/2+0.2+boxwid/2 then to Level1.RootMeta.w
6125e96a66cSDavid du Colombierline -> from Root.B2 down boxwid/2+0.2+boxwid+0.2+boxwid/2 then to Level1.MetaSource.w
6135e96a66cSDavid du Colombierline -> from Root.B1 down boxwid/2+0.2+boxwid+0.1+boxwid+0.2+boxwid/2 then to Level1.Source.w
6145e96a66cSDavid du Colombier
6155e96a66cSDavid du Colombierline -> from Level1.SB3 down boxwid/2+0.2+boxwid/2 then to Level2.MetaSource.w
6165e96a66cSDavid du Colombierline -> from Level1.SB2 down boxwid/2+0.2+boxwid+0.1+boxwid/2 then to Level2.Source.w
6175e96a66cSDavid du Colombierline -> from Level1.SB1 down boxwid/2+0.2+boxwid+0.1+boxwid+0.2+boxwid/2 then to Level2.File.w
6185e96a66cSDavid du Colombier
6195e96a66cSDavid du Colombierarrowwid = arrowwid/2
6205e96a66cSDavid du Colombierarrowht = arrowht/2
6215e96a66cSDavid du Colombierline -> from Level1.MB1 to Level1.SB1.n
6225e96a66cSDavid du Colombierline -> from Level1.MB2 to Level1.SB2.n
6235e96a66cSDavid du Colombierline -> from Level1.MB2 to Level1.SB3.n
6245e96a66cSDavid du Colombierline -> from Level1.MB4 to Level1.SB7.n
6255e96a66cSDavid du Colombierline -> from Level1.MB5 to Level1.SB5.n
6265e96a66cSDavid du Colombierarrowwid = arrowwid * 2
6275e96a66cSDavid du Colombierarrowht = arrowht * 2
6285e96a66cSDavid du Colombier
6295e96a66cSDavid du Colombierbox dashed with .nw at Level1.MetaSource.nw+(-.05,.05) wid 0.8+.05*2 ht .3+.05*2
6305e96a66cSDavid du Colombierbox dashed with .nw at Level2.MetaSource.nw+(-.05,.05) wid 0.8+.05*2 ht .3+.05*2
6315e96a66cSDavid du Colombierbox dotted with .nw at Level2.File.nw+(-.05,.05) wid 0.8+0.05*2 ht .1+.05*2
6325e96a66cSDavid du Colombier
6335e96a66cSDavid du Colombier[
6345e96a66cSDavid du Colombier	KEY: box wid 1.5 invis "Key"
6355e96a66cSDavid du Colombier	line from KEY.sw to KEY.se
6365e96a66cSDavid du Colombier	k = -.1
6375e96a66cSDavid du Colombier	kk=0.5
6385e96a66cSDavid du Colombier	A: [
6395e96a66cSDavid du Colombier		box wid 4*boxwid
6405e96a66cSDavid du Colombier		"Venti file" ljust with .w at last box .w+(kk,0)
6415e96a66cSDavid du Colombier	] with .nw at KEY.sw+(0,2*k)
6425e96a66cSDavid du Colombier	B: [
6435e96a66cSDavid du Colombier		box fill f
6445e96a66cSDavid du Colombier		"Venti entry (\f(CWEntry\fP)" ljust with .w at last box .w+(kk,0)
6455e96a66cSDavid du Colombier	] with .nw at A.sw+(0,k)
6465e96a66cSDavid du Colombier	C: [
6475e96a66cSDavid du Colombier		right
6485e96a66cSDavid du Colombier		CC: box fill f
6495e96a66cSDavid du Colombier		box fill f
6505e96a66cSDavid du Colombier		box fill f
6515e96a66cSDavid du Colombier		box fill f
6525e96a66cSDavid du Colombier		"Venti directory" ljust with .w at CC.w+(kk,0)
6535e96a66cSDavid du Colombier	] with .nw at B.sw+(0,k)
6545e96a66cSDavid du Colombier	D: [
6555e96a66cSDavid du Colombier		line -> right 3*boxwid
6565e96a66cSDavid du Colombier		"Venti pointer (score)" ljust with .w at last line .w+(kk, 0)
6575e96a66cSDavid du Colombier	] with .nw at C.sw+(0,k)
6585e96a66cSDavid du Colombier	DD: [
6595e96a66cSDavid du Colombier		box dotted wid 4*boxwid
6605e96a66cSDavid du Colombier		"Vac file" ljust with .w at last box .w+(kk,0)
6615e96a66cSDavid du Colombier	] with .nw at D.sw+(0,k)
6625e96a66cSDavid du Colombier	E: [
6635e96a66cSDavid du Colombier		box wid mb
6645e96a66cSDavid du Colombier		"Vac entry (\f(CWDirEntry\fP)" ljust with .w at last box .w+(kk,0)
6655e96a66cSDavid du Colombier	] with .nw at DD.sw+(0,k)
6665e96a66cSDavid du Colombier	G: [
6675e96a66cSDavid du Colombier		box dashed wid 4*boxwid
6685e96a66cSDavid du Colombier		"Vac directory" ljust with .w at last box .w+(kk,0)
6695e96a66cSDavid du Colombier	] with .nw at E.sw+(0,k)
6705e96a66cSDavid du Colombier	H: [
6715e96a66cSDavid du Colombier		arrowwid = arrowwid/2
6725e96a66cSDavid du Colombier		arrowht = arrowht/2
6735e96a66cSDavid du Colombier		line -> right 1.5*boxwid
6745e96a66cSDavid du Colombier		"Vac pointer (integer index)" ljust with .w at last line .w+(kk, 0)
6755e96a66cSDavid du Colombier		arrowwid = arrowwid * 2
6765e96a66cSDavid du Colombier		arrowht = arrowht * 2
6775e96a66cSDavid du Colombier	] with .nw at G.sw+(0,k)
6785e96a66cSDavid du Colombier] with .nw at VtRoot.nw+(3,0)
6795e96a66cSDavid du Colombier.PE
6805e96a66cSDavid du Colombier.LP
6815e96a66cSDavid du ColombierIn reality, the story is slightly more complicated.
6825e96a66cSDavid du ColombierThe metadata file in a Vac directory
6835e96a66cSDavid du Colombieris not just the concatenation of
6845e96a66cSDavid du Colombier.CW DirEntry
6855e96a66cSDavid du Colombierstructures.
6865e96a66cSDavid du ColombierInstead, it is the concatenation of
6875e96a66cSDavid du Colombier.CW MetaBlocks .
6885e96a66cSDavid du ColombierA
6895e96a66cSDavid du Colombier.CW MetaBlock
6905e96a66cSDavid du Colombiercontains some number of
6915e96a66cSDavid du Colombier.CW DirEntry
6925e96a66cSDavid du Colombierstructures along with a sorted index to make it easy
6935e96a66cSDavid du Colombierto look for a particular
6945e96a66cSDavid du Colombier.CW DirEntry
6955e96a66cSDavid du Colombierby its
6965e96a66cSDavid du Colombier.CW elem
6975e96a66cSDavid du Colombierfield.
6985e96a66cSDavid du ColombierThe details are in the source code.
6995e96a66cSDavid du Colombier.PP
7005e96a66cSDavid du ColombierAs shown in the diagram,
7015e96a66cSDavid du Colombierthe root directory of the file system is summarized by
7025e96a66cSDavid du Colombierthree
7035e96a66cSDavid du Colombier.CW VtEntry
7045e96a66cSDavid du Colombierstructures describing
7055e96a66cSDavid du Colombierthe Venti directory for the children of the root,
7065e96a66cSDavid du Colombierthe Venti file for the metadata describing the children of the root,
7075e96a66cSDavid du Colombierand a Venti file holding metadata for the root directory itself.
7085e96a66cSDavid du ColombierThese
7095e96a66cSDavid du Colombier.CW VtEntry
7105e96a66cSDavid du Colombierstructures are placed in a Venti directory of their own,
7115e96a66cSDavid du Colombierdescribed by the single
7125e96a66cSDavid du Colombier.CW VtEntry
7135e96a66cSDavid du Colombierin the
7145e96a66cSDavid du Colombierroot block.
7155e96a66cSDavid du Colombier.NH 1
7165e96a66cSDavid du ColombierFossil file system format
7175e96a66cSDavid du Colombier.HP
7185e96a66cSDavid du ColombierFossil uses the vac format, with some small changes.
7195e96a66cSDavid du ColombierThe changes only affect the data on the local disk; the data
7205e96a66cSDavid du Colombierarchived to Venti is exactly in vac format.
7215e96a66cSDavid du Colombier.PP
7225e96a66cSDavid du ColombierBlocks stored on local disk may contain scores pointing at local disk
7235e96a66cSDavid du Colombierblocks or at Venti blocks.
7245e96a66cSDavid du ColombierLocal block addresses are stored as 20-byte scores in which the first 16 bytes
7255e96a66cSDavid du Colombierare all zero and the last 4 bytes specify a block number in the disk.
7265e96a66cSDavid du ColombierBefore a block is archived, all the
7275e96a66cSDavid du Colombierblocks it points to must be archived, and the local scores in the block
7285e96a66cSDavid du Colombiermust be changed to Venti scores.
7295e96a66cSDavid du ColombierUsing block addresses rather than content hashes for local data
7305e96a66cSDavid du Colombiermakes the local file system easier to manage: if a local block's contents
7315e96a66cSDavid du Colombierchange, the pointer to the block does not need to change.
7325e96a66cSDavid du Colombier.NH 2
7335e96a66cSDavid du ColombierSnapshots
7345e96a66cSDavid du Colombier.HP
7355e96a66cSDavid du ColombierFossil is an archival file server.
7365e96a66cSDavid du ColombierIt takes periodic snapshots of the file system,
7375e96a66cSDavid du Colombierwhich are made accessible through the file system.
7385e96a66cSDavid du ColombierSpecifically, the active file system is presented in
7395e96a66cSDavid du Colombier.CW /active .
7405e96a66cSDavid du ColombierEphemeral snapshots (those that are kept on local disk and eventually deleted)
7415e96a66cSDavid du Colombierare presented in
7425e96a66cSDavid du Colombier\f(CW/snapshot/\fIyyyy\f(CW/\fImmdd\f(CW/\fIhhmm\fR,
7435e96a66cSDavid du Colombierwhere
7445e96a66cSDavid du Colombier.I yyyy
7455e96a66cSDavid du Colombieris the full year,
7465e96a66cSDavid du Colombier.I mm
7475e96a66cSDavid du Colombieris the month number,
7485e96a66cSDavid du Colombier.I dd
7495e96a66cSDavid du Colombieris the day number,
7505e96a66cSDavid du Colombier.I hh
7515e96a66cSDavid du Colombieris the hour,
7525e96a66cSDavid du Colombierand
7535e96a66cSDavid du Colombier.I mm
7545e96a66cSDavid du Colombieris the minute.
7555e96a66cSDavid du ColombierArchival snapshots (those that are archived to Venti and persist forever)
7565e96a66cSDavid du Colombierare presented in
7575e96a66cSDavid du Colombier\f(CW/archive/\fIyyyy\f(CW/\fImmdds\fR,
7585e96a66cSDavid du Colombierwhere
7595e96a66cSDavid du Colombier.I yyyy ,
7605e96a66cSDavid du Colombier.I mm ,
7615e96a66cSDavid du Colombierand
7625e96a66cSDavid du Colombier.I dd
7635e96a66cSDavid du Colombierare year, month, and day as before,
7645e96a66cSDavid du Colombierand
7655e96a66cSDavid du Colombier.I s
7665e96a66cSDavid du Colombieris a sequence number if more than one
7675e96a66cSDavid du Colombierarchival snapshot is done in a day.
7685e96a66cSDavid du ColombierFor the first snapshot,
7695e96a66cSDavid du Colombier.I s
7705e96a66cSDavid du Colombieris null.
7715e96a66cSDavid du ColombierFor the subsequent snapshots,
7725e96a66cSDavid du Colombier.I s
7735e96a66cSDavid du Colombieris
7745e96a66cSDavid du Colombier.CW .1 ,
7755e96a66cSDavid du Colombier.CW .2 ,
7765e96a66cSDavid du Colombier.CW .3 ,
7775e96a66cSDavid du Colombieretc.
7785e96a66cSDavid du Colombier.PP
7795e96a66cSDavid du ColombierTo implement the snapshots, the file server maintains a
7805e96a66cSDavid du Colombiercurrent
7815e96a66cSDavid du Colombier.I epoch
7825e96a66cSDavid du Colombierfor the active file system.
7835e96a66cSDavid du ColombierEach local block has a label that records, among other things,
7845e96a66cSDavid du Colombierthe epoch in which the block was allocated.
7855e96a66cSDavid du ColombierIf a block was allocated in an epoch earlier than the current one,
7865e96a66cSDavid du Colombierit is immutable and treated as copy-on-write.
7875e96a66cSDavid du ColombierTaking a snapshot can be accomplished by
7885e96a66cSDavid du Colombierrecording the address of the current root block and then
7895e96a66cSDavid du Colombierincrementing the epoch number.
7905e96a66cSDavid du ColombierNotice that the copy-on-write method makes
7915e96a66cSDavid du Colombiersnapshots both time efficient and space efficient.
7925e96a66cSDavid du ColombierThe only time cost is waiting for all current file system
7935e96a66cSDavid du Colombierrequests to finish and then incrementing a counter.
7945e96a66cSDavid du ColombierAfter a snapshot, blocks only get copied when they are
7955e96a66cSDavid du Colombiernext modified, so the per-snapshot
7965e96a66cSDavid du Colombierspace requirement is proportional
7975e96a66cSDavid du Colombierto the amount of new data rather than the total
7985e96a66cSDavid du Colombiersize of the file system.
7995e96a66cSDavid du Colombier.PP
8005e96a66cSDavid du ColombierThe blocks in the archival snapshots are moved to Venti,
8015e96a66cSDavid du Colombierbut the blocks in the ephemeral snapshots take up space
8025e96a66cSDavid du Colombierin the local disk file.
8035e96a66cSDavid du ColombierTo allow reclamation of this disk space, the file system
8045e96a66cSDavid du Colombiermaintains a
8055e96a66cSDavid du Colombier.I low
8065e96a66cSDavid du Colombier.I epoch ,
8075e96a66cSDavid du Colombierwhich is the epoch of the earliest ephemeral snapshot
8085e96a66cSDavid du Colombierstill available.
8095e96a66cSDavid du ColombierFossil only allows access to snapshots with epoch numbers
8105e96a66cSDavid du Colombierbetween the
8115e96a66cSDavid du Colombierlow epoch and the current epoch
8125e96a66cSDavid du Colombier(also called the high epoch).
8135e96a66cSDavid du ColombierIncrementing the low epoch thus makes old
8145e96a66cSDavid du Colombiersnapshots inaccessible.
8155e96a66cSDavid du ColombierThe space required to store those snapshots can then
8165e96a66cSDavid du Colombierbe reclaimed, as described below.
8175e96a66cSDavid du Colombier.NH 2
8185e96a66cSDavid du ColombierLocal blocks
8195e96a66cSDavid du Colombier.HP
8205e96a66cSDavid du ColombierThe bulk of the local disk file is the local blocks.
8215e96a66cSDavid du ColombierEach block has a 14-byte label associated with it, of the format:
8225e96a66cSDavid du Colombier.P1
8235e96a66cSDavid du ColombierLabel:
8245e96a66cSDavid du Colombier.ta +\w'    'u +\w'                'u
8255e96a66cSDavid du Colombier	state[1]	\fRblock state\fP
8265e96a66cSDavid du Colombier	type[1]	\fRblock type\fP
8275e96a66cSDavid du Colombier	epoch[4]	\fRallocation epoch\fP
8285e96a66cSDavid du Colombier	epochClose[4]	\fRclose epoch\fP
8295e96a66cSDavid du Colombier	tag[4]	\fRrandom tag\fP
8305e96a66cSDavid du Colombier.P2
8315e96a66cSDavid du Colombier.LP
8325e96a66cSDavid du ColombierThe
8335e96a66cSDavid du Colombier.CW type
8345e96a66cSDavid du Colombieris an analogue of the block types described earlier,
8355e96a66cSDavid du Colombierthough different names are used, to distinguish between
8365e96a66cSDavid du Colombierpointers blocks in a hash tree for a data stream
8375e96a66cSDavid du Colombierand pointer blocks for a directory stream.
8385e96a66cSDavid du ColombierThe
8395e96a66cSDavid du Colombier.CW epoch
8405e96a66cSDavid du Colombierwas mentioned in the last section.
8415e96a66cSDavid du ColombierThe other fields are explained below.
8425e96a66cSDavid du Colombier.PP
8435e96a66cSDavid du ColombierThere are two distinguished blocks states
8445e96a66cSDavid du Colombier.CW BsFree
8455e96a66cSDavid du Colombier.CW 0x00 ) (
8465e96a66cSDavid du Colombierand
8475e96a66cSDavid du Colombier.CW BsBad
8485e96a66cSDavid du Colombier.CW 0xFF ), (
8495e96a66cSDavid du Colombierwhich mark blocks that are available for allocation
8505e96a66cSDavid du Colombierand blocks that are bad and should be avoided.
8515e96a66cSDavid du ColombierIf
8525e96a66cSDavid du Colombier.CW state
8535e96a66cSDavid du Colombieris not one of these values, it is a bitwise
8545e96a66cSDavid du Colombier.I or ' `
8555e96a66cSDavid du Colombierof the following flags:
8565e96a66cSDavid du Colombier.P1
8575e96a66cSDavid du Colombier.ta +\w'      'u +\w'                'u
8585e96a66cSDavid du Colombier0x01	BsAlloc	\fRblock is in use\fP
8595e96a66cSDavid du Colombier0x02	BsCopied	\fRblock has been copied\fP
8605e96a66cSDavid du Colombier0x04	BsVenti	\fRblock has been stored on Venti\fP
8615e96a66cSDavid du Colombier0x08	BsClosed	\fRblock has been unlinked from active file system\fP
8625e96a66cSDavid du Colombier.P2
8635e96a66cSDavid du Colombier.LP
8645e96a66cSDavid du ColombierThe flags are explained as they arise in the discussions below.
8655e96a66cSDavid du Colombier.PP
8665e96a66cSDavid du ColombierIt is convenient to store some extra fields in the
8675e96a66cSDavid du Colombier.CW VtEntry
8685e96a66cSDavid du Colombierstructure when it describes a Venti file or directory
8695e96a66cSDavid du Colombierstored on local disk.
8705e96a66cSDavid du ColombierSpecifically, we set the
8715e96a66cSDavid du Colombier.CW VtEntryLocal
8725e96a66cSDavid du Colombierflag bit
8735e96a66cSDavid du Colombierand then use the bytes 7-16 of the score (which would
8745e96a66cSDavid du Colombierotherwise be zero, since it is a local score) to hold these fields:
8755e96a66cSDavid du Colombier.P1
8765e96a66cSDavid du Colombier.ta +\w'    'u +\w'                'u
8775e96a66cSDavid du Colombier	archive[1]	\fRboolean: this is an archival snapshot\fP
8785e96a66cSDavid du Colombier	snap[4]	\fRepoch number if root of snapshot\fP
8795e96a66cSDavid du Colombier	tag[4]	\fRrandom tag\fP
8805e96a66cSDavid du Colombier.P2
8815e96a66cSDavid du Colombier.LP
8825e96a66cSDavid du ColombierThe extended
8835e96a66cSDavid du Colombier.CW VtEntry
8845e96a66cSDavid du Colombierstructure is called an
8855e96a66cSDavid du Colombier.CW Entry .
8865e96a66cSDavid du ColombierThe
8875e96a66cSDavid du Colombier.CW tag
8885e96a66cSDavid du Colombierfield
8895e96a66cSDavid du Colombierin the
8905e96a66cSDavid du Colombier.CW Label
8915e96a66cSDavid du Colombierand the
8925e96a66cSDavid du Colombier.CW Entry
8935e96a66cSDavid du Colombieris used to identify dangling pointers or other file system corruption:
8945e96a66cSDavid du Colombierall the local blocks in a hash tree must
8955e96a66cSDavid du Colombierhave tags matching the tag in the
8965e96a66cSDavid du Colombier.CW Entry .
8975e96a66cSDavid du ColombierIf this
8985e96a66cSDavid du Colombier.CW Entry
8995e96a66cSDavid du Colombierpoints at the root of a snapshot,
9005e96a66cSDavid du Colombierthe
9015e96a66cSDavid du Colombier.CW snap
9025e96a66cSDavid du Colombierfield is the epoch of the snapshot.
9035e96a66cSDavid du ColombierIf the snapshot is intended to be archived to Venti,
9045e96a66cSDavid du Colombierthe
9055e96a66cSDavid du Colombier.CW archive
9065e96a66cSDavid du Colombierfield is non-zero.
9075e96a66cSDavid du Colombier.NH 2
9085e96a66cSDavid du ColombierBlock reclamation
9095e96a66cSDavid du Colombier.HP
9105e96a66cSDavid du ColombierThe blocks in the active file system form a tree: each
9115e96a66cSDavid du Colombierblock has only one parent.
9125e96a66cSDavid du ColombierOnce a copy-on-write block
9135e96a66cSDavid du Colombier.I b
9145e96a66cSDavid du Colombieris replaced by its copy, it is no longer
9155e96a66cSDavid du Colombierneeded by the active file system.
9165e96a66cSDavid du ColombierAt this point,
9175e96a66cSDavid du Colombier.I b
9185e96a66cSDavid du Colombieris unlinked from the active file system.
9195e96a66cSDavid du ColombierWe say that
9205e96a66cSDavid du Colombier.I b
9215e96a66cSDavid du Colombieris now
9225e96a66cSDavid du Colombier.I closed :
9235e96a66cSDavid du Colombierit is needed only for snapshots.
9245e96a66cSDavid du ColombierWhen a block is closed, the
9255e96a66cSDavid du Colombier.CW BsClosed
9265e96a66cSDavid du Colombierbit is set in its state, and the current epoch (called the block's closing epoch)
9275e96a66cSDavid du Colombieris stored in the
9285e96a66cSDavid du Colombier.CW epochClose
9295e96a66cSDavid du Colombierlabel field.
9305e96a66cSDavid du Colombier(Open blocks have an
9315e96a66cSDavid du Colombier.CW epochClose
9325e96a66cSDavid du Colombierof
9335e96a66cSDavid du Colombier.CW ~0 ).
9345e96a66cSDavid du Colombier.PP
9355e96a66cSDavid du ColombierA block is referenced by snapshots with epochs
9365e96a66cSDavid du Colombierbetween the block's allocation epoch and its closing epoch.
9375e96a66cSDavid du ColombierOnce the file system's low epoch grows to be greater than or equal to the block's
9385e96a66cSDavid du Colombierclosing epoch, the block is no longer needed for any snapshots
9395e96a66cSDavid du Colombierand can be reused.
9405e96a66cSDavid du Colombier.PP
9415e96a66cSDavid du ColombierIn a typical configuration, where nightly archival snapshots
9425e96a66cSDavid du Colombierare taken and written to Venti, it is desirable to reclaim
9435e96a66cSDavid du Colombierthe space occupied by now-archived blocks if possible.
9445e96a66cSDavid du ColombierTo do this, Fossil keeps track of whether the pointers
9455e96a66cSDavid du Colombierin each block are unique to that block.
9465e96a66cSDavid du ColombierWhen a block
9475e96a66cSDavid du Colombier.I bb
9485e96a66cSDavid du Colombieris allocated, a pointer to
9495e96a66cSDavid du Colombier.I bb
9505e96a66cSDavid du Colombieris written into exactly one active block (say,
9515e96a66cSDavid du Colombier.I b ).
9525e96a66cSDavid du ColombierIn the absence of snapshots, the pointer to
9535e96a66cSDavid du Colombier.I bb
9545e96a66cSDavid du Colombierwill remain unique to
9555e96a66cSDavid du Colombier.I b ,
9565e96a66cSDavid du Colombierso that if the pointer is zeroed,
9575e96a66cSDavid du Colombier.I bb
9585e96a66cSDavid du Colombiercan be immediately reused.
9595e96a66cSDavid du ColombierSnapshots complicate this invariant:
9605e96a66cSDavid du Colombierwhen
9615e96a66cSDavid du Colombier.I b
9625e96a66cSDavid du Colombieris copied-on-write, all its pointers
9635e96a66cSDavid du Colombierare no longer unique to it.
9645e96a66cSDavid du ColombierAt time of the copy, the
9655e96a66cSDavid du Colombier.CW BsCopied
9665e96a66cSDavid du Colombierstate bit in the block's label
9675e96a66cSDavid du Colombieris set to note the duplication of the pointers contained within.
9685e96a66cSDavid du Colombier.NH 2
9695e96a66cSDavid du ColombierDisk layout
9705e96a66cSDavid du Colombier.HP
9715e96a66cSDavid du ColombierThe file system header describes the file system layout and has this format:
9725e96a66cSDavid du Colombier.P1
9735e96a66cSDavid du Colombier.ta +\w'    'u +\w'                'u
9745e96a66cSDavid du ColombierHeader:
9755e96a66cSDavid du Colombier	magic[4]	\fR0x3776AE89 (HeaderMagic)\fP
9765e96a66cSDavid du Colombier	version[2]	\fR1 (HeaderVersion)\fP
9775e96a66cSDavid du Colombier	blockSize[2]	\fIfile system block size\fP
9785e96a66cSDavid du Colombier	super[4]	\fRblock offset of super block\fP
9795e96a66cSDavid du Colombier	label[4]	\fRblock offset of labels\fP
9805e96a66cSDavid du Colombier	data[4]	\fRdata blocks\fP
9815e96a66cSDavid du Colombier	end[4]	\fRend of file system\fP
9825e96a66cSDavid du Colombier.P2
9835e96a66cSDavid du Colombier.LP
9845e96a66cSDavid du ColombierThe corresponding file system layout is:
9855e96a66cSDavid du Colombier.PS
9865e96a66cSDavid du Colombier.ps 8
9875e96a66cSDavid du Colombier.vs 9
9885e96a66cSDavid du Colombierboxwid=0.75
9895e96a66cSDavid du Colombierboxht=0.15
9905e96a66cSDavid du ColombierEmpty: box "empty" ht 0.25
9915e96a66cSDavid du ColombierHeader: box "header" with .n at Empty.s
9925e96a66cSDavid du ColombierEmpty2: box "empty" with .n at Header.s
9935e96a66cSDavid du ColombierSuper: box "super block" with .n at Empty2.s
9945e96a66cSDavid du ColombierLabel: box "label" "blocks" with .n at Super.s ht 0.25
9955e96a66cSDavid du ColombierData: box "data" "blocks" with .n at Label.s ht 0.3
9965e96a66cSDavid du Colombier"  0" ljust at Empty.ne
9975e96a66cSDavid du Colombier"  128kB" ljust at Header.ne
9985e96a66cSDavid du Colombier"  \f5super\fP \(mu \f(CWblockSize\fP" ljust at Super.ne
9995e96a66cSDavid du Colombier"  \f5label\fP \(mu \f(CWblockSize\fP" ljust at Label.ne
10005e96a66cSDavid du Colombier"  \f5data\fP \(mu \f(CWblockSize\fP" ljust at Data.ne
10015e96a66cSDavid du Colombier"  \f5end\fP \(mu \f(CWblockSize\fP" ljust at Data.se
10025e96a66cSDavid du Colombier"" at (-1,0)
10035e96a66cSDavid du Colombier"" at (6,0)
10045e96a66cSDavid du Colombier.PE
10055e96a66cSDavid du Colombier.LP
10065e96a66cSDavid du ColombierThe numbers to the right of the blocks are byte offsets
10075e96a66cSDavid du Colombierof the boundaries.
10085e96a66cSDavid du Colombier.LP
10095e96a66cSDavid du ColombierThe super block describes the file system itself and looks like:
10105e96a66cSDavid du Colombier.P1
10115e96a66cSDavid du Colombier.ta +\w'    'u +\w'                'u
10125e96a66cSDavid du ColombierSuper:
10135e96a66cSDavid du Colombier	magic[4]	\fR0x2340A3B1 (SuperMagic)\fP
10145e96a66cSDavid du Colombier	version[2]	\fR1 (SuperVersion)\fP
10155e96a66cSDavid du Colombier	epochLow[4]	\fRfile system low epoch\fP
10165e96a66cSDavid du Colombier	epochHigh[4]	\fRfile system high (active) epoch\fP
10175e96a66cSDavid du Colombier	qid[8]	\fRnext qid to allocate\fP
10185e96a66cSDavid du Colombier	active[4]	\fRdata block number: root of active file system\fP
10195e96a66cSDavid du Colombier	next[4]	\fRdata block number: root of next file system to archive\fP
10205e96a66cSDavid du Colombier	current[4]	\fRdata block number: root of file system currently being archived\fP
10215e96a66cSDavid du Colombier	last[20]	\fRVenti score of last successful archive\fP
10225e96a66cSDavid du Colombier	name[128]	\fRname of file system (just a comment)\fP
10235e96a66cSDavid du Colombier.P2
10245e96a66cSDavid du Colombier.LP
10255e96a66cSDavid du Colombier.NH 1
10265e96a66cSDavid du ColombierFossil server
10275e96a66cSDavid du Colombier.HP
10285e96a66cSDavid du ColombierThe Fossil server is a user-space program that runs on a standard Plan 9 kernel.
10295e96a66cSDavid du Colombier.NH 2
10305e96a66cSDavid du ColombierProcess structure
10315e96a66cSDavid du Colombier.PP
10325e96a66cSDavid du ColombierThe file server is structured as a set of processes synchronizing
10335e96a66cSDavid du Colombiermostly through message passing along queues.
10345e96a66cSDavid du ColombierThe processes are given names, which can be seen in the output of
10355e96a66cSDavid du Colombier.CW ps
10365e96a66cSDavid du Colombier.CW -a .
10375e96a66cSDavid du Colombier.PP
10385e96a66cSDavid du Colombier.CW Listen
10395e96a66cSDavid du Colombierprocesses announce on various network addresses.
10405e96a66cSDavid du ColombierA
10415e96a66cSDavid du Colombier.CW con
10425e96a66cSDavid du Colombierprocess handles each incoming connection, reading 9P requests
10435e96a66cSDavid du Colombierand adding them to a central message queue.
10445e96a66cSDavid du Colombier.CW Msg
10455e96a66cSDavid du Colombierprocesses remove 9P requests from the queue,
10465e96a66cSDavid du Colombierhandle them, and write the responses to the appropriate
10475e96a66cSDavid du Colombierfile descriptors.
10485e96a66cSDavid du Colombier.PP
10495e96a66cSDavid du ColombierThe
10505e96a66cSDavid du Colombier.CW disk
10515e96a66cSDavid du Colombierprocess handles disk I/O requests made by the other processes.
10525e96a66cSDavid du ColombierThe
10535e96a66cSDavid du Colombier.CW flush
10545e96a66cSDavid du Colombierprocess writes dirty blocks from the in-memory block cache to disk.
10555e96a66cSDavid du ColombierThe
10565e96a66cSDavid du Colombier.CW unlink
10575e96a66cSDavid du Colombierprocess frees previously linked blocks once the blocks that point at them
10585e96a66cSDavid du Colombierhave been written to disk.
10595e96a66cSDavid du Colombier.PP
10605e96a66cSDavid du ColombierA
10615e96a66cSDavid du Colombier.CW consI
10625e96a66cSDavid du Colombierreads from each console file (typically a pipe posted in
10635e96a66cSDavid du Colombier.CW /srv ),
10645e96a66cSDavid du Colombieradding the typed characters to the input queue.
10655e96a66cSDavid du ColombierThe
10665e96a66cSDavid du Colombier.CW cons
10675e96a66cSDavid du Colombierprocess echoes input and runs the commands, saving
10685e96a66cSDavid du Colombieroutput in a ring buffer.
10695e96a66cSDavid du ColombierBecause there is only one
10705e96a66cSDavid du Colombier.CW cons
10715e96a66cSDavid du Colombierprocess, only one console command may be executing at a time.
10725e96a66cSDavid du ColombierA
10735e96a66cSDavid du Colombier.CW consO
107425ea35e3SDavid du Colombierprocess copies this ring buffer to each console file.
10755e96a66cSDavid du Colombier.PP
10765e96a66cSDavid du ColombierThe
10775e96a66cSDavid du Colombier.CW periodic
10785e96a66cSDavid du Colombierprocess runs periodic events, like
10795e96a66cSDavid du Colombierflushing the root metadata to disk or
10805e96a66cSDavid du Colombiertaking snapshots of the file system.
10815e96a66cSDavid du Colombier.NH 2
10825e96a66cSDavid du ColombierBlock cache
10835e96a66cSDavid du Colombier.HP
10845e96a66cSDavid du ColombierFossil maintains an in-memory block cache which
10855e96a66cSDavid du Colombierholds both local disk blocks and Venti blocks.
10865e96a66cSDavid du ColombierCache eviction follows a least recently used policy.
10875e96a66cSDavid du ColombierDirty blocks are restricted to at most half the cache.
10885e96a66cSDavid du ColombierThis can be changed by editing
10895e96a66cSDavid du Colombier.CW DirtyPercentage
10905e96a66cSDavid du Colombierin
10915e96a66cSDavid du Colombier.CW dat.h .
10925e96a66cSDavid du Colombier.PP
10935e96a66cSDavid du ColombierThe block cache uses soft updates [1] to ensure that the on-disk
10945e96a66cSDavid du Colombierfile system is always self-consistent.
10955e96a66cSDavid du ColombierThus there is no
10965e96a66cSDavid du Colombier.I halt
10975e96a66cSDavid du Colombierconsole command
10985e96a66cSDavid du Colombierand no need to check a file system
10995e96a66cSDavid du Colombierthat was shut down without halting.
11005e96a66cSDavid du Colombier.NH 2
11015e96a66cSDavid du ColombierArchiving
11025e96a66cSDavid du Colombier.HP
11035e96a66cSDavid du ColombierA background process writes blocks in archival snapshots to Venti.
11045e96a66cSDavid du ColombierAlthough
11055e96a66cSDavid du Colombier.CW /archive/\fIyyyy\fP/\fImmdds\fR
11065e96a66cSDavid du Colombieris a copy of only
11075e96a66cSDavid du Colombier.CW /active
11085e96a66cSDavid du Colombierat the time of the snapshot,
11095e96a66cSDavid du Colombierthe archival process archives the
11105e96a66cSDavid du Colombierentire file tree rather than just
11115e96a66cSDavid du Colombierthe subtree rooted at
11125e96a66cSDavid du Colombier.CW /active .
11135e96a66cSDavid du ColombierThe snapshots
11145e96a66cSDavid du Colombier.CW /snapshot/\fIyyyy\fP/\fImmdd\fP/\fIhhmm
11155e96a66cSDavid du Colombierare stored as empty directories.
11165e96a66cSDavid du ColombierOnce all the blocks have been archived,
11175e96a66cSDavid du Colombiera
11185e96a66cSDavid du Colombier.CW VtRoot
11195e96a66cSDavid du Colombierheader for the file system is archived.
11205e96a66cSDavid du ColombierThe score of that header is recorded in
11215e96a66cSDavid du Colombier.CW super.score
11225e96a66cSDavid du Colombierand also printed on the file server console.
11235e96a66cSDavid du ColombierThe score can used by
11245e96a66cSDavid du Colombier.I flfmt
11255e96a66cSDavid du Colombierto restore a file system (see
11265e96a66cSDavid du Colombier.I fossil (4)).
11275e96a66cSDavid du Colombier.NH 2
11285e96a66cSDavid du ColombierContrast with the old file server
11295e96a66cSDavid du Colombier.HP
11305e96a66cSDavid du ColombierThe most obvious difference between Fossil and the
11315e96a66cSDavid du Colombierold Plan 9 file server [2] is that Fossil uses a Venti server as
11325e96a66cSDavid du Colombierits archival storage in place of a WORM juke box.
11335e96a66cSDavid du ColombierThere are a few other architectural differences to be
11345e96a66cSDavid du Colombieraware of.
11355e96a66cSDavid du Colombier.PP
11365e96a66cSDavid du ColombierFossil is a user-level program run on a standard kernel.
11375e96a66cSDavid du Colombier.PP
11385e96a66cSDavid du ColombierFossil does not have any way to concatenate, stripe, or
11395e96a66cSDavid du Colombiermirror disk files.  For functionality similar to the old file server's
11405e96a66cSDavid du Colombierconfiguration strings, use the experimental file stack device
11415e96a66cSDavid du Colombier(see
1142b748718eSDavid du Colombier.I fs (3)).
11435e96a66cSDavid du Colombier.PP
11445e96a66cSDavid du ColombierFossil speaks only 9P2000.  Old 9P (aka 9P1) is not supported.
11455e96a66cSDavid du Colombier.PP
11465e96a66cSDavid du Colombier... XXX words about converting an old file system to fossil?
11475e96a66cSDavid du Colombier.NH 1
11485e96a66cSDavid du ColombierReferences
11495e96a66cSDavid du Colombier.LP
11505e96a66cSDavid du Colombier[1] Gregory R. Ganger, Marshall Kirk McKusick, Craig A. N. Soules,
11515e96a66cSDavid du Colombierand Yale N. Patt.
11525e96a66cSDavid du Colombier``Soft Updates: A Solution to the Metadata Update Problem
11535e96a66cSDavid du Colombierin File Systems,''
11545e96a66cSDavid du Colombier.I "ACM Transactions on Computer Systems" ,
11555e96a66cSDavid du ColombierVol 18., No. 2, May 2000, pp. 127\-153.
11565e96a66cSDavid du Colombier.LP
11575e96a66cSDavid du Colombier[2] Sean Quinlan, ``A Cached WORM File System,''
11585e96a66cSDavid du Colombier.I "Software\(emPractice and Experience" ,
11595e96a66cSDavid du ColombierVol 21., No 12., December 1991, pp. 1289\-1299.
11605e96a66cSDavid du Colombier.LP
11615e96a66cSDavid du Colombier[3] Sean Quinlan and Sean Dorward, ``Venti: A New Approach to Archival Storage,''
11625e96a66cSDavid du Colombier.I "Usenix Conference on File and Storage Technologies" ,
11635e96a66cSDavid du Colombier2002.
1164