15e96a66cSDavid du Colombier... .FP times 25e96a66cSDavid du Colombier... .fp 1 R R.nomath 35e96a66cSDavid du Colombier... .fp 5 CW LucidaSansCW83 45e96a66cSDavid du Colombier.TL 55e96a66cSDavid du ColombierFossil, an Archival File Server 65e96a66cSDavid du Colombier.AU 75e96a66cSDavid du ColombierSean Quinlan 85e96a66cSDavid du ColombierJim McKie 95e96a66cSDavid du ColombierRuss Cox 105e96a66cSDavid du Colombierjmk,rsc@plan9.bell-labs.com 115e96a66cSDavid du Colombier.AB 125e96a66cSDavid du ColombierThis paper describes the internals and 135e96a66cSDavid du Colombieroperation of Fossil, an archival file server built for Plan 9. 145e96a66cSDavid du ColombierFossil has not yet replaced the current Plan 9 file server 155e96a66cSDavid du Colombierand 165e96a66cSDavid du Colombier.CW kfs , 175e96a66cSDavid du Colombierbut that is our eventual intent. 185e96a66cSDavid du ColombierBoth fossil and this documentation are 195e96a66cSDavid du Colombierworks in progress. Comments on either are most welcome. 205e96a66cSDavid du Colombier.AE 215e96a66cSDavid du Colombier.de HP 225e96a66cSDavid du Colombier.LP 235e96a66cSDavid du Colombier.. 245e96a66cSDavid du Colombier.NH 1 255e96a66cSDavid du ColombierIntroduction 265e96a66cSDavid du Colombier.HP 275e96a66cSDavid du ColombierFossil is an archival file server built for Plan 9. 285e96a66cSDavid du ColombierIn a typical configuration, it maintains a traditional file system 295e96a66cSDavid du Colombierin a local disk partition and periodically archives snapshots of the file system 305e96a66cSDavid du Colombierto a Venti server. These archives are made available through 315e96a66cSDavid du Colombiera file system interface. 325e96a66cSDavid du ColombierFossil can also be run without a Venti server, in which case the 335e96a66cSDavid du Colombiersnapshots (if any) occupy local disk space. 345e96a66cSDavid du Colombier.PP 355e96a66cSDavid du ColombierThe bulk of this paper explains the underlying data structures: 365e96a66cSDavid du ColombierVenti trees, the Venti archival file system format, and finally Fossil's 375e96a66cSDavid du Colombierfile system format. 385e96a66cSDavid du ColombierThe end of the paper discusses the architecture of the Fossil server. 395e96a66cSDavid du Colombier.PP 405e96a66cSDavid du ColombierThe presentation of the data structures is very detailed, perhaps 415e96a66cSDavid du Colombiertoo detailed for most readers. 425e96a66cSDavid du ColombierThe intent is to record all the details necessary to make structural 435e96a66cSDavid du Colombierchanges to the file system format. 445e96a66cSDavid du ColombierFeel free to jump ahead when boredom strikes. 455e96a66cSDavid du Colombier.NH 1 465e96a66cSDavid du ColombierVenti trees and directory hierarchies 475e96a66cSDavid du Colombier.HP 485e96a66cSDavid du ColombierVenti [3] is an archival block storage server. 495e96a66cSDavid du ColombierOnce a block is stored, it can be retrieved by presenting the 20-byte 505e96a66cSDavid du ColombierSHA1 hash of its contents, called a 515e96a66cSDavid du Colombier.I score . 525e96a66cSDavid du ColombierBlocks on Venti have a maximum length of about 56 kilobytes, 535e96a66cSDavid du Colombierthough in practice smaller blocks are used. 545e96a66cSDavid du ColombierTo store a byte stream of arbitrary length, Venti uses a hash tree. 555e96a66cSDavid du ColombierConceptually, the data stream is broken into fixed-size (say, 565e96a66cSDavid du Colombier.I dsize -byte) 575e96a66cSDavid du Colombierchunks, which are stored on the Venti server. 585e96a66cSDavid du ColombierThe resulting scores are concatenated into a new pointer stream, which is 595e96a66cSDavid du Colombierbroken into fixed size (say, 605e96a66cSDavid du Colombier.I psize -byte) 615e96a66cSDavid du Colombierchunks, which are stored on the Venti server. 625e96a66cSDavid du Colombier.I Psize "" ( 635e96a66cSDavid du Colombieris different from 645e96a66cSDavid du Colombier.I dsize 655e96a66cSDavid du Colombierso that we can ensure that each pointer block holds an 665e96a66cSDavid du Colombierintegral number of pointers.) 675e96a66cSDavid du ColombierThis yields a new pointer stream, and so on, until there is a single block 685e96a66cSDavid du Colombierand finally a single score describing the entire tree. 695e96a66cSDavid du ColombierThe resulting structure looks like: 705e96a66cSDavid du Colombier.PS 715e96a66cSDavid du Colombier.ps 8 725e96a66cSDavid du Colombier.vs 10 735e96a66cSDavid du Colombierboxht=0.1 745e96a66cSDavid du Colombierboxwid=0.1 755e96a66cSDavid du Colombier 765e96a66cSDavid du ColombierB0: box invis wid 1 "\f(CWVtDataType\fP" 775e96a66cSDavid du Colombiermove right 0.1 785e96a66cSDavid du ColombierL0a: box wid 0.2 795e96a66cSDavid du Colombiermove right 0.1 805e96a66cSDavid du ColombierL0b: box wid 0.2 815e96a66cSDavid du Colombiermove right 0.1 825e96a66cSDavid du ColombierL0c: box invis wid 0.2 "..." 835e96a66cSDavid du Colombiermove right 0.1 845e96a66cSDavid du Colombier 855e96a66cSDavid du ColombierL0d: box wid 0.2 865e96a66cSDavid du Colombiermove right 0.1 875e96a66cSDavid du ColombierL0e: box wid 0.2 885e96a66cSDavid du Colombiermove right 0.2 895e96a66cSDavid du ColombierL0f: box invis wid 0.2 "..." 905e96a66cSDavid du Colombiermove right 0.2 915e96a66cSDavid du Colombier 925e96a66cSDavid du ColombierL0g: box wid 0.2 935e96a66cSDavid du Colombiermove right 0.1 945e96a66cSDavid du ColombierL0h: box wid 0.2 955e96a66cSDavid du Colombiermove right 0.1 965e96a66cSDavid du ColombierL0i: box invis wid 0.2 "..." 975e96a66cSDavid du Colombiermove right 0.1 985e96a66cSDavid du Colombier 995e96a66cSDavid du ColombierL0j: box wid 0.2 1005e96a66cSDavid du Colombiermove right 0.1 1015e96a66cSDavid du ColombierL0k: box wid 0.2 1025e96a66cSDavid du Colombiermove right 0.1 1035e96a66cSDavid du ColombierL0l: box invis wid 0.2 "..." 1045e96a66cSDavid du Colombiermove right 0.1 1055e96a66cSDavid du ColombierL0m: box wid 0.2 1065e96a66cSDavid du Colombier 1075e96a66cSDavid du Colombierdefine boxppddd { 1085e96a66cSDavid du Colombier line from 0.2<$1.nw,$1.ne> to 0.2<$1.sw,$1.se> 1095e96a66cSDavid du Colombier line from 0.4<$1.nw,$1.ne> to 0.4<$1.sw,$1.se> 1105e96a66cSDavid du Colombier X: box invis at 0.1<$1.nw,$1.ne> 1115e96a66cSDavid du Colombier Y: box invis at 0.1<$1.sw,$1.se> 1125e96a66cSDavid du Colombier line -> from 0.5<X,Y> to $2.nw 1135e96a66cSDavid du Colombier X: box invis at 0.3<$1.nw,$1.ne> 1145e96a66cSDavid du Colombier Y: box invis at 0.3<$1.sw,$1.se> 1155e96a66cSDavid du Colombier line -> from 0.5<X,Y> to $3.nw 1165e96a66cSDavid du Colombier "..." at 0.7<$1.w,$1.e> 1175e96a66cSDavid du Colombier} 1185e96a66cSDavid du Colombier 1195e96a66cSDavid du Colombierdefine boxppdddp { 1205e96a66cSDavid du Colombier line from 0.2<$1.nw,$1.ne> to 0.2<$1.sw,$1.se> 1215e96a66cSDavid du Colombier line from 0.4<$1.nw,$1.ne> to 0.4<$1.sw,$1.se> 1225e96a66cSDavid du Colombier line from 0.8<$1.nw,$1.ne> to 0.8<$1.sw,$1.se> 1235e96a66cSDavid du Colombier X: box invis at 0.1<$1.nw,$1.ne> 1245e96a66cSDavid du Colombier Y: box invis at 0.1<$1.sw,$1.se> 1255e96a66cSDavid du Colombier line -> from 0.5<X,Y> to $2.nw 1265e96a66cSDavid du Colombier X: box invis at 0.3<$1.nw,$1.ne> 1275e96a66cSDavid du Colombier Y: box invis at 0.3<$1.sw,$1.se> 1285e96a66cSDavid du Colombier line -> from 0.5<X,Y> to $3.nw 1295e96a66cSDavid du Colombier "..." at 0.6<$1.w,$1.e> 1305e96a66cSDavid du Colombier X: box invis at 0.9<$1.nw,$1.ne> 1315e96a66cSDavid du Colombier Y: box invis at 0.9<$1.sw,$1.se> 1325e96a66cSDavid du Colombier line -> from 0.5<X,Y> to $4.nw 1335e96a66cSDavid du Colombier} 1345e96a66cSDavid du Colombier 1355e96a66cSDavid du Colombierdefine boxpdddp { 1365e96a66cSDavid du Colombier line from 0.2<$1.nw,$1.ne> to 0.2<$1.sw,$1.se> 1375e96a66cSDavid du Colombier line from 0.8<$1.nw,$1.ne> to 0.8<$1.sw,$1.se> 1385e96a66cSDavid du Colombier X: box invis at 0.1<$1.nw,$1.ne> 1395e96a66cSDavid du Colombier Y: box invis at 0.1<$1.sw,$1.se> 1405e96a66cSDavid du Colombier line -> from 0.5<X,Y> to $2.nw 1415e96a66cSDavid du Colombier "..." at 0.5<$1.w,$1.e> 1425e96a66cSDavid du Colombier X: box invis at 0.9<$1.nw,$1.ne> 1435e96a66cSDavid du Colombier Y: box invis at 0.9<$1.sw,$1.se> 1445e96a66cSDavid du Colombier line -> from 0.5<X,Y> to $3.nw 1455e96a66cSDavid du Colombier} 1465e96a66cSDavid du Colombier 1475e96a66cSDavid du Colombierbhd=0.4 1485e96a66cSDavid du ColombierL1abc: box wid 0.5 at 0.5<L0a, L0b>+(0,bhd) 1495e96a66cSDavid du Colombierboxppddd(L1abc, L0a, L0b) 1505e96a66cSDavid du ColombierL1def: box wid 0.5 at 0.5<L0d, L0e>+(0,bhd) 1515e96a66cSDavid du Colombierboxppddd(L1def, L0d, L0e) 1525e96a66cSDavid du ColombierL1ghi: box wid 0.5 at 0.5<L0g, L0h>+(0,bhd) 1535e96a66cSDavid du Colombierboxppddd(L1ghi, L0g, L0h) 1545e96a66cSDavid du ColombierL1jklm: box wid 0.5 at 0.5<L0j, L0k>+(0,bhd) 1555e96a66cSDavid du Colombierboxppdddp(L1jklm, L0j, L0k, L0m) 1565e96a66cSDavid du ColombierB1: box invis wid 1 "\f(CWVtPointerType0\fP" at B0+(0,bhd) 1575e96a66cSDavid du Colombier 1585e96a66cSDavid du ColombierL2abcdef: box wid 0.5 at 0.5<L1abc,L1def>+(0,bhd) 1595e96a66cSDavid du Colombierboxppddd(L2abcdef, L1abc, L1def) 1605e96a66cSDavid du ColombierL2ghijklm: box wid 0.5 at 0.5<L1ghi,L1jklm>+(0,bhd) 1615e96a66cSDavid du Colombierboxpdddp(L2ghijklm, L1ghi, L1jklm) 1625e96a66cSDavid du ColombierB2: box invis wid 1 "\f(CWVtPointerType1\fP" at B1+(0,bhd) 1635e96a66cSDavid du Colombier 1645e96a66cSDavid du ColombierL3atom: box wid 0.5 at 0.5<L2abcdef, L2ghijklm>+(0,bhd) 1655e96a66cSDavid du Colombierboxpdddp(L3atom, L2abcdef, L2ghijklm) 1665e96a66cSDavid du ColombierB3: box invis wid 1 "\f(CWVtPointerType2\fP" at B2+(0,bhd) 1675e96a66cSDavid du Colombier.PE 1685e96a66cSDavid du Colombier.LP 1695e96a66cSDavid du ColombierThe leaves are the original data stream. Those blocks have type 1705e96a66cSDavid du Colombier.CW VtDataType . 1715e96a66cSDavid du ColombierThe first pointer stream has type 1725e96a66cSDavid du Colombier.CW VtPointerType0 , 1735e96a66cSDavid du Colombierthe next has type 1745e96a66cSDavid du Colombier.CW VtPointerType1 , 1755e96a66cSDavid du Colombierand so on. 1765e96a66cSDavid du ColombierThe figure ends with a single block of type 1775e96a66cSDavid du Colombier.CW VtPointerType2 , 1785e96a66cSDavid du Colombierbut in general trees can have height up to 1795e96a66cSDavid du Colombier.CW VtPointerType6 . 1805e96a66cSDavid du ColombierFor a 1815e96a66cSDavid du Colombier.I dsize 1825e96a66cSDavid du Colombierof 8192 bytes 1835e96a66cSDavid du Colombierand 1845e96a66cSDavid du Colombier.I psize 1855e96a66cSDavid du Colombierof 8180 bytes (409 pointers), 1865e96a66cSDavid du Colombierthis gives a maximum stream size of approximately 10 zettabytes 1875e96a66cSDavid du Colombier(2\s-2\u73\d\s+2 or 10\s-2\u22\d\s+2 bytes). 1885e96a66cSDavid du Colombier.PP 189*25ea35e3SDavid du ColombierData blocks are truncated to remove trailing runs of zeros before 1905e96a66cSDavid du Colombierstorage to Venti; they are zero-filled back to 1915e96a66cSDavid du Colombier.I dsize 1925e96a66cSDavid du Colombierbytes after retrieval from Venti. 193*25ea35e3SDavid du ColombierSimilarly, trailing runs of pointers to zero-length blocks are 1945e96a66cSDavid du Colombierremoved from and added back to pointer blocks. 1955e96a66cSDavid du ColombierThese simple rules happen to make it particularly efficient to store 1965e96a66cSDavid du Colombierlarge runs of zeros, as might occur in a data stream with ``holes:'' 1975e96a66cSDavid du Colombierthe zero-length block itself can be interpreted as a tree of any 1985e96a66cSDavid du Colombierdepth encoding an all-zero data stream. 1995e96a66cSDavid du Colombier.PP 2005e96a66cSDavid du ColombierReconstructing the data stream requires the score and type of the 2015e96a66cSDavid du Colombiertopmost block in the tree, the data chunk size, the pointer chunk size, 2025e96a66cSDavid du Colombierand the data stream size. 2035e96a66cSDavid du Colombier(From the data stream size and the chunk sizes we could derive the 2045e96a66cSDavid du Colombierdepth of the tree and thus the type of the topmost block, but it is convenient 2055e96a66cSDavid du Colombierto allow trees that are deeper than necessary.) 2065e96a66cSDavid du ColombierThis information is kept in a 40-byte structure called a 2075e96a66cSDavid du Colombier.CW VtEntry : 2085e96a66cSDavid du Colombier.P1 2095e96a66cSDavid du ColombierVtEntry: 2105e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 2115e96a66cSDavid du Colombier gen[4] \fRgeneration number\fP 2125e96a66cSDavid du Colombier psize[2] \fRsize of pointer blocks\fP 2135e96a66cSDavid du Colombier dsize[2] \fRsize of data blocks\fP 2145e96a66cSDavid du Colombier flags[1] 2155e96a66cSDavid du Colombier zero[5] 2165e96a66cSDavid du Colombier size[6] \fRlength of file\fP 2175e96a66cSDavid du Colombier score[20] \fRscore of root block in tree\fP 2185e96a66cSDavid du Colombier.P2 2195e96a66cSDavid du Colombier(In this notation, 2205e96a66cSDavid du Colombier.CW name[sz] 2215e96a66cSDavid du Colombierindicates a 2225e96a66cSDavid du Colombier.CW sz -byte 2235e96a66cSDavid du Colombierfield called 2245e96a66cSDavid du Colombier.CW name . 2255e96a66cSDavid du ColombierIntegers are stored in big-endian order. 2265e96a66cSDavid du Colombier.CW Size 2275e96a66cSDavid du Colombierreally is a 48-bit field.) 2285e96a66cSDavid du Colombier.CW Flags 2295e96a66cSDavid du Colombieris made up of the following bit fields. 2305e96a66cSDavid du Colombier.P1 2315e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 2325e96a66cSDavid du Colombier0x01 VtEntryActive \fRentry is allocated\fP 2335e96a66cSDavid du Colombier0x02 VtEntryDir \fRentry describes a Venti directory (q.v.)\fP 2345e96a66cSDavid du Colombier0x1C VtEntryDepthMask \fRmask for tree depth\fP 2355e96a66cSDavid du Colombier0x20 VtEntryLocal \fRreserved (q.v.)\fP 2365e96a66cSDavid du Colombier.P2 2375e96a66cSDavid du Colombier.LP 2385e96a66cSDavid du ColombierThe depth of the described tree is stored in the 5 bits indicated: 2395e96a66cSDavid du Colombiera tree with a topmost node of type 2405e96a66cSDavid du Colombier.CW VtPointerType3 2415e96a66cSDavid du Colombierhas depth 4. 2425e96a66cSDavid du Colombier.PP 2435e96a66cSDavid du ColombierWith 2445e96a66cSDavid du Colombier.CW VtEntry 2455e96a66cSDavid du Colombierwe can build more complicated data structures, 2465e96a66cSDavid du Colombierones with multiple or nested data streams. 2475e96a66cSDavid du ColombierA data stream consisting of 2485e96a66cSDavid du Colombier.CW VtEntry 2495e96a66cSDavid du Colombierstructures is called a Venti directory. 2505e96a66cSDavid du ColombierIt is identical in structure to the Venti data stream 2515e96a66cSDavid du Colombierwe described earlier except that the bottom-level type is 2525e96a66cSDavid du Colombier.CW VtDirType , 2535e96a66cSDavid du Colombierand 2545e96a66cSDavid du Colombierthe 2555e96a66cSDavid du Colombier.CW VtEntry 2565e96a66cSDavid du Colombierdescribing a Venti directory has the 2575e96a66cSDavid du Colombier.CW VtEntryDir 2585e96a66cSDavid du Colombierflag bit set. 2595e96a66cSDavid du ColombierThe 2605e96a66cSDavid du Colombier.I dsize 2615e96a66cSDavid du Colombierfor a Venti directory 2625e96a66cSDavid du Colombieris a multiple of 40 so that each data chunk holds 2635e96a66cSDavid du Colombieran integer number of 2645e96a66cSDavid du Colombier.CW VtEntry 2655e96a66cSDavid du Colombierstructures. 2665e96a66cSDavid du ColombierBy analogy with Venti directories, 2675e96a66cSDavid du Colombierwe call the original data stream a 2685e96a66cSDavid du ColombierVenti file. 2695e96a66cSDavid du ColombierNote that Venti files are assumed 2705e96a66cSDavid du Colombier.I not 2715e96a66cSDavid du Colombierto contain pointers to other Venti blocks. 2725e96a66cSDavid du ColombierThe only pointers to Venti blocks occur in 2735e96a66cSDavid du Colombier.CW VtEntry 2745e96a66cSDavid du Colombierstructures in 2755e96a66cSDavid du ColombierVenti directories 2765e96a66cSDavid du Colombier(and in the internal hash tree structure of the 2775e96a66cSDavid du Colombierindividual files and directories). 2785e96a66cSDavid du ColombierNote also that these directories are nothing more than pointer lists. 2795e96a66cSDavid du ColombierIn particular, there are no names or metadata like in a file system. 2805e96a66cSDavid du Colombier.PP 2815e96a66cSDavid du ColombierTo make it easier to pass hierarchies between applications, 2825e96a66cSDavid du Colombierthe root of a hierarchy is described in a 300-byte structure 2835e96a66cSDavid du Colombiercalled a 2845e96a66cSDavid du Colombier.CW VtRoot : 2855e96a66cSDavid du Colombier.P1 2865e96a66cSDavid du ColombierVtRoot: 2875e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 2885e96a66cSDavid du Colombier version[2] \f(CW2\fP 2895e96a66cSDavid du Colombier name[128] \fRname of structure (just a comment)\fP 2905e96a66cSDavid du Colombier type[128] \fRstring describing structure (\f(CWvac\fR)\f(CW 2915e96a66cSDavid du Colombier score[20] \fRpointer to \f(CWVtDirType\fP block\f(CW 2925e96a66cSDavid du Colombier blockSize[2] \fRmaximum block size in structure\fP 2935e96a66cSDavid du Colombier prev[20] \fRprevious \f(CWVtRoot\fP in chain, if any\fP 2945e96a66cSDavid du Colombier.P2 2955e96a66cSDavid du Colombier.LP 2965e96a66cSDavid du ColombierThis structure is stored to Venti and its score is passed 2975e96a66cSDavid du Colombierbetween applications, typically in the form 2985e96a66cSDavid du Colombier``\fItype\f(CW:\fIrootscore\fR,'' 2995e96a66cSDavid du Colombierwhere 3005e96a66cSDavid du Colombier.I type 3015e96a66cSDavid du Colombieris the type field from the 3025e96a66cSDavid du Colombier.CW VtRoot 3035e96a66cSDavid du Colombierstructure, and 3045e96a66cSDavid du Colombier.I rootscore 3055e96a66cSDavid du Colombieris the score of the 3065e96a66cSDavid du Colombier.CW VtRoot 3075e96a66cSDavid du Colombierblock. 3085e96a66cSDavid du Colombier.CW VtRoot 3095e96a66cSDavid du Colombierstructures can be chained together using the 3105e96a66cSDavid du Colombier.I prev 3115e96a66cSDavid du Colombierfield to encode an archival history 3125e96a66cSDavid du Colombierof the data structure. 3135e96a66cSDavid du Colombier.PP 3145e96a66cSDavid du ColombierFor example, a small Venti hierarchy might look like: 3155e96a66cSDavid du Colombier.PS 3165e96a66cSDavid du Colombier.ps 8 3175e96a66cSDavid du Colombier.vs 10 3185e96a66cSDavid du Colombierboxwid=0.1 3195e96a66cSDavid du Colombierboxht=0.1 3205e96a66cSDavid du Colombierf=0.9 3215e96a66cSDavid du Colombiermb=0.16 3225e96a66cSDavid du Colombier 3235e96a66cSDavid du ColombierVtRoot: [ 3245e96a66cSDavid du Colombier right 3255e96a66cSDavid du Colombier B1: box 3265e96a66cSDavid du Colombier move right 0.1 3275e96a66cSDavid du Colombier "\f(CWVtRoot\fP" ljust 3285e96a66cSDavid du Colombier] 3295e96a66cSDavid du Colombier 3305e96a66cSDavid du ColombierRoot: [ 3315e96a66cSDavid du Colombier right 3325e96a66cSDavid du Colombier B1: box fill f 3335e96a66cSDavid du Colombier B2: box fill f 3345e96a66cSDavid du Colombier B3: box fill f 3355e96a66cSDavid du Colombier move right 0.1 3365e96a66cSDavid du Colombier] with .nw at VtRoot.sw+(0.2,-.1) 3375e96a66cSDavid du ColombierLevel1: [ 3385e96a66cSDavid du Colombier RootMeta: [ 3395e96a66cSDavid du Colombier box wid mb 3405e96a66cSDavid du Colombier ] 3415e96a66cSDavid du Colombier MetaSource: [ 3425e96a66cSDavid du Colombier right 3435e96a66cSDavid du Colombier B1: box wid 5*mb 3445e96a66cSDavid du Colombier ] with .nw at RootMeta.sw+(0,-.1) 3455e96a66cSDavid du Colombier 3465e96a66cSDavid du Colombier Source: [ 3475e96a66cSDavid du Colombier right 3485e96a66cSDavid du Colombier B1: box fill f 3495e96a66cSDavid du Colombier B2: box fill f 3505e96a66cSDavid du Colombier B3: box fill f 3515e96a66cSDavid du Colombier B4: box fill f 3525e96a66cSDavid du Colombier B5: box fill f 3535e96a66cSDavid du Colombier B6: box fill f 3545e96a66cSDavid du Colombier B7: box fill f 3555e96a66cSDavid du Colombier B8: box fill f 3565e96a66cSDavid du Colombier ] with .nw at MetaSource.sw+(0,-.1) 3575e96a66cSDavid du Colombier SB1: box invis at Source.B1 3585e96a66cSDavid du Colombier SB2: box invis at Source.B2 3595e96a66cSDavid du Colombier SB3: box invis at Source.B3 3605e96a66cSDavid du Colombier] with .nw at Root.sw+(0.4,-.1) 3615e96a66cSDavid du ColombierLevel2: [ 3625e96a66cSDavid du Colombier MetaSource: [ 3635e96a66cSDavid du Colombier right 3645e96a66cSDavid du Colombier B1: box wid 5*mb 3655e96a66cSDavid du Colombier ] 3665e96a66cSDavid du Colombier Source: [ 3675e96a66cSDavid du Colombier right 3685e96a66cSDavid du Colombier B1: box fill f 3695e96a66cSDavid du Colombier B2: box fill f 3705e96a66cSDavid du Colombier B3: box fill f 3715e96a66cSDavid du Colombier B4: box fill f 3725e96a66cSDavid du Colombier B5: box fill f 3735e96a66cSDavid du Colombier B6: box fill f 3745e96a66cSDavid du Colombier B7: box fill f 3755e96a66cSDavid du Colombier B8: box fill f 3765e96a66cSDavid du Colombier ] with .nw at MetaSource.sw+(0,-.1) 3775e96a66cSDavid du Colombier File: box wid 0.8 with .nw at Source.sw+(0,-.1) 3785e96a66cSDavid du Colombier] with .nw at Level1.sw+(0.6,-.1) 3795e96a66cSDavid du Colombier 3805e96a66cSDavid du Colombierline -> from VtRoot.B1 down boxwid/2+0.1+boxwid/2 then to Root.w 3815e96a66cSDavid du Colombierline -> from Root.B3 down boxwid/2+0.1+boxwid/2 then to Level1.RootMeta.w 3825e96a66cSDavid du Colombierline -> from Root.B2 down boxwid/2+0.1+boxwid+0.1+boxwid/2 then to Level1.MetaSource.w 3835e96a66cSDavid du Colombierline -> from Root.B1 down boxwid/2+0.1+boxwid+0.1+boxwid+0.1+boxwid/2 then to Level1.Source.w 3845e96a66cSDavid du Colombier 3855e96a66cSDavid du Colombierline -> from Level1.SB3 down boxwid/2+0.1+boxwid/2 then to Level2.MetaSource.w 3865e96a66cSDavid du Colombierline -> from Level1.SB2 down boxwid/2+0.1+boxwid+0.1+boxwid/2 then to Level2.Source.w 3875e96a66cSDavid du Colombierline -> from Level1.SB1 down boxwid/2+0.1+boxwid+0.1+boxwid+0.1+boxwid/2 then to Level2.File.w 3885e96a66cSDavid du Colombier 3895e96a66cSDavid du Colombier[ 3905e96a66cSDavid du Colombier KEY: box wid 1.5 invis "Key" 3915e96a66cSDavid du Colombier line from KEY.sw to KEY.se 3925e96a66cSDavid du Colombier k = -.1 3935e96a66cSDavid du Colombier kk=0.5 3945e96a66cSDavid du Colombier A: [ 3955e96a66cSDavid du Colombier box wid 4*boxwid 3965e96a66cSDavid du Colombier "Venti file" ljust with .w at last box .w+(kk,0) 3975e96a66cSDavid du Colombier ] with .nw at KEY.sw+(0,2*k) 3985e96a66cSDavid du Colombier B: [ 3995e96a66cSDavid du Colombier box fill f 4005e96a66cSDavid du Colombier "Venti entry (\f(CWVtEntry\fP)" ljust with .w at last box .w+(kk,0) 4015e96a66cSDavid du Colombier ] with .nw at A.sw+(0,k) 4025e96a66cSDavid du Colombier C: [ 4035e96a66cSDavid du Colombier right 4045e96a66cSDavid du Colombier CC: box fill f 4055e96a66cSDavid du Colombier box fill f 4065e96a66cSDavid du Colombier box fill f 4075e96a66cSDavid du Colombier box fill f 4085e96a66cSDavid du Colombier "Venti directory" ljust with .w at CC.w+(kk,0) 4095e96a66cSDavid du Colombier ] with .nw at B.sw+(0,k) 4105e96a66cSDavid du Colombier D: [ 4115e96a66cSDavid du Colombier line -> right 3*boxwid 4125e96a66cSDavid du Colombier "Venti pointer (score)" ljust with .w at last line .w+(kk, 0) 4135e96a66cSDavid du Colombier ] with .nw at C.sw+(0,k) 4145e96a66cSDavid du Colombier] with .nw at VtRoot.nw+(3,0) 4155e96a66cSDavid du Colombier.PE 4165e96a66cSDavid du Colombier.LP 4175e96a66cSDavid du ColombierVenti files are shown as white boxes, while directories are shown 4185e96a66cSDavid du Colombieras shaded boxes. Each shaded square represents a 4195e96a66cSDavid du Colombier.CW VtEntry . 4205e96a66cSDavid du ColombierArrows represent pointers from 4215e96a66cSDavid du Colombier.CW VtEntry 4225e96a66cSDavid du Colombierstructures to other 4235e96a66cSDavid du ColombierVenti files or directories. 4245e96a66cSDavid du Colombier.PP 4255e96a66cSDavid du ColombierThe hierarchical structure provided by Venti files and directories 4265e96a66cSDavid du Colombiercan be used as the base for more complicated data structures. 4275e96a66cSDavid du ColombierBecause this structure captures all the information 4285e96a66cSDavid du Colombierabout pointers to other blocks, tools written to traverse 4295e96a66cSDavid du ColombierVenti hierarchies can traverse the more complicated 4305e96a66cSDavid du Colombierdata structures as well. 4315e96a66cSDavid du ColombierFor example, 4325e96a66cSDavid du Colombier.I venti/copy 4335e96a66cSDavid du Colombier(see 4345e96a66cSDavid du Colombier.I ventiaux (8)) 4355e96a66cSDavid du Colombiercopies a Venti hierarchy from one Venti server to another, 4365e96a66cSDavid du Colombiergiven the root 4375e96a66cSDavid du Colombier.CW VtEntry . 4385e96a66cSDavid du ColombierBecause the traditional file system described in later sections is 4395e96a66cSDavid du Colombierlayered on a Venti hierarchy, 4405e96a66cSDavid du Colombier.I venti/copy 4415e96a66cSDavid du Colombiercan copy it without fully understanding its structure. 4425e96a66cSDavid du Colombier.NH 1 4435e96a66cSDavid du ColombierVac file system format 4445e96a66cSDavid du Colombier.HP 4455e96a66cSDavid du ColombierThe Venti archive format 4465e96a66cSDavid du Colombier.I vac 4475e96a66cSDavid du Colombierbuilds a traditional file system using a Venti hierarchy. 4485e96a66cSDavid du ColombierEach vac file is implemented as a Venti file; 4495e96a66cSDavid du Colombiereach vac directory is implemented as a Venti 4505e96a66cSDavid du Colombierdirectory and a Venti file to provide traditional file system metadata. 4515e96a66cSDavid du ColombierThe metadata is stored in a structure called a 4525e96a66cSDavid du Colombier.CW DirEntry : 4535e96a66cSDavid du Colombier.P1 4545e96a66cSDavid du ColombierDirEntry: 4555e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 4565e96a66cSDavid du Colombier magic[4] \f(CW0x1c4d9072\fP (DirMagic)\fP 4575e96a66cSDavid du Colombier version[2] \f(CW9\fP 4585e96a66cSDavid du Colombier elem[s] \fRname (final path element only)\fP 4595e96a66cSDavid du Colombier entry[4] \fRentry number for Venti file or directory\fP 4605e96a66cSDavid du Colombier gen[4] \fRgeneration number\fP 4615e96a66cSDavid du Colombier mentry[4] \fRentry number for Venti file holding metadata\fP 4625e96a66cSDavid du Colombier mgen[4] \fRgeneration number\fP 4635e96a66cSDavid du Colombier qid[8] \fRunique file serial number\fP 4645e96a66cSDavid du Colombier uid[s] \fRowner\fP 4655e96a66cSDavid du Colombier gid[s] \fRgroup\fP 4665e96a66cSDavid du Colombier mid[s] \fRlast modified by\fP 4675e96a66cSDavid du Colombier mtime[4] \fRlast modification time\fP 4685e96a66cSDavid du Colombier ctime[4] \fRcreation time\fP 4695e96a66cSDavid du Colombier atime[4] \fRlast access time\fP 4705e96a66cSDavid du Colombier mode[4] \fRmode bits\fP 4715e96a66cSDavid du Colombier.P2 4725e96a66cSDavid du ColombierThe notation 4735e96a66cSDavid du Colombier.CW name[s] 4745e96a66cSDavid du Colombierdenotes a string stored as a two-byte length 4755e96a66cSDavid du Colombierand then that many bytes. 4765e96a66cSDavid du ColombierThe above describes Version 9 of the 4775e96a66cSDavid du Colombier.CW DirEntry 4785e96a66cSDavid du Colombierformat. Versions 7 and 8 are very similar; they can be 4795e96a66cSDavid du Colombierread by the current 4805e96a66cSDavid du Colombier.I vac 4815e96a66cSDavid du Colombiersource code but are not written. 4825e96a66cSDavid du ColombierEarlier versions were not widespread. 4835e96a66cSDavid du ColombierA 4845e96a66cSDavid du Colombier.CW DirEntry 4855e96a66cSDavid du Colombiermay be followed by optional extension sections, though none 4865e96a66cSDavid du Colombierare currently used. 4875e96a66cSDavid du ColombierThe 4885e96a66cSDavid du Colombier.CW mode 4895e96a66cSDavid du Colombierbits include bits commonly used by 4905e96a66cSDavid du ColombierUnix and Windows, in addition to those used by Plan 9. 4915e96a66cSDavid du Colombier.PP 4925e96a66cSDavid du ColombierThe 4935e96a66cSDavid du Colombier.CW entry 4945e96a66cSDavid du Colombierfield is an index into the parallel Venti directory. 4955e96a66cSDavid du ColombierThe 4965e96a66cSDavid du Colombier.CW gen 4975e96a66cSDavid du Colombierfield must match the 4985e96a66cSDavid du Colombier.CW gen 4995e96a66cSDavid du Colombierfield in the corresponding 5005e96a66cSDavid du Colombier.CW VtEntry 5015e96a66cSDavid du Colombierin the directory; 5025e96a66cSDavid du Colombierit is used to detect 5035e96a66cSDavid du Colombierstale indices. 5045e96a66cSDavid du ColombierSimilarly, 5055e96a66cSDavid du Colombier.CW mentry 5065e96a66cSDavid du Colombierand 5075e96a66cSDavid du Colombier.CW mgen 5085e96a66cSDavid du Colombierare the index and generation number 5095e96a66cSDavid du Colombierfor the metadata Venti file, 5105e96a66cSDavid du Colombierif the 5115e96a66cSDavid du Colombier.CW DirEntry 5125e96a66cSDavid du Colombierdescribes a vac directory. 5135e96a66cSDavid du Colombier.PP 5145e96a66cSDavid du ColombierThe relation between Venti files and directories and 5155e96a66cSDavid du Colombiervac files and directories can be seen in this figure: 5165e96a66cSDavid du Colombier.PS 5175e96a66cSDavid du Colombier.ps 8 5185e96a66cSDavid du Colombier.vs 10 5195e96a66cSDavid du Colombierboxwid=0.1 5205e96a66cSDavid du Colombierboxht=0.1 5215e96a66cSDavid du Colombierf=0.9 5225e96a66cSDavid du Colombiermb=0.16 5235e96a66cSDavid du Colombier 5245e96a66cSDavid du ColombierVtRoot: [ 5255e96a66cSDavid du Colombier right 5265e96a66cSDavid du Colombier B1: box 5275e96a66cSDavid du Colombier move right 0.1 5285e96a66cSDavid du Colombier "\f(CWVtRoot\fP" ljust 5295e96a66cSDavid du Colombier] 5305e96a66cSDavid du Colombier 5315e96a66cSDavid du ColombierSuperRoot: [ 5325e96a66cSDavid du Colombier right 5335e96a66cSDavid du Colombier B1: box fill f 5345e96a66cSDavid du Colombier move right 0.1 5355e96a66cSDavid du Colombier "fs root block" ljust 5365e96a66cSDavid du Colombier] with .nw at VtRoot.sw + (0.2, -.2) 5375e96a66cSDavid du ColombierRoot: [ 5385e96a66cSDavid du Colombier right 5395e96a66cSDavid du Colombier B1: box fill f 5405e96a66cSDavid du Colombier B2: box fill f 5415e96a66cSDavid du Colombier B3: box fill f 5425e96a66cSDavid du Colombier move right 0.1 5435e96a66cSDavid du Colombier "root directory info block" ljust 5445e96a66cSDavid du Colombier] with .nw at SuperRoot.sw+(0.2, -.2) 5455e96a66cSDavid du ColombierLevel1: [ 5465e96a66cSDavid du Colombier RootMeta: [ 5475e96a66cSDavid du Colombier box wid mb 5485e96a66cSDavid du Colombier move right 0.1 5495e96a66cSDavid du Colombier "root metadata" ljust 5505e96a66cSDavid du Colombier ] 5515e96a66cSDavid du Colombier MetaSource: [ 5525e96a66cSDavid du Colombier right 5535e96a66cSDavid du Colombier B1: box wid mb 5545e96a66cSDavid du Colombier B2: box wid mb 5555e96a66cSDavid du Colombier B3: box wid mb 5565e96a66cSDavid du Colombier B4: box wid mb 5575e96a66cSDavid du Colombier B5: box wid mb 5585e96a66cSDavid du Colombier ] with .nw at RootMeta.sw+(0,-.2) 5595e96a66cSDavid du Colombier MB1: box wid mb invis at MetaSource.B1 5605e96a66cSDavid du Colombier MB2: box wid mb invis at MetaSource.B2 5615e96a66cSDavid du Colombier MB3: box wid mb invis at MetaSource.B3 5625e96a66cSDavid du Colombier MB4: box wid mb invis at MetaSource.B4 5635e96a66cSDavid du Colombier MB5: box wid mb invis at MetaSource.B5 5645e96a66cSDavid du Colombier 5655e96a66cSDavid du Colombier Source: [ 5665e96a66cSDavid du Colombier right 5675e96a66cSDavid du Colombier B1: box fill f 5685e96a66cSDavid du Colombier B2: box fill f 5695e96a66cSDavid du Colombier B3: box fill f 5705e96a66cSDavid du Colombier B4: box fill f 5715e96a66cSDavid du Colombier B5: box fill f 5725e96a66cSDavid du Colombier B6: box fill f 5735e96a66cSDavid du Colombier B7: box fill f 5745e96a66cSDavid du Colombier B8: box fill f 5755e96a66cSDavid du Colombier ] with .nw at MetaSource.sw+(0,-.1) 5765e96a66cSDavid du Colombier SB1: box invis at Source.B1 5775e96a66cSDavid du Colombier SB2: box invis at Source.B2 5785e96a66cSDavid du Colombier SB3: box invis at Source.B3 5795e96a66cSDavid du Colombier SB4: box invis at Source.B4 5805e96a66cSDavid du Colombier SB5: box invis at Source.B5 5815e96a66cSDavid du Colombier SB6: box invis at Source.B6 5825e96a66cSDavid du Colombier SB7: box invis at Source.B7 5835e96a66cSDavid du Colombier SB8: box invis at Source.B8 5845e96a66cSDavid du Colombier] with .nw at Root.sw+(0.4,-.2) 5855e96a66cSDavid du ColombierLevel2: [ 5865e96a66cSDavid du Colombier MetaSource: [ 5875e96a66cSDavid du Colombier right 5885e96a66cSDavid du Colombier B1: box wid mb 5895e96a66cSDavid du Colombier B2: box wid mb 5905e96a66cSDavid du Colombier B3: box wid mb 5915e96a66cSDavid du Colombier B4: box wid mb 5925e96a66cSDavid du Colombier B5: box wid mb 5935e96a66cSDavid du Colombier ] 5945e96a66cSDavid du Colombier Source: [ 5955e96a66cSDavid du Colombier right 5965e96a66cSDavid du Colombier B1: box fill f 5975e96a66cSDavid du Colombier B2: box fill f 5985e96a66cSDavid du Colombier B3: box fill f 5995e96a66cSDavid du Colombier B4: box fill f 6005e96a66cSDavid du Colombier B5: box fill f 6015e96a66cSDavid du Colombier B6: box fill f 6025e96a66cSDavid du Colombier B7: box fill f 6035e96a66cSDavid du Colombier B8: box fill f 6045e96a66cSDavid du Colombier ] with .nw at MetaSource.sw+(0,-.1) 6055e96a66cSDavid du Colombier File: box wid 0.8 with .nw at Source.sw+(0,-.2) 6065e96a66cSDavid du Colombier] with .nw at Level1.sw+(0.6,-.2) 6075e96a66cSDavid du Colombier 6085e96a66cSDavid du Colombierline -> from VtRoot.B1 down boxwid/2+0.2+boxwid/2 then to SuperRoot.w 6095e96a66cSDavid du Colombierline -> from SuperRoot.B1 down boxwid/2+0.2+boxwid/2 then to Root.w 6105e96a66cSDavid du Colombierline -> from Root.B3 down boxwid/2+0.2+boxwid/2 then to Level1.RootMeta.w 6115e96a66cSDavid du Colombierline -> from Root.B2 down boxwid/2+0.2+boxwid+0.2+boxwid/2 then to Level1.MetaSource.w 6125e96a66cSDavid du Colombierline -> from Root.B1 down boxwid/2+0.2+boxwid+0.1+boxwid+0.2+boxwid/2 then to Level1.Source.w 6135e96a66cSDavid du Colombier 6145e96a66cSDavid du Colombierline -> from Level1.SB3 down boxwid/2+0.2+boxwid/2 then to Level2.MetaSource.w 6155e96a66cSDavid du Colombierline -> from Level1.SB2 down boxwid/2+0.2+boxwid+0.1+boxwid/2 then to Level2.Source.w 6165e96a66cSDavid du Colombierline -> from Level1.SB1 down boxwid/2+0.2+boxwid+0.1+boxwid+0.2+boxwid/2 then to Level2.File.w 6175e96a66cSDavid du Colombier 6185e96a66cSDavid du Colombierarrowwid = arrowwid/2 6195e96a66cSDavid du Colombierarrowht = arrowht/2 6205e96a66cSDavid du Colombierline -> from Level1.MB1 to Level1.SB1.n 6215e96a66cSDavid du Colombierline -> from Level1.MB2 to Level1.SB2.n 6225e96a66cSDavid du Colombierline -> from Level1.MB2 to Level1.SB3.n 6235e96a66cSDavid du Colombierline -> from Level1.MB4 to Level1.SB7.n 6245e96a66cSDavid du Colombierline -> from Level1.MB5 to Level1.SB5.n 6255e96a66cSDavid du Colombierarrowwid = arrowwid * 2 6265e96a66cSDavid du Colombierarrowht = arrowht * 2 6275e96a66cSDavid du Colombier 6285e96a66cSDavid du Colombierbox dashed with .nw at Level1.MetaSource.nw+(-.05,.05) wid 0.8+.05*2 ht .3+.05*2 6295e96a66cSDavid du Colombierbox dashed with .nw at Level2.MetaSource.nw+(-.05,.05) wid 0.8+.05*2 ht .3+.05*2 6305e96a66cSDavid du Colombierbox dotted with .nw at Level2.File.nw+(-.05,.05) wid 0.8+0.05*2 ht .1+.05*2 6315e96a66cSDavid du Colombier 6325e96a66cSDavid du Colombier[ 6335e96a66cSDavid du Colombier KEY: box wid 1.5 invis "Key" 6345e96a66cSDavid du Colombier line from KEY.sw to KEY.se 6355e96a66cSDavid du Colombier k = -.1 6365e96a66cSDavid du Colombier kk=0.5 6375e96a66cSDavid du Colombier A: [ 6385e96a66cSDavid du Colombier box wid 4*boxwid 6395e96a66cSDavid du Colombier "Venti file" ljust with .w at last box .w+(kk,0) 6405e96a66cSDavid du Colombier ] with .nw at KEY.sw+(0,2*k) 6415e96a66cSDavid du Colombier B: [ 6425e96a66cSDavid du Colombier box fill f 6435e96a66cSDavid du Colombier "Venti entry (\f(CWEntry\fP)" ljust with .w at last box .w+(kk,0) 6445e96a66cSDavid du Colombier ] with .nw at A.sw+(0,k) 6455e96a66cSDavid du Colombier C: [ 6465e96a66cSDavid du Colombier right 6475e96a66cSDavid du Colombier CC: box fill f 6485e96a66cSDavid du Colombier box fill f 6495e96a66cSDavid du Colombier box fill f 6505e96a66cSDavid du Colombier box fill f 6515e96a66cSDavid du Colombier "Venti directory" ljust with .w at CC.w+(kk,0) 6525e96a66cSDavid du Colombier ] with .nw at B.sw+(0,k) 6535e96a66cSDavid du Colombier D: [ 6545e96a66cSDavid du Colombier line -> right 3*boxwid 6555e96a66cSDavid du Colombier "Venti pointer (score)" ljust with .w at last line .w+(kk, 0) 6565e96a66cSDavid du Colombier ] with .nw at C.sw+(0,k) 6575e96a66cSDavid du Colombier DD: [ 6585e96a66cSDavid du Colombier box dotted wid 4*boxwid 6595e96a66cSDavid du Colombier "Vac file" ljust with .w at last box .w+(kk,0) 6605e96a66cSDavid du Colombier ] with .nw at D.sw+(0,k) 6615e96a66cSDavid du Colombier E: [ 6625e96a66cSDavid du Colombier box wid mb 6635e96a66cSDavid du Colombier "Vac entry (\f(CWDirEntry\fP)" ljust with .w at last box .w+(kk,0) 6645e96a66cSDavid du Colombier ] with .nw at DD.sw+(0,k) 6655e96a66cSDavid du Colombier G: [ 6665e96a66cSDavid du Colombier box dashed wid 4*boxwid 6675e96a66cSDavid du Colombier "Vac directory" ljust with .w at last box .w+(kk,0) 6685e96a66cSDavid du Colombier ] with .nw at E.sw+(0,k) 6695e96a66cSDavid du Colombier H: [ 6705e96a66cSDavid du Colombier arrowwid = arrowwid/2 6715e96a66cSDavid du Colombier arrowht = arrowht/2 6725e96a66cSDavid du Colombier line -> right 1.5*boxwid 6735e96a66cSDavid du Colombier "Vac pointer (integer index)" ljust with .w at last line .w+(kk, 0) 6745e96a66cSDavid du Colombier arrowwid = arrowwid * 2 6755e96a66cSDavid du Colombier arrowht = arrowht * 2 6765e96a66cSDavid du Colombier ] with .nw at G.sw+(0,k) 6775e96a66cSDavid du Colombier] with .nw at VtRoot.nw+(3,0) 6785e96a66cSDavid du Colombier.PE 6795e96a66cSDavid du Colombier.LP 6805e96a66cSDavid du ColombierIn reality, the story is slightly more complicated. 6815e96a66cSDavid du ColombierThe metadata file in a Vac directory 6825e96a66cSDavid du Colombieris not just the concatenation of 6835e96a66cSDavid du Colombier.CW DirEntry 6845e96a66cSDavid du Colombierstructures. 6855e96a66cSDavid du ColombierInstead, it is the concatenation of 6865e96a66cSDavid du Colombier.CW MetaBlocks . 6875e96a66cSDavid du ColombierA 6885e96a66cSDavid du Colombier.CW MetaBlock 6895e96a66cSDavid du Colombiercontains some number of 6905e96a66cSDavid du Colombier.CW DirEntry 6915e96a66cSDavid du Colombierstructures along with a sorted index to make it easy 6925e96a66cSDavid du Colombierto look for a particular 6935e96a66cSDavid du Colombier.CW DirEntry 6945e96a66cSDavid du Colombierby its 6955e96a66cSDavid du Colombier.CW elem 6965e96a66cSDavid du Colombierfield. 6975e96a66cSDavid du ColombierThe details are in the source code. 6985e96a66cSDavid du Colombier.PP 6995e96a66cSDavid du ColombierAs shown in the diagram, 7005e96a66cSDavid du Colombierthe root directory of the file system is summarized by 7015e96a66cSDavid du Colombierthree 7025e96a66cSDavid du Colombier.CW VtEntry 7035e96a66cSDavid du Colombierstructures describing 7045e96a66cSDavid du Colombierthe Venti directory for the children of the root, 7055e96a66cSDavid du Colombierthe Venti file for the metadata describing the children of the root, 7065e96a66cSDavid du Colombierand a Venti file holding metadata for the root directory itself. 7075e96a66cSDavid du ColombierThese 7085e96a66cSDavid du Colombier.CW VtEntry 7095e96a66cSDavid du Colombierstructures are placed in a Venti directory of their own, 7105e96a66cSDavid du Colombierdescribed by the single 7115e96a66cSDavid du Colombier.CW VtEntry 7125e96a66cSDavid du Colombierin the 7135e96a66cSDavid du Colombierroot block. 7145e96a66cSDavid du Colombier.NH 1 7155e96a66cSDavid du ColombierFossil file system format 7165e96a66cSDavid du Colombier.HP 7175e96a66cSDavid du ColombierFossil uses the vac format, with some small changes. 7185e96a66cSDavid du ColombierThe changes only affect the data on the local disk; the data 7195e96a66cSDavid du Colombierarchived to Venti is exactly in vac format. 7205e96a66cSDavid du Colombier.PP 7215e96a66cSDavid du ColombierBlocks stored on local disk may contain scores pointing at local disk 7225e96a66cSDavid du Colombierblocks or at Venti blocks. 7235e96a66cSDavid du ColombierLocal block addresses are stored as 20-byte scores in which the first 16 bytes 7245e96a66cSDavid du Colombierare all zero and the last 4 bytes specify a block number in the disk. 7255e96a66cSDavid du ColombierBefore a block is archived, all the 7265e96a66cSDavid du Colombierblocks it points to must be archived, and the local scores in the block 7275e96a66cSDavid du Colombiermust be changed to Venti scores. 7285e96a66cSDavid du ColombierUsing block addresses rather than content hashes for local data 7295e96a66cSDavid du Colombiermakes the local file system easier to manage: if a local block's contents 7305e96a66cSDavid du Colombierchange, the pointer to the block does not need to change. 7315e96a66cSDavid du Colombier.NH 2 7325e96a66cSDavid du ColombierSnapshots 7335e96a66cSDavid du Colombier.HP 7345e96a66cSDavid du ColombierFossil is an archival file server. 7355e96a66cSDavid du ColombierIt takes periodic snapshots of the file system, 7365e96a66cSDavid du Colombierwhich are made accessible through the file system. 7375e96a66cSDavid du ColombierSpecifically, the active file system is presented in 7385e96a66cSDavid du Colombier.CW /active . 7395e96a66cSDavid du ColombierEphemeral snapshots (those that are kept on local disk and eventually deleted) 7405e96a66cSDavid du Colombierare presented in 7415e96a66cSDavid du Colombier\f(CW/snapshot/\fIyyyy\f(CW/\fImmdd\f(CW/\fIhhmm\fR, 7425e96a66cSDavid du Colombierwhere 7435e96a66cSDavid du Colombier.I yyyy 7445e96a66cSDavid du Colombieris the full year, 7455e96a66cSDavid du Colombier.I mm 7465e96a66cSDavid du Colombieris the month number, 7475e96a66cSDavid du Colombier.I dd 7485e96a66cSDavid du Colombieris the day number, 7495e96a66cSDavid du Colombier.I hh 7505e96a66cSDavid du Colombieris the hour, 7515e96a66cSDavid du Colombierand 7525e96a66cSDavid du Colombier.I mm 7535e96a66cSDavid du Colombieris the minute. 7545e96a66cSDavid du ColombierArchival snapshots (those that are archived to Venti and persist forever) 7555e96a66cSDavid du Colombierare presented in 7565e96a66cSDavid du Colombier\f(CW/archive/\fIyyyy\f(CW/\fImmdds\fR, 7575e96a66cSDavid du Colombierwhere 7585e96a66cSDavid du Colombier.I yyyy , 7595e96a66cSDavid du Colombier.I mm , 7605e96a66cSDavid du Colombierand 7615e96a66cSDavid du Colombier.I dd 7625e96a66cSDavid du Colombierare year, month, and day as before, 7635e96a66cSDavid du Colombierand 7645e96a66cSDavid du Colombier.I s 7655e96a66cSDavid du Colombieris a sequence number if more than one 7665e96a66cSDavid du Colombierarchival snapshot is done in a day. 7675e96a66cSDavid du ColombierFor the first snapshot, 7685e96a66cSDavid du Colombier.I s 7695e96a66cSDavid du Colombieris null. 7705e96a66cSDavid du ColombierFor the subsequent snapshots, 7715e96a66cSDavid du Colombier.I s 7725e96a66cSDavid du Colombieris 7735e96a66cSDavid du Colombier.CW .1 , 7745e96a66cSDavid du Colombier.CW .2 , 7755e96a66cSDavid du Colombier.CW .3 , 7765e96a66cSDavid du Colombieretc. 7775e96a66cSDavid du Colombier.PP 7785e96a66cSDavid du ColombierTo implement the snapshots, the file server maintains a 7795e96a66cSDavid du Colombiercurrent 7805e96a66cSDavid du Colombier.I epoch 7815e96a66cSDavid du Colombierfor the active file system. 7825e96a66cSDavid du ColombierEach local block has a label that records, among other things, 7835e96a66cSDavid du Colombierthe epoch in which the block was allocated. 7845e96a66cSDavid du ColombierIf a block was allocated in an epoch earlier than the current one, 7855e96a66cSDavid du Colombierit is immutable and treated as copy-on-write. 7865e96a66cSDavid du ColombierTaking a snapshot can be accomplished by 7875e96a66cSDavid du Colombierrecording the address of the current root block and then 7885e96a66cSDavid du Colombierincrementing the epoch number. 7895e96a66cSDavid du ColombierNotice that the copy-on-write method makes 7905e96a66cSDavid du Colombiersnapshots both time efficient and space efficient. 7915e96a66cSDavid du ColombierThe only time cost is waiting for all current file system 7925e96a66cSDavid du Colombierrequests to finish and then incrementing a counter. 7935e96a66cSDavid du ColombierAfter a snapshot, blocks only get copied when they are 7945e96a66cSDavid du Colombiernext modified, so the per-snapshot 7955e96a66cSDavid du Colombierspace requirement is proportional 7965e96a66cSDavid du Colombierto the amount of new data rather than the total 7975e96a66cSDavid du Colombiersize of the file system. 7985e96a66cSDavid du Colombier.PP 7995e96a66cSDavid du ColombierThe blocks in the archival snapshots are moved to Venti, 8005e96a66cSDavid du Colombierbut the blocks in the ephemeral snapshots take up space 8015e96a66cSDavid du Colombierin the local disk file. 8025e96a66cSDavid du ColombierTo allow reclamation of this disk space, the file system 8035e96a66cSDavid du Colombiermaintains a 8045e96a66cSDavid du Colombier.I low 8055e96a66cSDavid du Colombier.I epoch , 8065e96a66cSDavid du Colombierwhich is the epoch of the earliest ephemeral snapshot 8075e96a66cSDavid du Colombierstill available. 8085e96a66cSDavid du ColombierFossil only allows access to snapshots with epoch numbers 8095e96a66cSDavid du Colombierbetween the 8105e96a66cSDavid du Colombierlow epoch and the current epoch 8115e96a66cSDavid du Colombier(also called the high epoch). 8125e96a66cSDavid du ColombierIncrementing the low epoch thus makes old 8135e96a66cSDavid du Colombiersnapshots inaccessible. 8145e96a66cSDavid du ColombierThe space required to store those snapshots can then 8155e96a66cSDavid du Colombierbe reclaimed, as described below. 8165e96a66cSDavid du Colombier.NH 2 8175e96a66cSDavid du ColombierLocal blocks 8185e96a66cSDavid du Colombier.HP 8195e96a66cSDavid du ColombierThe bulk of the local disk file is the local blocks. 8205e96a66cSDavid du ColombierEach block has a 14-byte label associated with it, of the format: 8215e96a66cSDavid du Colombier.P1 8225e96a66cSDavid du ColombierLabel: 8235e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 8245e96a66cSDavid du Colombier state[1] \fRblock state\fP 8255e96a66cSDavid du Colombier type[1] \fRblock type\fP 8265e96a66cSDavid du Colombier epoch[4] \fRallocation epoch\fP 8275e96a66cSDavid du Colombier epochClose[4] \fRclose epoch\fP 8285e96a66cSDavid du Colombier tag[4] \fRrandom tag\fP 8295e96a66cSDavid du Colombier.P2 8305e96a66cSDavid du Colombier.LP 8315e96a66cSDavid du ColombierThe 8325e96a66cSDavid du Colombier.CW type 8335e96a66cSDavid du Colombieris an analogue of the block types described earlier, 8345e96a66cSDavid du Colombierthough different names are used, to distinguish between 8355e96a66cSDavid du Colombierpointers blocks in a hash tree for a data stream 8365e96a66cSDavid du Colombierand pointer blocks for a directory stream. 8375e96a66cSDavid du ColombierThe 8385e96a66cSDavid du Colombier.CW epoch 8395e96a66cSDavid du Colombierwas mentioned in the last section. 8405e96a66cSDavid du ColombierThe other fields are explained below. 8415e96a66cSDavid du Colombier.PP 8425e96a66cSDavid du ColombierThere are two distinguished blocks states 8435e96a66cSDavid du Colombier.CW BsFree 8445e96a66cSDavid du Colombier.CW 0x00 ) ( 8455e96a66cSDavid du Colombierand 8465e96a66cSDavid du Colombier.CW BsBad 8475e96a66cSDavid du Colombier.CW 0xFF ), ( 8485e96a66cSDavid du Colombierwhich mark blocks that are available for allocation 8495e96a66cSDavid du Colombierand blocks that are bad and should be avoided. 8505e96a66cSDavid du ColombierIf 8515e96a66cSDavid du Colombier.CW state 8525e96a66cSDavid du Colombieris not one of these values, it is a bitwise 8535e96a66cSDavid du Colombier.I or ' ` 8545e96a66cSDavid du Colombierof the following flags: 8555e96a66cSDavid du Colombier.P1 8565e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 8575e96a66cSDavid du Colombier0x01 BsAlloc \fRblock is in use\fP 8585e96a66cSDavid du Colombier0x02 BsCopied \fRblock has been copied\fP 8595e96a66cSDavid du Colombier0x04 BsVenti \fRblock has been stored on Venti\fP 8605e96a66cSDavid du Colombier0x08 BsClosed \fRblock has been unlinked from active file system\fP 8615e96a66cSDavid du Colombier.P2 8625e96a66cSDavid du Colombier.LP 8635e96a66cSDavid du ColombierThe flags are explained as they arise in the discussions below. 8645e96a66cSDavid du Colombier.PP 8655e96a66cSDavid du ColombierIt is convenient to store some extra fields in the 8665e96a66cSDavid du Colombier.CW VtEntry 8675e96a66cSDavid du Colombierstructure when it describes a Venti file or directory 8685e96a66cSDavid du Colombierstored on local disk. 8695e96a66cSDavid du ColombierSpecifically, we set the 8705e96a66cSDavid du Colombier.CW VtEntryLocal 8715e96a66cSDavid du Colombierflag bit 8725e96a66cSDavid du Colombierand then use the bytes 7-16 of the score (which would 8735e96a66cSDavid du Colombierotherwise be zero, since it is a local score) to hold these fields: 8745e96a66cSDavid du Colombier.P1 8755e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 8765e96a66cSDavid du Colombier archive[1] \fRboolean: this is an archival snapshot\fP 8775e96a66cSDavid du Colombier snap[4] \fRepoch number if root of snapshot\fP 8785e96a66cSDavid du Colombier tag[4] \fRrandom tag\fP 8795e96a66cSDavid du Colombier.P2 8805e96a66cSDavid du Colombier.LP 8815e96a66cSDavid du ColombierThe extended 8825e96a66cSDavid du Colombier.CW VtEntry 8835e96a66cSDavid du Colombierstructure is called an 8845e96a66cSDavid du Colombier.CW Entry . 8855e96a66cSDavid du ColombierThe 8865e96a66cSDavid du Colombier.CW tag 8875e96a66cSDavid du Colombierfield 8885e96a66cSDavid du Colombierin the 8895e96a66cSDavid du Colombier.CW Label 8905e96a66cSDavid du Colombierand the 8915e96a66cSDavid du Colombier.CW Entry 8925e96a66cSDavid du Colombieris used to identify dangling pointers or other file system corruption: 8935e96a66cSDavid du Colombierall the local blocks in a hash tree must 8945e96a66cSDavid du Colombierhave tags matching the tag in the 8955e96a66cSDavid du Colombier.CW Entry . 8965e96a66cSDavid du ColombierIf this 8975e96a66cSDavid du Colombier.CW Entry 8985e96a66cSDavid du Colombierpoints at the root of a snapshot, 8995e96a66cSDavid du Colombierthe 9005e96a66cSDavid du Colombier.CW snap 9015e96a66cSDavid du Colombierfield is the epoch of the snapshot. 9025e96a66cSDavid du ColombierIf the snapshot is intended to be archived to Venti, 9035e96a66cSDavid du Colombierthe 9045e96a66cSDavid du Colombier.CW archive 9055e96a66cSDavid du Colombierfield is non-zero. 9065e96a66cSDavid du Colombier.NH 2 9075e96a66cSDavid du ColombierBlock reclamation 9085e96a66cSDavid du Colombier.HP 9095e96a66cSDavid du ColombierThe blocks in the active file system form a tree: each 9105e96a66cSDavid du Colombierblock has only one parent. 9115e96a66cSDavid du ColombierOnce a copy-on-write block 9125e96a66cSDavid du Colombier.I b 9135e96a66cSDavid du Colombieris replaced by its copy, it is no longer 9145e96a66cSDavid du Colombierneeded by the active file system. 9155e96a66cSDavid du ColombierAt this point, 9165e96a66cSDavid du Colombier.I b 9175e96a66cSDavid du Colombieris unlinked from the active file system. 9185e96a66cSDavid du ColombierWe say that 9195e96a66cSDavid du Colombier.I b 9205e96a66cSDavid du Colombieris now 9215e96a66cSDavid du Colombier.I closed : 9225e96a66cSDavid du Colombierit is needed only for snapshots. 9235e96a66cSDavid du ColombierWhen a block is closed, the 9245e96a66cSDavid du Colombier.CW BsClosed 9255e96a66cSDavid du Colombierbit is set in its state, and the current epoch (called the block's closing epoch) 9265e96a66cSDavid du Colombieris stored in the 9275e96a66cSDavid du Colombier.CW epochClose 9285e96a66cSDavid du Colombierlabel field. 9295e96a66cSDavid du Colombier(Open blocks have an 9305e96a66cSDavid du Colombier.CW epochClose 9315e96a66cSDavid du Colombierof 9325e96a66cSDavid du Colombier.CW ~0 ). 9335e96a66cSDavid du Colombier.PP 9345e96a66cSDavid du ColombierA block is referenced by snapshots with epochs 9355e96a66cSDavid du Colombierbetween the block's allocation epoch and its closing epoch. 9365e96a66cSDavid du ColombierOnce the file system's low epoch grows to be greater than or equal to the block's 9375e96a66cSDavid du Colombierclosing epoch, the block is no longer needed for any snapshots 9385e96a66cSDavid du Colombierand can be reused. 9395e96a66cSDavid du Colombier.PP 9405e96a66cSDavid du ColombierIn a typical configuration, where nightly archival snapshots 9415e96a66cSDavid du Colombierare taken and written to Venti, it is desirable to reclaim 9425e96a66cSDavid du Colombierthe space occupied by now-archived blocks if possible. 9435e96a66cSDavid du ColombierTo do this, Fossil keeps track of whether the pointers 9445e96a66cSDavid du Colombierin each block are unique to that block. 9455e96a66cSDavid du ColombierWhen a block 9465e96a66cSDavid du Colombier.I bb 9475e96a66cSDavid du Colombieris allocated, a pointer to 9485e96a66cSDavid du Colombier.I bb 9495e96a66cSDavid du Colombieris written into exactly one active block (say, 9505e96a66cSDavid du Colombier.I b ). 9515e96a66cSDavid du ColombierIn the absence of snapshots, the pointer to 9525e96a66cSDavid du Colombier.I bb 9535e96a66cSDavid du Colombierwill remain unique to 9545e96a66cSDavid du Colombier.I b , 9555e96a66cSDavid du Colombierso that if the pointer is zeroed, 9565e96a66cSDavid du Colombier.I bb 9575e96a66cSDavid du Colombiercan be immediately reused. 9585e96a66cSDavid du ColombierSnapshots complicate this invariant: 9595e96a66cSDavid du Colombierwhen 9605e96a66cSDavid du Colombier.I b 9615e96a66cSDavid du Colombieris copied-on-write, all its pointers 9625e96a66cSDavid du Colombierare no longer unique to it. 9635e96a66cSDavid du ColombierAt time of the copy, the 9645e96a66cSDavid du Colombier.CW BsCopied 9655e96a66cSDavid du Colombierstate bit in the block's label 9665e96a66cSDavid du Colombieris set to note the duplication of the pointers contained within. 9675e96a66cSDavid du Colombier.NH 2 9685e96a66cSDavid du ColombierDisk layout 9695e96a66cSDavid du Colombier.HP 9705e96a66cSDavid du ColombierThe file system header describes the file system layout and has this format: 9715e96a66cSDavid du Colombier.P1 9725e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 9735e96a66cSDavid du ColombierHeader: 9745e96a66cSDavid du Colombier magic[4] \fR0x3776AE89 (HeaderMagic)\fP 9755e96a66cSDavid du Colombier version[2] \fR1 (HeaderVersion)\fP 9765e96a66cSDavid du Colombier blockSize[2] \fIfile system block size\fP 9775e96a66cSDavid du Colombier super[4] \fRblock offset of super block\fP 9785e96a66cSDavid du Colombier label[4] \fRblock offset of labels\fP 9795e96a66cSDavid du Colombier data[4] \fRdata blocks\fP 9805e96a66cSDavid du Colombier end[4] \fRend of file system\fP 9815e96a66cSDavid du Colombier.P2 9825e96a66cSDavid du Colombier.LP 9835e96a66cSDavid du ColombierThe corresponding file system layout is: 9845e96a66cSDavid du Colombier.PS 9855e96a66cSDavid du Colombier.ps 8 9865e96a66cSDavid du Colombier.vs 9 9875e96a66cSDavid du Colombierboxwid=0.75 9885e96a66cSDavid du Colombierboxht=0.15 9895e96a66cSDavid du ColombierEmpty: box "empty" ht 0.25 9905e96a66cSDavid du ColombierHeader: box "header" with .n at Empty.s 9915e96a66cSDavid du ColombierEmpty2: box "empty" with .n at Header.s 9925e96a66cSDavid du ColombierSuper: box "super block" with .n at Empty2.s 9935e96a66cSDavid du ColombierLabel: box "label" "blocks" with .n at Super.s ht 0.25 9945e96a66cSDavid du ColombierData: box "data" "blocks" with .n at Label.s ht 0.3 9955e96a66cSDavid du Colombier" 0" ljust at Empty.ne 9965e96a66cSDavid du Colombier" 128kB" ljust at Header.ne 9975e96a66cSDavid du Colombier" \f5super\fP \(mu \f(CWblockSize\fP" ljust at Super.ne 9985e96a66cSDavid du Colombier" \f5label\fP \(mu \f(CWblockSize\fP" ljust at Label.ne 9995e96a66cSDavid du Colombier" \f5data\fP \(mu \f(CWblockSize\fP" ljust at Data.ne 10005e96a66cSDavid du Colombier" \f5end\fP \(mu \f(CWblockSize\fP" ljust at Data.se 10015e96a66cSDavid du Colombier"" at (-1,0) 10025e96a66cSDavid du Colombier"" at (6,0) 10035e96a66cSDavid du Colombier.PE 10045e96a66cSDavid du Colombier.LP 10055e96a66cSDavid du ColombierThe numbers to the right of the blocks are byte offsets 10065e96a66cSDavid du Colombierof the boundaries. 10075e96a66cSDavid du Colombier.LP 10085e96a66cSDavid du ColombierThe super block describes the file system itself and looks like: 10095e96a66cSDavid du Colombier.P1 10105e96a66cSDavid du Colombier.ta +\w' 'u +\w' 'u 10115e96a66cSDavid du ColombierSuper: 10125e96a66cSDavid du Colombier magic[4] \fR0x2340A3B1 (SuperMagic)\fP 10135e96a66cSDavid du Colombier version[2] \fR1 (SuperVersion)\fP 10145e96a66cSDavid du Colombier epochLow[4] \fRfile system low epoch\fP 10155e96a66cSDavid du Colombier epochHigh[4] \fRfile system high (active) epoch\fP 10165e96a66cSDavid du Colombier qid[8] \fRnext qid to allocate\fP 10175e96a66cSDavid du Colombier active[4] \fRdata block number: root of active file system\fP 10185e96a66cSDavid du Colombier next[4] \fRdata block number: root of next file system to archive\fP 10195e96a66cSDavid du Colombier current[4] \fRdata block number: root of file system currently being archived\fP 10205e96a66cSDavid du Colombier last[20] \fRVenti score of last successful archive\fP 10215e96a66cSDavid du Colombier name[128] \fRname of file system (just a comment)\fP 10225e96a66cSDavid du Colombier.P2 10235e96a66cSDavid du Colombier.LP 10245e96a66cSDavid du Colombier.NH 1 10255e96a66cSDavid du ColombierFossil server 10265e96a66cSDavid du Colombier.HP 10275e96a66cSDavid du ColombierThe Fossil server is a user-space program that runs on a standard Plan 9 kernel. 10285e96a66cSDavid du Colombier.NH 2 10295e96a66cSDavid du ColombierProcess structure 10305e96a66cSDavid du Colombier.PP 10315e96a66cSDavid du ColombierThe file server is structured as a set of processes synchronizing 10325e96a66cSDavid du Colombiermostly through message passing along queues. 10335e96a66cSDavid du ColombierThe processes are given names, which can be seen in the output of 10345e96a66cSDavid du Colombier.CW ps 10355e96a66cSDavid du Colombier.CW -a . 10365e96a66cSDavid du Colombier.PP 10375e96a66cSDavid du Colombier.CW Listen 10385e96a66cSDavid du Colombierprocesses announce on various network addresses. 10395e96a66cSDavid du ColombierA 10405e96a66cSDavid du Colombier.CW con 10415e96a66cSDavid du Colombierprocess handles each incoming connection, reading 9P requests 10425e96a66cSDavid du Colombierand adding them to a central message queue. 10435e96a66cSDavid du Colombier.CW Msg 10445e96a66cSDavid du Colombierprocesses remove 9P requests from the queue, 10455e96a66cSDavid du Colombierhandle them, and write the responses to the appropriate 10465e96a66cSDavid du Colombierfile descriptors. 10475e96a66cSDavid du Colombier.PP 10485e96a66cSDavid du ColombierThe 10495e96a66cSDavid du Colombier.CW disk 10505e96a66cSDavid du Colombierprocess handles disk I/O requests made by the other processes. 10515e96a66cSDavid du ColombierThe 10525e96a66cSDavid du Colombier.CW flush 10535e96a66cSDavid du Colombierprocess writes dirty blocks from the in-memory block cache to disk. 10545e96a66cSDavid du ColombierThe 10555e96a66cSDavid du Colombier.CW unlink 10565e96a66cSDavid du Colombierprocess frees previously linked blocks once the blocks that point at them 10575e96a66cSDavid du Colombierhave been written to disk. 10585e96a66cSDavid du Colombier.PP 10595e96a66cSDavid du ColombierA 10605e96a66cSDavid du Colombier.CW consI 10615e96a66cSDavid du Colombierreads from each console file (typically a pipe posted in 10625e96a66cSDavid du Colombier.CW /srv ), 10635e96a66cSDavid du Colombieradding the typed characters to the input queue. 10645e96a66cSDavid du ColombierThe 10655e96a66cSDavid du Colombier.CW cons 10665e96a66cSDavid du Colombierprocess echoes input and runs the commands, saving 10675e96a66cSDavid du Colombieroutput in a ring buffer. 10685e96a66cSDavid du ColombierBecause there is only one 10695e96a66cSDavid du Colombier.CW cons 10705e96a66cSDavid du Colombierprocess, only one console command may be executing at a time. 10715e96a66cSDavid du ColombierA 10725e96a66cSDavid du Colombier.CW consO 1073*25ea35e3SDavid du Colombierprocess copies this ring buffer to each console file. 10745e96a66cSDavid du Colombier.PP 10755e96a66cSDavid du ColombierThe 10765e96a66cSDavid du Colombier.CW periodic 10775e96a66cSDavid du Colombierprocess runs periodic events, like 10785e96a66cSDavid du Colombierflushing the root metadata to disk or 10795e96a66cSDavid du Colombiertaking snapshots of the file system. 10805e96a66cSDavid du Colombier.NH 2 10815e96a66cSDavid du ColombierBlock cache 10825e96a66cSDavid du Colombier.HP 10835e96a66cSDavid du ColombierFossil maintains an in-memory block cache which 10845e96a66cSDavid du Colombierholds both local disk blocks and Venti blocks. 10855e96a66cSDavid du ColombierCache eviction follows a least recently used policy. 10865e96a66cSDavid du ColombierDirty blocks are restricted to at most half the cache. 10875e96a66cSDavid du ColombierThis can be changed by editing 10885e96a66cSDavid du Colombier.CW DirtyPercentage 10895e96a66cSDavid du Colombierin 10905e96a66cSDavid du Colombier.CW dat.h . 10915e96a66cSDavid du Colombier.PP 10925e96a66cSDavid du ColombierThe block cache uses soft updates [1] to ensure that the on-disk 10935e96a66cSDavid du Colombierfile system is always self-consistent. 10945e96a66cSDavid du ColombierThus there is no 10955e96a66cSDavid du Colombier.I halt 10965e96a66cSDavid du Colombierconsole command 10975e96a66cSDavid du Colombierand no need to check a file system 10985e96a66cSDavid du Colombierthat was shut down without halting. 10995e96a66cSDavid du Colombier.NH 2 11005e96a66cSDavid du ColombierArchiving 11015e96a66cSDavid du Colombier.HP 11025e96a66cSDavid du ColombierA background process writes blocks in archival snapshots to Venti. 11035e96a66cSDavid du ColombierAlthough 11045e96a66cSDavid du Colombier.CW /archive/\fIyyyy\fP/\fImmdds\fR 11055e96a66cSDavid du Colombieris a copy of only 11065e96a66cSDavid du Colombier.CW /active 11075e96a66cSDavid du Colombierat the time of the snapshot, 11085e96a66cSDavid du Colombierthe archival process archives the 11095e96a66cSDavid du Colombierentire file tree rather than just 11105e96a66cSDavid du Colombierthe subtree rooted at 11115e96a66cSDavid du Colombier.CW /active . 11125e96a66cSDavid du ColombierThe snapshots 11135e96a66cSDavid du Colombier.CW /snapshot/\fIyyyy\fP/\fImmdd\fP/\fIhhmm 11145e96a66cSDavid du Colombierare stored as empty directories. 11155e96a66cSDavid du ColombierOnce all the blocks have been archived, 11165e96a66cSDavid du Colombiera 11175e96a66cSDavid du Colombier.CW VtRoot 11185e96a66cSDavid du Colombierheader for the file system is archived. 11195e96a66cSDavid du ColombierThe score of that header is recorded in 11205e96a66cSDavid du Colombier.CW super.score 11215e96a66cSDavid du Colombierand also printed on the file server console. 11225e96a66cSDavid du ColombierThe score can used by 11235e96a66cSDavid du Colombier.I flfmt 11245e96a66cSDavid du Colombierto restore a file system (see 11255e96a66cSDavid du Colombier.I fossil (4)). 11265e96a66cSDavid du Colombier.NH 2 11275e96a66cSDavid du ColombierContrast with the old file server 11285e96a66cSDavid du Colombier.HP 11295e96a66cSDavid du ColombierThe most obvious difference between Fossil and the 11305e96a66cSDavid du Colombierold Plan 9 file server [2] is that Fossil uses a Venti server as 11315e96a66cSDavid du Colombierits archival storage in place of a WORM juke box. 11325e96a66cSDavid du ColombierThere are a few other architectural differences to be 11335e96a66cSDavid du Colombieraware of. 11345e96a66cSDavid du Colombier.PP 11355e96a66cSDavid du ColombierFossil is a user-level program run on a standard kernel. 11365e96a66cSDavid du Colombier.PP 11375e96a66cSDavid du ColombierFossil does not have any way to concatenate, stripe, or 11385e96a66cSDavid du Colombiermirror disk files. For functionality similar to the old file server's 11395e96a66cSDavid du Colombierconfiguration strings, use the experimental file stack device 11405e96a66cSDavid du Colombier(see 1141b748718eSDavid du Colombier.I fs (3)). 11425e96a66cSDavid du Colombier.PP 11435e96a66cSDavid du ColombierFossil speaks only 9P2000. Old 9P (aka 9P1) is not supported. 11445e96a66cSDavid du Colombier.PP 11455e96a66cSDavid du Colombier... XXX words about converting an old file system to fossil? 11465e96a66cSDavid du Colombier.NH 1 11475e96a66cSDavid du ColombierReferences 11485e96a66cSDavid du Colombier.LP 11495e96a66cSDavid du Colombier[1] Gregory R. Ganger, Marshall Kirk McKusick, Craig A. N. Soules, 11505e96a66cSDavid du Colombierand Yale N. Patt. 11515e96a66cSDavid du Colombier``Soft Updates: A Solution to the Metadata Update Problem 11525e96a66cSDavid du Colombierin File Systems,'' 11535e96a66cSDavid du Colombier.I "ACM Transactions on Computer Systems" , 11545e96a66cSDavid du ColombierVol 18., No. 2, May 2000, pp. 127\-153. 11555e96a66cSDavid du Colombier.LP 11565e96a66cSDavid du Colombier[2] Sean Quinlan, ``A Cached WORM File System,'' 11575e96a66cSDavid du Colombier.I "Software\(emPractice and Experience" , 11585e96a66cSDavid du ColombierVol 21., No 12., December 1991, pp. 1289\-1299. 11595e96a66cSDavid du Colombier.LP 11605e96a66cSDavid du Colombier[3] Sean Quinlan and Sean Dorward, ``Venti: A New Approach to Archival Storage,'' 11615e96a66cSDavid du Colombier.I "Usenix Conference on File and Storage Technologies" , 11625e96a66cSDavid du Colombier2002. 1163