all data is big-endian on disk.

arena layout:

ArenaPart (first at offset PartBlank = 256kB in the disk file)
	magic[4] 0xA9E4A5E7
	version[4] 3
	blockSize[4]
	arenaBase[4] offset of first ArenaHead structure in the disk file

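a minimal sketch of reading the ArenaPart header, assuming the four fields
are packed back to back in the order listed and that blocks are aligned to
multiples of blockSize counted from the start of the disk file.  the names
ArenaPartHdr, get32, unpackarenapart and amapoffset are made up for this
example and are not taken from the venti source.

```c
#include <stdint.h>

#define PART_BLANK      (256u*1024u)	/* PartBlank from the notes */
#define ARENAPART_MAGIC 0xA9E4A5E7u	/* magic from the notes */

typedef struct ArenaPartHdr ArenaPartHdr;	/* hypothetical name */
struct ArenaPartHdr {
	uint32_t magic;
	uint32_t version;
	uint32_t blocksize;
	uint32_t arenabase;
};

/* read a big-endian 32-bit value */
uint32_t
get32(const uint8_t *p)
{
	return ((uint32_t)p[0]<<24) | ((uint32_t)p[1]<<16)
		| ((uint32_t)p[2]<<8) | (uint32_t)p[3];
}

/* decode the 16 header bytes found at PART_BLANK; 0 if the magic matches, else -1 */
int
unpackarenapart(const uint8_t *buf, ArenaPartHdr *h)
{
	h->magic = get32(buf);
	h->version = get32(buf+4);
	h->blocksize = get32(buf+8);
	h->arenabase = get32(buf+12);
	return h->magic == ARENAPART_MAGIC ? 0 : -1;
}

/* first block boundary at or after PartBlank+512 (blocksize must be nonzero) */
uint64_t
amapoffset(uint32_t blocksize)
{
	uint64_t want = (uint64_t)PART_BLANK + 512;

	return (want + blocksize - 1) / blocksize * blocksize;
}
```

with an 8kB block size, amapoffset lands on the block boundary just past
PartBlank+512, which is where the text map described next would begin.
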
the ArenaMap starts at the first block at offset >= PartBlank+512 bytes.
it is a sequence of text lines:
/*
 * amap: n '\n' amapelem * n
 * n: u32int
 * amapelem: name '\t' astart '\t' astop '\n'
 * astart, astop: u64int
 */

the astart and astop are byte offsets in the disk file.
they are the offsets to the ArenaHead and the end of the Arena block.

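a rough sketch of parsing that text map.  it assumes the numbers are written
in decimal, treats the third field as an end offset as the two lines above
describe, and caps names at 63 bytes.  AMapElem, getu64 and parseamap are
illustrative names only.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct AMapElem AMapElem;	/* hypothetical name */
struct AMapElem {
	char name[64];
	uint64_t start;
	uint64_t stop;
};

/* parse an unsigned decimal number that must be followed by term */
static int
getu64(const char **ps, char term, uint64_t *v)
{
	char *end;

	if(**ps < '0' || **ps > '9')
		return -1;
	*v = strtoull(*ps, &end, 10);
	if(*end != term)
		return -1;
	*ps = end+1;
	return 0;
}

/*
 * parse a NUL-terminated amap into e[0..max-1].
 * returns the number of entries, or -1 if the text is malformed.
 */
int
parseamap(const char *s, AMapElem *e, int max)
{
	uint64_t n;
	const char *tab;
	size_t len;
	int i;

	if(getu64(&s, '\n', &n) < 0 || n > (uint64_t)max)
		return -1;
	for(i = 0; i < (int)n; i++){
		tab = strchr(s, '\t');
		if(tab == NULL)
			return -1;
		len = (size_t)(tab - s);
		if(len == 0 || len >= sizeof e[i].name || memchr(s, '\n', len) != NULL)
			return -1;
		memcpy(e[i].name, s, len);
		e[i].name[len] = '\0';
		s = tab+1;
		if(getu64(&s, '\t', &e[i].start) < 0 || getu64(&s, '\n', &e[i].stop) < 0)
			return -1;
		if(e[i].stop <= e[i].start)
			return -1;
	}
	return i;
}
```

a caller would read the map block from disk, make sure it is NUL-terminated,
and hand it to parseamap together with an array large enough for the arenas
it expects.
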
ArenaHead
[base points here in the C code]
size bytes
	Clumps
	ClumpInfo blocks
Arena

Arena
	magic[4] 0xF2A14EAD
	version[4] 4
	name[64]
	clumps[4]
	cclumps[4]
	ctime[4]
	wtime[4]
	used[8]
	uncsize[8]
	sealed[1]
	optional score[20]

once sealed, every block from the ArenaHead to the Arena is
checksummed with sha1, as though the final score in Arena were
the zeroScore.  strangely, the tail of the Arena block (the last one),
i.e., the unused data after the score, is not included in the checksum.

clumpMax = blocksize/ClumpInfoSize = blocksize/25
dirsize = ((clumps/clumpMax)+1) * blocksize
want used+dirsize <= size
want cclumps <= clumps
want used <= uncsize+clumps*ClumpSize+blocksize
want ctime <= wtime

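the directory size formula and three of the checks can be written down
directly.  this is only a sketch: ArenaStats, dirsize and arenaok are
hypothetical names, the fields mirror the Arena trailer above, and the
uncsize bound is left out of this example.

```c
#include <stdint.h>

enum { ClumpInfoSize = 25 };	/* type[1] + size[2] + uncsize[2] + score[20] */

typedef struct ArenaStats ArenaStats;	/* hypothetical name */
struct ArenaStats {
	uint32_t clumps;
	uint32_t cclumps;
	uint32_t ctime;
	uint32_t wtime;
	uint64_t used;
};

/* bytes taken by the ClumpInfo directory; blocksize must be >= ClumpInfoSize */
uint64_t
dirsize(uint32_t clumps, uint32_t blocksize)
{
	uint32_t clumpmax = blocksize / ClumpInfoSize;

	return ((uint64_t)(clumps / clumpmax) + 1) * blocksize;
}

/* returns 1 if the stats pass the checks, 0 otherwise; size is the arena size */
int
arenaok(const ArenaStats *a, uint32_t blocksize, uint64_t size)
{
	if(blocksize < ClumpInfoSize)
		return 0;
	if(a->used + dirsize(a->clumps, blocksize) > size)
		return 0;
	if(a->cclumps > a->clumps)
		return 0;
	if(a->ctime > a->wtime)
		return 0;
	return 1;
}
```

with 8kB blocks clumpMax is 327, so the directory grows by one block for
every 327 clumps, and it always reserves at least one block even when the
arena is empty.
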
clump info is stored packed into blocks in order.
clump info moves forward through a block but the
blocks themselves move backwards.  so if cm=clumpMax
and there are two blocks worth of clumpinfo, the blocks
look like:

	[cm..2*cm-1] [0..cm-1] [Arena]

with the blocks pushed right up against the Arena trailer.

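as an illustration of that layout, the disk offset of the i'th ClumpInfo
entry can be computed from the offset of the Arena trailer block.  this is a
sketch under the assumption that entries are packed from the start of each
directory block; clumpinfooffset and its tailoff parameter are made-up names.

```c
#include <stdint.h>

enum { ClumpInfoSize = 25 };	/* type[1] + size[2] + uncsize[2] + score[20] */

/*
 * tailoff is the offset of the block holding the Arena trailer.
 * directory block 0 sits just before it, block 1 before that, and so on;
 * inside a block the entries run forward.
 */
uint64_t
clumpinfooffset(uint64_t tailoff, uint32_t blocksize, uint32_t i)
{
	uint32_t cm = blocksize / ClumpInfoSize;	/* clumpMax */
	uint64_t blk = i / cm;				/* which directory block */
	uint32_t slot = i % cm;				/* entry within that block */

	return tailoff - (blk + 1) * blocksize + (uint64_t)slot * ClumpInfoSize;
}
```

so entry cm-1 is the last one in the block next to the trailer, and entry cm
starts the block one further back.
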
ArenaHead
	magic[4] 0xD15C4EAD
	version[4] = Arena.version
	name[64]
	blockSize[4]
	size[8]

Clump
	magic[4] 0xD15CB10C (0 for an unused clump)
	type[1]
	size[2]
	uncsize[2]
	score[20]
	encoding[1] raw=1, compress=2
	creator[4]
	time[4]

ClumpInfo
	type[1]
	size[2]
	uncsize[2]
	score[20]

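a small sketch of decoding a Clump header, assuming the fields are packed
with no padding in the order listed, which makes the header 38 bytes.
ClumpHdr, be16, be32 and unpackclump are names invented for this example.

```c
#include <stdint.h>
#include <string.h>

#define CLUMP_MAGIC 0xD15CB10Cu	/* magic from the notes */

enum {
	ScoreSize = 20,
	ClumpSize = 4+1+2+2+ScoreSize+1+4+4,	/* sum of the fields listed above */
};

typedef struct ClumpHdr ClumpHdr;	/* hypothetical name */
struct ClumpHdr {
	uint8_t type;
	uint16_t size;
	uint16_t uncsize;
	uint8_t score[ScoreSize];
	uint8_t encoding;	/* raw=1, compress=2 */
	uint32_t creator;
	uint32_t time;
};

static uint16_t
be16(const uint8_t *p)
{
	return (uint16_t)((p[0]<<8) | p[1]);
}

static uint32_t
be32(const uint8_t *p)
{
	return ((uint32_t)p[0]<<24) | ((uint32_t)p[1]<<16)
		| ((uint32_t)p[2]<<8) | (uint32_t)p[3];
}

/* returns 0 on success, 1 for an unused clump (magic 0), -1 for a bad magic */
int
unpackclump(const uint8_t *p, ClumpHdr *c)
{
	uint32_t magic = be32(p);

	if(magic == 0)
		return 1;
	if(magic != CLUMP_MAGIC)
		return -1;
	c->type = p[4];
	c->size = be16(p+5);
	c->uncsize = be16(p+7);
	memcpy(c->score, p+9, ScoreSize);
	c->encoding = p[29];
	c->creator = be32(p+30);
	c->time = be32(p+34);
	return 0;
}
```

the ClumpInfo record is the same type, size, uncsize and score run without
the surrounding magic, encoding, creator and time.
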
the arenas are mapped into a single address space corresponding
to the index that brings them together.  if each arena has 100M bytes
excluding the headers and there are 4 arenas, then there's 400M of
index address space between them.  index address space starts at 1M
instead of 0, so the index addresses assigned to the first arena are
1M up to 101M, then 101M to 201M, etc.

of course, the assignment of addresses has nothing to do with the index,
but that's what they're called.

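the mapping from an index address back to an arena can be sketched as a walk
over the arena data sizes.  this assumes 1M means 2^20 bytes and that the
arenas are laid end to end with no gaps, as in the example above; INDEX_BASE
and addr2arena are illustrative names.

```c
#include <stdint.h>

#define INDEX_BASE (1024ULL*1024ULL)	/* index address space starts at 1M */

/*
 * sizes[i] is the data size of arena i, excluding headers.
 * returns the arena number holding addr and stores the offset
 * within that arena's data in *off, or returns -1 if addr is unmapped.
 */
int
addr2arena(const uint64_t *sizes, int n, uint64_t addr, uint64_t *off)
{
	uint64_t base = INDEX_BASE;
	int i;

	if(addr < base)
		return -1;
	for(i = 0; i < n; i++){
		if(addr - base < sizes[i]){
			*off = addr - base;
			return i;
		}
		base += sizes[i];
	}
	return -1;
}
```

with four arenas of 100M each, address 101M maps to offset 0 of the second
arena and anything at or beyond 401M is unmapped.
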
the index is split into index sections, which are put on different disks
to get parallelism of disk heads.  each index section holds some number
of hash buckets, each in its own disk block.  collectively the index sections
hold ix->buckets between them.

the top 32 bits of the score are used to assign scores to buckets.
div = ceil(2³² / ix->buckets) is the amount of 32-bit score space per bucket.

to look up a block, take the top 32 bits of score and divide by div
to get the bucket number.  then look through the index section headers
to figure out which index section has that bucket.

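the bucket arithmetic above can be sketched as follows.  it assumes the top
32 bits are the first four bytes of the score read big-endian, and that each
section covers the half-open bucket range [start,stop).  bucketdiv,
score2bucket, SectRange and findsect are hypothetical names.

```c
#include <stdint.h>

typedef struct SectRange SectRange;	/* hypothetical: start/stop from one ISect */
struct SectRange {
	uint32_t start;
	uint32_t stop;
};

/* ceil(2^32 / buckets), done in 64 bits; buckets must be nonzero */
uint64_t
bucketdiv(uint32_t buckets)
{
	return ((1ULL<<32) + buckets - 1) / buckets;
}

/* bucket number for a 20-byte score */
uint32_t
score2bucket(const uint8_t *score, uint32_t buckets)
{
	uint32_t top = ((uint32_t)score[0]<<24) | ((uint32_t)score[1]<<16)
		| ((uint32_t)score[2]<<8) | (uint32_t)score[3];

	return (uint32_t)(top / bucketdiv(buckets));
}

/* index of the section whose [start,stop) holds bucket b, or -1 */
int
findsect(const SectRange *s, int n, uint32_t b)
{
	int i;

	for(i = 0; i < n; i++)
		if(s[i].start <= b && b < s[i].stop)
			return i;
	return -1;
}
```

because div is rounded up, the quotient always stays below ix->buckets, so
every score lands in a valid bucket.
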
then load that block from the index section.  it's an IBucket.

the IBucket has ib.n IEntry structures in it, sorted by score and then by type.
do the lookup and get an IEntry.  the ia.addr will be a logical address
that you then use to get the

ISect
	magic[4] 0xD15C5EC7
	version[4]
	name[64]
	index[64]
	blockSize[4]
	blockBase[4]	address in partition where bucket blocks start
	blocks[4]
	start[4]
	stop[4]	stop - start <= blocks, but not necessarily ==

IEntry
	score[20]
	wtime[4]
	train[2]
	ia.addr[8]		index address (see note above)
	ia.size[2]		size of uncompressed block data
	ia.type[1]
	ia.blocks[1]	number of blocks of clump on disk

IBucket
	n[2]
	next[4]	not sure; either 0 or inside [start,stop) for the ISect
	data[n*IEntrySize]

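a sketch of searching one IBucket block for a score and type.  it assumes
the IEntry fields are packed with no padding in the order listed (38 bytes
each, after a 6-byte n and next header), that entries are sorted ascending
by bytewise score and then by type, and it uses a binary search, which may
not be what the venti source does.  all names here are made up for the
example.

```c
#include <stdint.h>
#include <string.h>

enum {
	ScoreSize = 20,
	IBucketSize = 2+4,			/* n[2] + next[4] */
	IEntrySize = ScoreSize+4+2+8+2+1+1,	/* 38 bytes */
	IEntryAddrOff = ScoreSize+4+2,		/* ia.addr */
	IEntryTypeOff = ScoreSize+4+2+8+2,	/* ia.type */
};

/* return the packed entry matching score and type, or NULL if absent */
const uint8_t*
bucketlookup(const uint8_t *blk, const uint8_t *score, uint8_t type)
{
	uint32_t n = ((uint32_t)blk[0]<<8) | blk[1];
	const uint8_t *data = blk + IBucketSize;
	uint32_t lo = 0, hi = n, mid;
	const uint8_t *e;
	int c;

	while(lo < hi){
		mid = lo + (hi - lo)/2;
		e = data + (size_t)mid*IEntrySize;
		c = memcmp(score, e, ScoreSize);
		if(c == 0)
			c = (int)type - (int)e[IEntryTypeOff];
		if(c == 0)
			return e;
		if(c < 0)
			hi = mid;
		else
			lo = mid+1;
	}
	return NULL;
}

/* big-endian 8-byte index address stored in a packed entry */
uint64_t
entryaddr(const uint8_t *e)
{
	uint64_t v = 0;
	int i;

	for(i = 0; i < 8; i++)
		v = (v<<8) | e[IEntryAddrOff+i];
	return v;
}
```

the address returned by entryaddr is an index address in the sense of the
note above, so it still has to be translated to an arena and an offset.
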
final piece: all the disk partitions start with PartBlank=256kB of unused disk
(presumably to avoid problems with boot sectors and layout tables
and the like).

actually the last 8k of the 256k (that is, at offset 248kB) can hold
a venti config file to help during bootstrap of the venti file server.