1.SH 2Cache/WORM Driver 3.PP 4The cache/WORM (cw) driver is by far the 5largest and most complicated device driver in the file server. 6There are four devices involved in the cw driver. 7It implements a read/write pseudo-device (the cw-device) 8and a read-only pseudo-device (the dump device) 9by performing operations on its two constituent devices 10the read-write c-device and the write-once-read-many 11w-device. 12The block numbers on the four devices are distinct, 13although the cw addresses, 14dump addresses, 15and the w addresses are 16highly correlated. 17.PP 18The cw-driver uses the w-device as the 19stable storage of the file system at the time of the 20last dump. 21All newly written and a large number of recently used 22exact copies of blocks of the w-device are kept on the c-device. 23The c-device is much smaller than the w-device and 24so the subset of w-blocks that are kept on the c-device are 25mapped through a hash table kept on a partition of the c-device. 26.PP 27The map portion of the c-device consists of blocks of buckets of entries. 28The declarations follow. 29.P1 30 enum 31 { 32 BKPERBLK = 10, 33 CEPERBK = (BUFSIZE - BKPERBLK*sizeof(long)) / 34 (sizeof(Centry)*BKPERBLK), 35 }; 36.P2 37.P1 38 typedef 39 struct 40 { 41 ushort age; 42 short state; 43 long waddr; 44 } Centry; 45.P2 46.P1 47 typedef 48 struct 49 { 50 long agegen; 51 Centry entry[CEPERBK]; 52 } Bucket; 53.P2 54.P1 55 Bucket bucket[BKPERBLK]; 56.P2 57There is exactly one entry structure for each block in the 58data partition of the c-device. 59A bucket contains all of the w-addresses that have 60the same hash code. 61There are as many buckets as will fit 62in a block and enough blocks to have the required 63number of entries. 64The entries in the bucket are maintained 65in FIFO order with an age variable and an incrementing age generator. 66When the age generator is about to overflow, 67all of the ages in the bucket are rescaled 68from zero. 69.PP 70The following steps go into converting a w-address into a c-address. 71The bucket is found by 72.P1 73 bucket_number = w-address % total_buckets 74 getbuf(c-device, bucket_offset + bucket_number/BKPERBLK); 75.P2 76After the desired bucket is found, 77the desired entry is found by a linear search within the bucket for the 78entry with the desired 79.CW waddr . 80.PP 81The state variable in the entry is 82one of the following. 83.P1 84 enum 85 { 86 Cnone = 0, 87 Cdirty, 88 Cdump, 89 Cread, 90 Cwrite, 91 Cdump1, 92 }; 93.P2 94Every w-address has a state. 95Blocks that are not in the 96c-device have the implied 97state 98.CW Cnone . 99The 100.CW Cread 101state is for blocks that have the 102same data as the corresponding block in 103the w-device. 104Since the c-device is much faster than the 105w-device, 106.CW Cread 107blocks are kept as long as possible and 108used in preference to reading the w-device. 109.CW Cread 110blocks may be discarded from the c-device 111when the space is needed for newer data. 112The 113.CW Cwrite 114state is when the c-device contains newer data 115than the corresponding block on the w-device. 116This happens when a 117.CW Cnone , 118.CW Cread , 119or 120.CW Cwrite 121block is written. 122The 123.CW Cdirty 124state 125is when the c-device contains 126new data and the corresponding block 127on the w-device has never been written. 128This happens when a new block has been 129allocated from the free space on the w-device. 130.PP 131The 132.CW Cwrite 133and 134.CW Cdirty 135blocks are created and never removed. 136Unless something is done to 137convert these blocks, 138the c-device will gradually 139fill up and stop functioning. 140Once a day, 141or by command, 142a 143.I dump 144of the cw-device 145is taken. 146The purpose of 147a dump is to queue the writes that 148have been shunted to the c-device 149to be written to the w-device. 150Since the w-device is a WORM, 151blocks cannot be rewritten. 152Blocks that have already been written to the WORM must be 153relocated to the unused portion of the w-device. 154These are precisely the 155blocks with 156.CW Cwrite 157state. 158.PP 159The dump algorithm is as follows: 160a) The tree on the cw-device is walked 161as long as the blocks visited have been 162modified since the last dump. 163These are the blocks with state 164.CW Cwrite 165and 166.CW Cdirty . 167It is possible to restrict the search 168to within these blocks 169since the directory containing a modified 170file must have been accessed to modify the 171file and accessing a directory will set its 172modified time thus causing the block containing it 173to be written. 174The directory containing that directory must be 175modified for the same reason. 176The tree walk is thus drastically restrained and the 177tree walk does not take much time. 178b) All 179.CW Cwrite 180blocks found in the tree search 181are relocated to new blank blocks on the w-device 182and converted to 183.CW Cdump 184state. 185All 186.CW Cdirty 187blocks are converted to 188.CW Cdump 189state without relocation. 190At this point, 191all modified blocks in the cw-device 192have w-addresses that point to unwritten 193WORM blocks. 194These blocks are marked for later 195writing to the w-device 196with the state 197.CW Cdump . 198c) All open files that were pointing to modified 199blocks are reopened to point at the corresponding 200reallocated blocks. 201This causes the directories leading to the 202open files to be modified. 203Thus the invariant discussed in a) is maintained. 204d) The background dumping process will slowly 205go through the map of the c-device and write out 206all blocks with 207.CW Cdump 208state. 209.PP 210The dump takes a few minutes to walk the tree 211and mark the blocks. 212It then takes hours to write the marked blocks 213to the WORM. 214If a marked block is rewritten before the old 215copy has been written to the WORM, 216it must be forced to the WORM before it is rewritten. 217There is no problem if another dump is taken before the first one 218is finished. 219The newly marked blocks are just added to the marked blocks 220left from the first dump. 221.PP 222If there is an error writing a marked block 223to the WORM 224then the 225.CW dump 226state is converted to 227.CW Cdump1 228and manual intervention is needed. 229(See the 230.CW cwcmd 231.CW mvstate 232command in 233.I fs (8)). 234These blocks can be disposed of by converting 235their state back to 236.CW Cdump 237so that they will be written again. 238They can also be converted to 239.CW Cwrite 240state so that they will be allocated new 241addresses at the next dump. 242In most other respects, 243a 244.CW Cdump1 245block behaves like a 246.CW Cwrite 247block. 248