1.SH 2Cache/WORM Driver 3.PP 4The cache/WORM (cw) driver is by far the 5largest and most complicated device driver in the file server. 6There are four devices involved in the cw driver. 7It implements a read/write pseudo-device (the cw-device) 8and a read-only pseudo-device (the dump device) 9by performing operations on its two constituent devices 10the read-write c-device and the write-once-read-many 11w-device. 12The block numbers on the four devices are distinct, 13although the 14.I cw 15addresses, 16dump addresses, 17and the 18.I w 19addresses are 20highly correlated. 21.PP 22The cw-driver uses the w-device as the 23stable storage of the file system at the time of the 24last dump. 25All newly written and a large number of recently used 26exact copies of blocks of the w-device are kept on the c-device. 27The c-device is much smaller than the w-device and 28so the subset of w-blocks that are kept on the c-device are 29mapped through a hash table kept on a partition of the c-device. 30.PP 31The map portion of the c-device consists of blocks of buckets of entries. 32The declarations follow. 33.Ex 34 enum 35 { 36 BKPERBLK = 10, 37 CEPERBK = (BUFSIZE - BKPERBLK*sizeof(Off)) / 38 (sizeof(Centry)*BKPERBLK), 39 }; 40.Ee 41.Ex 42 typedef 43 struct 44 { 45 ushort age; 46 short state; 47 Off waddr; 48 } Centry; 49.Ee 50.Ex 51 typedef 52 struct 53 { 54 long agegen; 55 Centry entry[CEPERBK]; 56 } Bucket; 57.Ee 58.Ex 59 Bucket bucket[BKPERBLK]; 60.Ee 61There is exactly one entry structure for each block in the 62data partition of the c-device. 63A bucket contains all of the w-addresses that have 64the same hash code. 65There are as many buckets as will fit 66in a block and enough blocks to have the required 67number of entries. 68The entries in the bucket are maintained 69in FIFO order with an age variable and an incrementing age generator. 70When the age generator is about to overflow, 71all of the ages in the bucket are rescaled 72from zero. 73.PP 74The following steps go into converting a w-address into a c-address. 75The bucket is found by 76.Ex 77 bucket_number = w-address % total_buckets; 78 getbuf(c-device, bucket_offset + bucket_number/BKPERBLK); 79.Ee 80After the desired bucket is found, 81the desired entry is found by a linear search within the bucket for the 82entry with the desired 83.CW waddr . 84.PP 85The state variable in the entry is 86one of the following. 87.Ex 88 enum 89 { 90 Cnone = 0, 91 Cdirty, 92 Cdump, 93 Cread, 94 Cwrite, 95 Cdump1, 96 }; 97.Ee 98Every w-address has a state. 99Blocks that are not in the 100c-device have the implied 101state 102.CW Cnone . 103The 104.CW Cread 105state is for blocks that have the 106same data as the corresponding block in 107the w-device. 108Since the c-device is much faster than the 109w-device, 110.CW Cread 111blocks are kept as long as possible and 112used in preference to reading the w-device. 113.CW Cread 114blocks may be discarded from the c-device 115when the space is needed for newer data. 116The 117.CW Cwrite 118state is when the c-device contains newer data 119than the corresponding block on the w-device. 120This happens when a 121.CW Cnone , 122.CW Cread , 123or 124.CW Cwrite 125block is written. 126The 127.CW Cdirty 128state 129is when the c-device contains 130new data and the corresponding block 131on the w-device has never been written. 132This happens when a new block has been 133allocated from the free space on the w-device. 134.PP 135The 136.CW Cwrite 137and 138.CW Cdirty 139blocks are created and never removed. 140Unless something is done to 141convert these blocks, 142the c-device will gradually 143fill up and stop functioning. 144Once a day, 145or by command, 146a 147.I dump 148of the cw-device 149is taken. 150The purpose of 151a dump is to queue the writes that 152have been shunted to the c-device 153to be written to the w-device. 154Since the w-device is a WORM, 155blocks cannot be rewritten. 156Blocks that have already been written to the WORM must be 157relocated to the unused portion of the w-device. 158These are precisely the 159blocks with 160.CW Cwrite 161state. 162.PP 163The dump algorithm is as follows: 164.IP a) 165The tree on the cw-device is walked 166as long as the blocks visited have been 167modified since the last dump. 168These are the blocks with state 169.CW Cwrite 170and 171.CW Cdirty . 172It is possible to restrict the search 173to within these blocks 174since the directory containing a modified 175file must have been accessed to modify the 176file and accessing a directory will set its 177modified time thus causing the block containing it 178to be written. 179The directory containing that directory must be 180modified for the same reason. 181The tree walk is thus drastically restrained and the 182tree walk does not take much time. 183.IP b) 184All 185.CW Cwrite 186blocks found in the tree search 187are relocated to new blank blocks on the w-device 188and converted to 189.CW Cdump 190state. 191All 192.CW Cdirty 193blocks are converted to 194.CW Cdump 195state without relocation. 196At this point, 197all modified blocks in the cw-device 198have w-addresses that point to unwritten 199WORM blocks. 200These blocks are marked for later 201writing to the w-device 202with the state 203.CW Cdump . 204.IP c) 205All open files that were pointing to modified 206blocks are reopened to point at the corresponding 207reallocated blocks. 208This causes the directories leading to the 209open files to be modified. 210Thus the invariant discussed in a) is maintained. 211.IP d) 212The background dumping process will slowly 213go through the map of the c-device and write out 214all blocks with 215.CW Cdump 216state. 217.PP 218The dump takes a few minutes to walk the tree 219and mark the blocks. 220It can take hours to write the marked blocks 221to the WORM. 222If a marked block is rewritten before the old 223copy has been written to the WORM, 224it must be forced to the WORM before it is rewritten. 225There is no problem if another dump is taken before the first one 226is finished. 227The newly marked blocks are just added to the marked blocks 228left from the first dump. 229.PP 230If there is an error writing a marked block 231to the WORM 232then the 233.CW dump 234state is converted to 235.CW Cdump1 236and manual intervention is needed. 237(See the 238.CW cwcmd 239.CW mvstate 240command in 241.I fs (8)). 242These blocks can be disposed of by converting 243their state back to 244.CW Cdump 245so that they will be written again. 246They can also be converted to 247.CW Cwrite 248state so that they will be allocated new 249addresses at the next dump. 250In most other respects, 251a 252.CW Cdump1 253block behaves like a 254.CW Cwrite 255block. 256