.SH
Block Devices
.PP
The block device I/O system is like a
protocol stack of filters.
There is a set of pseudo-devices that call
recursively to other pseudo-devices and real devices.
The protocol stack is compiled from a configuration
string that specifies the order of pseudo-devices and devices.
Each pseudo-device and device has a set of entry points
that correspond to the operations that the file system
requires of a device.
The most notable operations are
.CW read ,
.CW write ,
and
.CW size .
.PP
The device stack is best described through
the syntax of the configuration string
that specifies the stack.
Configuration strings are used
during the setup of the file system.
For a description see
.I fsconfig (8).
In the following recursive definition,
.I D
represents a
string that specifies a block device.
.IP "\fID\fP = (\fIDD\fP...)"
.br
This is a set of devices that
are concatenated to form a single device.
The size of the concatenated device is the
sum of the sizes of the sub-devices.
.IP "\fID\fP = [\fIDD\fP...]"
.br
This is the interleaving of the
individual devices.
If there are N devices in the list,
then the pseudo-device is the N-way block
interleaving of the sub-devices.
The size of the interleaved device is
N times the size of the smallest sub-device.
.IP "\fID\fP = {\fIDD\fP...}"
.br
This is a set of devices that
constitute a `mirror' of the first sub-device, and form a single device.
A write to the device is performed,
at the same block address,
on the sub-devices, in right-to-left order.
A read from the device is performed on each sub-device,
in left-to-right order, until a read succeeds without error,
or the set is exhausted.
One can think of this as a poor man's RAID 1.
The size of the device is the size of the smallest sub-device.
.IP "\fID\fP = \f(CWp\fP\fIDN1.N2\fP"
.br
This is a partition of a sub-device.
The sub-device is partitioned into 100 equal pieces.
If the size of the sub-device is not divisible by 100,
then there will be some slop thrown away at the top.
The pseudo-device starts at the N1-th piece and
continues for N2 pieces. Thus
.CW p\fID\fP67.33
will be the
last third of the device
.I D .
.IP "\fID\fP = \f(CWf\fP\fID\fP"
.br
This is a fake write-once-read-many device simulated by a
second read-write device.
This second device is partitioned
into a set of block flags and a set of blocks.
The flags are used to generate errors if a
block is ever written twice or read without being written first.
.IP "\fID\fP = \f(CWx\fP\fID\fP"
.br
This is a byte-swapped version of the file system on D.
Since the file server currently writes integers in metadata to disk
in native byte order, moving a file system to a machine of the other
major byte order (e.g., MIPS to Pentium)
requires the use of
.CW x .
The
.CW x
device knows the sizes of the various integer fields in the file system metadata.
Ideally, the file server would follow the Plan 9 religion and write a consistent
byte order on disk, regardless of processor.
In the meantime, it should be possible to automatically determine the need
for byte-swapping by examining data in the super-block of each file system,
though this has not been implemented yet.
.IP "\fID\fP = \f(CWc\fP\fIDD\fP"
.br
This is the cache/WORM device made up of a cache (read-write)
device and a WORM (write-once-read-many) device.
More on this later.
.IP "\fID\fP = \f(CWo\fP"
.br
This is the dump file system that is the
two-level hierarchy of all dumps ever taken on a cache/WORM.
The read-only root of the cache/WORM file system
(as of the dump taken Feb 18, 1995) can
be referenced as
.CW /1995/0218
in this pseudo-device.
The second dump taken that day will be
.CW /1995/02181 .
.IP "\fID\fP = \f(CWw\fP\fIN1.N2.N3\fP"
.br
This is a SCSI disk on controller N1, target N2 and logical unit number N3.
.IP "\fID\fP = \f(CWh\fP\fIN1.N2.0\fP"
.br
This is an (E)IDE or *ATA disk on controller N1, target N2
(target 0 is the IDE master, 1 the slave device).
These disks are currently run via programmed I/O, not DMA,
so they tend to be slower to access than SCSI disks.
.IP "\fID\fP = \f(CWr\fP\fIN1\fP"
.br
This is the same as
.CW w ,
but refers to a side of a WORM disc.
See the
.I j
device.
.IP "\fID\fP = \f(CWl\fP\fIN1\fP"
.br
This is the same as
.CW r ,
but one block from the SCSI disk is removed for labeling.
.IP "\fID\fP = \f(CWj(\fP\fID\d\s-2\&1\s+2\u\fID\d\s-2\&2\s+2\u\f(CW*)\fID\d\s-2\&3\s+2\u\f1"
.br
.I D\d\s-2\&1\s+2\u
is the juke box SCSI interface.
The
.I D\d\s-2\&2\s+2\u 's
are the SCSI drives in the juke box
and the
.I D\d\s-2\&3\s+2\u 's
are the demountable platters in the juke box.
.I D\d\s-2\&1\s+2\u
and
.I D\d\s-2\&2\s+2\u
must be
.CW w .
.I D\d\s-2\&3\s+2\u
must be pseudo-devices of
.CW w ,
.CW r ,
or
.CW l
devices.
.PP
For
.CW w ,
.CW h ,
.CW l ,
and
.CW r
devices any of the configuration numbers
can be replaced by an iterator of the form
.CW <\fIN1-N2\fP> .
N1 can be greater than N2, indicating a descending sequence.
Thus
.Ex
	[w0.<2-6>]
.Ee
is the interleaving of the SCSI disks on SCSI targets
2 through 6 of SCSI controller 0.
The main file system on
Emelie
is defined by the configuration string
.Ex
	c[w1.<0-5>.0]j(w6w5w4w3w2)(l<0-236>l<238-474>)
.Ee
This is a cache/WORM driver.
The cache is six interleaved disks on SCSI controller 1
targets 0, 1, 2, 3, 4, and 5.
The WORM half of the cache/WORM
is 474 jukebox disks.
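.PP
The iterator notation can be illustrated with a short Python sketch.
It is not the file server's parser; the function name
.CW expand
is hypothetical, and the sketch assumes that an iterator is expanded
purely textually, one device string per number in the range,
counting down when N1 is greater than N2.

```python
import re

# Matches one iterator of the form <N1-N2>.
ITERATOR = re.compile(r"<([0-9]+)-([0-9]+)>")

def expand(spec):
    """Expand every <N1-N2> iterator in a device string (assumed textual expansion)."""
    m = ITERATOR.search(spec)
    if m is None:
        return [spec]
    first, last = int(m.group(1)), int(m.group(2))
    step = 1 if first <= last else -1   # N1 > N2 means a descending sequence
    devices = []
    for n in range(first, last + step, step):
        # Substitute the number, then expand any remaining iterators.
        devices.extend(expand(spec[:m.start()] + str(n) + spec[m.end():]))
    return devices

if __name__ == "__main__":
    print(expand("w0.<2-6>"))
    print(len(expand("w1.<0-5>.0")))
    print(len(expand("l<0-236>") + expand("l<238-474>")))
```

Counting the expanded strings in this way is a quick check on how many
devices a configuration string actually names.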
Another file server,
.I choline ,
has a main file system defined by
.Ex
	c[w<1-3>]j(w1.<6-0>.0)(l<0-124>l<128-252>)
.Ee
The order of
.CW w1.<6-0>.0
matters here, since the optical jukebox's WORM drives'
SCSI target ids,
as delivered,
run in descending order relative to the numbers of the drives
in SCSI commands
(e.g., the jukebox controller is SCSI target 6,
drive #1 is SCSI target 5,
and drive #6 is SCSI target 0).
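.PP
The size rules given earlier for the concatenating, interleaving,
mirroring and partitioning pseudo-devices can be summarized in a small
Python sketch.
The function names are hypothetical and sizes are taken to be in blocks;
the sketch also assumes that partition pieces are numbered from 0,
which is what makes
.CW p\fID\fP67.33
end exactly at the top of the device.

```python
def concat_size(sizes):
    """(DD...): the sum of the sub-device sizes."""
    return sum(sizes)

def interleave_size(sizes):
    """[DD...]: N times the size of the smallest sub-device."""
    return len(sizes) * min(sizes)

def mirror_size(sizes):
    """{DD...}: the size of the smallest sub-device."""
    return min(sizes)

def partition_extent(size, n1, n2):
    """pDN1.N2: (start, length) in blocks.

    The sub-device is cut into 100 equal pieces; any remainder
    at the top is discarded.  Pieces are assumed to count from 0.
    """
    piece = size // 100
    return n1 * piece, n2 * piece

if __name__ == "__main__":
    sizes = [1000, 1200, 1500]
    print(concat_size(sizes), interleave_size(sizes), mirror_size(sizes))
    print(partition_extent(1050, 67, 33))
```

With a sub-device of 1050 blocks each piece is 10 blocks, so the top
50 blocks are the slop that no partition can reach.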