.\"	$NetBSD: disk.9,v 1.24 2005/12/26 19:48:12 perry Exp $
.\"
.\" Copyright (c) 1995, 1996 Jason R. Thorpe.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed for the NetBSD Project
.\"	by Jason R. Thorpe.
.\" 4. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd August 14, 2005
.Dt DISK 9
.Os
.Sh NAME
.Nm disk ,
.Nm disk_attach ,
.Nm disk_detach ,
.Nm disk_busy ,
.Nm disk_unbusy ,
.Nm disk_find ,
.Nm disk_resetstat
.Nd generic disk framework
.Sh SYNOPSIS
.In sys/types.h
.In sys/disklabel.h
.In sys/disk.h
.Ft void
.Fn disk_attach "struct disk *"
.Ft void
.Fn disk_detach "struct disk *"
.Ft void
.Fn disk_busy "struct disk *"
.Ft void
.Fn disk_unbusy "struct disk *" "long bcount" "int read"
.Ft void
.Fn disk_resetstat "struct disk *"
.Ft struct disk *
.Fn disk_find "char *"
.Sh DESCRIPTION
The
.Nx
generic disk framework is designed to provide flexible,
scalable, and consistent handling of disk state and metrics information.
The fundamental component of this framework is the
.Nm disk
structure, which is defined as follows:
.Bd -literal
struct disk {
	TAILQ_ENTRY(disk) dk_link;	/* link in global disklist */
	char		*dk_name;	/* disk name */
	int		dk_bopenmask;	/* block devices open */
	int		dk_copenmask;	/* character devices open */
	int		dk_openmask;	/* composite (bopen|copen) */
	int		dk_state;	/* label state */
	int		dk_blkshift;	/* shift to convert DEV_BSIZE to blks */
	int		dk_byteshift;	/* shift to convert bytes to blks */

	/*
	 * Metrics data; note that some metrics may have no meaning
	 * on certain types of disks.
	 */
	int		dk_busy;	/* busy counter */
	uint64_t	dk_rxfer;	/* total number of read transfers */
	uint64_t	dk_wxfer;	/* total number of write transfers */
	uint64_t	dk_seek;	/* total independent seek operations */
	uint64_t	dk_rbytes;	/* total bytes read */
	uint64_t	dk_wbytes;	/* total bytes written */
	struct timeval	dk_attachtime;	/* time disk was attached */
	struct timeval	dk_timestamp;	/* timestamp of last unbusy */
	struct timeval	dk_time;	/* total time spent busy */

	struct dkdriver	*dk_driver;	/* pointer to driver */

	/*
	 * Disk label information.
	 * Storage for the in-core disk label
	 * must be dynamically allocated, otherwise the size of this
	 * structure becomes machine-dependent.
	 */
	daddr_t		dk_labelsector;	/* sector containing label */
	struct disklabel *dk_label;	/* label */
	struct cpu_disklabel *dk_cpulabel;
};
.Ed
.Pp
The system maintains a global linked list of all disks attached to the
system.
This list, called
.Nm disklist ,
may grow or shrink over time as disks are dynamically added and removed
from the system.
Drivers which currently make use of the detachment
capability of the framework are the
.Nm ccd
and
.Nm vnd
pseudo-device drivers.
.Pp
The following is a brief description of each function in the framework:
.Bl -tag -width "disk_resetstat()"
.It Fn disk_attach
Attach a disk; allocate storage for the disklabel, set the
.Dq attached time
timestamp, insert the disk into the disklist, and increment the
system disk count.
.It Fn disk_detach
Detach a disk; free storage for the disklabel, remove the disk
from the disklist, and decrement the system disk count.
If the count drops below zero, panic.
.It Fn disk_busy
Increment the disk's
.Dq busy counter .
If this counter goes from 0 to 1, set the timestamp corresponding to
this transfer.
.It Fn disk_unbusy
Decrement a disk's busy counter.
If the count drops below zero, panic.
Get the current time, subtract the disk's timestamp from it, and add
the difference to the disk's running total of busy time.
Set the disk's timestamp to the current time.
If the provided byte count is greater than 0, add it to the disk's
running total and increment the number of transfers performed by the disk.
The third argument
.Ar read
specifies the direction of I/O;
if non-zero it means reading from the disk,
otherwise it means writing to the disk.
.It Fn disk_resetstat
Reset the running byte, transfer, and time totals.
.It Fn disk_find
Return a pointer to the disk structure corresponding to the name provided,
or NULL if the disk does not exist.
.El
.Pp
The functions typically called by device drivers are
.Fn disk_attach ,
.Fn disk_detach ,
.Fn disk_busy ,
.Fn disk_unbusy ,
and
.Fn disk_resetstat .
The function
.Fn disk_find
is provided as a utility function.
.Sh USING THE FRAMEWORK
This section describes basic use of the framework
and gives example usage of its functions.
Actual implementation of a device driver which uses the framework
may vary.
.Pp
Each device in the system uses a
.Dq softc
structure which contains autoconfiguration and state information for that
device.
In the case of disks, the softc should also contain one instance
of the disk structure, e.g.:
.Bd -literal
struct foo_softc {
	struct	device sc_dev;		/* generic device information */
	struct	disk sc_dk;		/* generic disk information */
	[ . . . more . . . ]
};
.Ed
.Pp
In order for the system to gather metrics data about a disk, the disk must
be registered with the system.
The
.Fn disk_attach
routine performs all of the functions currently required to register a disk
with the system including allocation of disklabel storage space,
recording of the time since boot that the disk was attached, and insertion
into the disklist.
Note that since this function allocates storage space for the disklabel,
it must be called before the disklabel is read from the media or used in
any other way.
Before
.Fn disk_attach
is called, portions of the disk structure must be initialized with
data specific to that disk.
For example, in the
.Dq foo
disk driver, the following would be performed in the autoconfiguration
.Dq attach
routine:
.Bd -literal
void
fooattach(parent, self, aux)
	struct device *parent, *self;
	void *aux;
{
	struct foo_softc *sc = (struct foo_softc *)self;
	[ . . . ]

	/* Initialize and attach the disk structure. */
	sc-\*[Gt]sc_dk.dk_driver = \*[Am]foodkdriver;
	sc-\*[Gt]sc_dk.dk_name = sc-\*[Gt]sc_dev.dv_xname;
	disk_attach(\*[Am]sc-\*[Gt]sc_dk);

	/* Read geometry and fill in pertinent parts of disklabel. */
	[ . . . ]
}
.Ed
.Pp
The
.Nm foodkdriver
above is the disk's
.Dq driver
switch.
This switch currently includes a pointer to the disk's
.Dq strategy
routine.
This switch needs to have global scope and should be initialized as follows:
.Bd -literal
void foostrategy(struct buf *);
struct dkdriver foodkdriver = { foostrategy };
.Ed
.Pp
Once the disk is attached, metrics may be gathered on that disk.
In order to gather metrics data, the driver must tell the framework when
the disk starts and stops operations.
This functionality is provided by the
.Fn disk_busy
and
.Fn disk_unbusy
routines.
The
.Fn disk_busy
routine should be called immediately before a command to the disk is
sent, e.g.:
.Bd -literal
void
foostart(sc)
	struct foo_softc *sc;
{
	[ . . . ]

	/* Get buffer from drive's transfer queue. */
	[ . . . ]

	/* Build command to send to drive. */
	[ . . . ]

	/* Tell the disk framework we're going busy. */
	disk_busy(\*[Am]sc-\*[Gt]sc_dk);

	/* Send command to the drive. */
	[ . . . ]
}
.Ed
.Pp
When
.Fn disk_busy
is called, a timestamp is taken if the disk's busy counter moves from
0 to 1, indicating the disk has gone from an idle to non-idle state.
Note that
.Fn disk_busy
must be called at
.Fn splbio .
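.Pp
The idle-to-busy rule described above can be illustrated with a small
userland model.
This is only a sketch and not the kernel code in
.Pa sys/kern/subr_disk.c :
the names
.Vt struct model_disk
and
.Fn model_disk_busy
are hypothetical, and the caller passes in the current time so that the
behaviour is deterministic, whereas the real routine reads the system clock
and must run at
.Fn splbio .
.Bd -literal
```c
/* Userland model of the busy-counter rule; NOT the kernel implementation. */
#include <assert.h>

struct model_disk {
	int	md_busy;	/* outstanding transfers (models dk_busy) */
	long	md_timestamp;	/* time the disk left idle (models dk_timestamp) */
};

/*
 * Model of disk_busy(): bump the busy counter and record the time only
 * on the idle -> busy (0 -> 1) transition.  The "now" argument is an
 * assumption made for this example; it stands in for reading the clock.
 */
static void
model_disk_busy(struct model_disk *md, long now)
{
	if (md->md_busy++ == 0)
		md->md_timestamp = now;
}
```
.Ed
.Pp
In this model a second call made while a transfer is still outstanding
leaves the timestamp untouched, so the busy interval is measured from the
moment the disk first left the idle state.
.Pp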
At the end of a transaction, the
.Fn disk_unbusy
routine should be called.
This routine performs some consistency checks,
such as ensuring that the calls to
.Fn disk_busy
and
.Fn disk_unbusy
are balanced.
This routine also performs the actual metrics calculation.
A timestamp is taken, and the difference from the timestamp taken in
.Fn disk_busy
is added to the disk's total running time.
The disk's timestamp is then updated in case there is more than one
pending transfer on the disk.
The byte count is also added to the disk's running total and, if it is
greater than zero, the number of transfers the disk has performed is
incremented.
The third argument
.Ar read
specifies the direction of I/O;
if non-zero it means reading from the disk,
otherwise it means writing to the disk.
.Bd -literal
void
foodone(xfer)
	struct foo_xfer *xfer;
{
	struct foo_softc *sc = (struct foo_softc *)xfer-\*[Gt]xf_softc;
	struct buf *bp = xfer-\*[Gt]xf_buf;
	long nbytes;
	[ . . . ]

	/*
	 * Get number of bytes transferred.  If there is no buf
	 * associated with the xfer, we are being called at the
	 * end of a non-I/O command.
	 */
	if (bp == NULL)
		nbytes = 0;
	else
		nbytes = bp-\*[Gt]b_bcount - bp-\*[Gt]b_resid;

	[ . . . ]

	/* Notify the disk framework that we've completed the transfer. */
	disk_unbusy(\*[Am]sc-\*[Gt]sc_dk, nbytes,
	    bp != NULL ? bp-\*[Gt]b_flags \*[Am] B_READ : 0);

	[ . . . ]
}
.Ed
.Pp
Like
.Fn disk_busy ,
.Fn disk_unbusy
must be called at
.Fn splbio .
.Pp
At some point a driver may wish to reset the metrics data gathered on a
particular disk.
For this purpose, the
.Fn disk_resetstat
routine is provided.
.Sh CODE REFERENCES
This section describes places within the
.Nx
source tree where actual
code implementing or using the disk framework can be found.
All pathnames are relative to
.Pa /usr/src .
.Pp
The disk framework itself is implemented within the file
.Pa sys/kern/subr_disk.c .
Data structures and function prototypes for the framework are located in
.Pa sys/sys/disk.h .
.Pp
The
.Nx
machine-independent SCSI disk and CD-ROM drivers use the
disk framework.
They are located in
.Pa sys/scsi/sd.c
and
.Pa sys/scsi/cd.c .
.Pp
The
.Nx
.Nm ccd
and
.Nm vnd
drivers use the detachment capability of the framework.
They are located in
.Pa sys/dev/ccd.c
and
.Pa sys/dev/vnd.c .
.Sh SEE ALSO
.Xr ccd 4 ,
.Xr vnd 4 ,
.Xr spl 9
.Sh HISTORY
The
.Nx
generic disk framework appeared in
.Nx 1.2 .
.Sh AUTHORS
The
.Nx
generic disk framework was architected and implemented by
.An Jason R. Thorpe
.Aq thorpej@NetBSD.org .