.\"	$NetBSD: disk.9,v 1.25 2006/11/25 12:00:25 scw Exp $
.\"
.\" Copyright (c) 1995, 1996 Jason R. Thorpe.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed for the NetBSD Project
.\"	by Jason R. Thorpe.
.\" 4. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd November 25, 2006
.Dt DISK 9
.Os
.Sh NAME
.Nm disk ,
.Nm disk_attach ,
.Nm disk_detach ,
.Nm disk_busy ,
.Nm disk_unbusy ,
.Nm disk_find ,
.Nm disk_resetstat ,
.Nm disk_blocksize
.Nd generic disk framework
.Sh SYNOPSIS
.In sys/types.h
.In sys/disklabel.h
.In sys/disk.h
.Ft void
.Fn disk_attach "struct disk *"
.Ft void
.Fn disk_detach "struct disk *"
.Ft void
.Fn disk_busy "struct disk *"
.Ft void
.Fn disk_unbusy "struct disk *" "long bcount" "int read"
.Ft void
.Fn disk_resetstat "struct disk *"
.Ft struct disk *
.Fn disk_find "char *"
.Ft void
.Fn disk_blocksize "struct disk *" "int blocksize"
.Sh DESCRIPTION
The
.Nx
generic disk framework is designed to provide flexible,
scalable, and consistent handling of disk state and metrics information.
The fundamental component of this framework is the
.Nm disk
structure, which is defined as follows:
.Bd -literal
struct disk {
	TAILQ_ENTRY(disk) dk_link;	/* link in global disklist */
	char		*dk_name;	/* disk name */
	int		dk_bopenmask;	/* block devices open */
	int		dk_copenmask;	/* character devices open */
	int		dk_openmask;	/* composite (bopen|copen) */
	int		dk_state;	/* label state */
	int		dk_blkshift;	/* shift to convert DEV_BSIZE to blks */
	int		dk_byteshift;	/* shift to convert bytes to blks */

	/*
	 * Metrics data; note that some metrics may have no meaning
	 * on certain types of disks.
	 */
	int		dk_busy;	/* busy counter */
	uint64_t	dk_rxfer;	/* total number of read transfers */
	uint64_t	dk_wxfer;	/* total number of write transfers */
	uint64_t	dk_seek;	/* total independent seek operations */
	uint64_t	dk_rbytes;	/* total bytes read */
	uint64_t	dk_wbytes;	/* total bytes written */
	struct timeval	dk_attachtime;	/* time disk was attached */
	struct timeval	dk_timestamp;	/* timestamp of last unbusy */
	struct timeval	dk_time;	/* total time spent busy */

	struct dkdriver	*dk_driver;	/* pointer to driver */

	/*
	 * Disk label information.  Storage for the in-core disk label
	 * must be dynamically allocated, otherwise the size of this
	 * structure becomes machine-dependent.
	 */
	daddr_t		dk_labelsector;	/* sector containing label */
	struct disklabel *dk_label;	/* label */
	struct cpu_disklabel *dk_cpulabel;
};
.Ed
.Pp
The system maintains a global linked-list of all disks attached to the
system.
This list, called
.Nm disklist ,
may grow or shrink over time as disks are dynamically added and removed
from the system.
Drivers which currently make use of the detachment
capability of the framework are the
.Nm ccd
and
.Nm vnd
pseudo-device drivers.
.Pp
The following is a brief description of each function in the framework:
.Bl -tag -width "disk_resetstat()"
.It Fn disk_attach
Attach a disk; allocate storage for the disklabel, set the
.Dq attached time
timestamp, insert the disk into the disklist, and increment the
system disk count.
.It Fn disk_detach
Detach a disk; free storage for the disklabel, remove the disk
from the disklist, and decrement the system disk count.
If the count drops below zero, panic.
.It Fn disk_busy
Increment the disk's
.Dq busy counter .
If this counter goes from 0 to 1, set the timestamp corresponding to
this transfer.
.It Fn disk_unbusy
Decrement a disk's busy counter.
If the count drops below zero, panic.
Get the current time, subtract the disk's timestamp from it, and add
the difference to the disk's running total of busy time.
Set the disk's timestamp to the current time.
If the provided byte count is greater than 0, add it to the disk's
running byte total and increment the number of transfers performed by
the disk.
The third argument
.Ar read
specifies the direction of I/O;
if non-zero it means reading from the disk,
otherwise it means writing to the disk.
.It Fn disk_resetstat
Reset the running byte, transfer, and time totals.
.It Fn disk_find
Return a pointer to the disk structure corresponding to the name provided,
or NULL if the disk does not exist.
.It Fn disk_blocksize
Initialize the
.Fa dk_blkshift
and
.Fa dk_byteshift
members of
.Fa struct disk
with suitable values derived from the supplied physical blocksize.
It is only necessary to call this function if the device's physical blocksize
is not DEV_BSIZE.
.El
.Pp
The functions typically called by device drivers are
.Fn disk_attach ,
.Fn disk_detach ,
.Fn disk_busy ,
.Fn disk_unbusy ,
.Fn disk_resetstat ,
and
.Fn disk_blocksize .
The function
.Fn disk_find
is provided as a utility function.
.Sh USING THE FRAMEWORK
This section describes basic use of the framework
and gives example usage of its functions.
Actual implementation of a device driver which uses the framework
may vary.
.Pp
Each device in the system uses a
.Dq softc
structure which contains autoconfiguration and state information for that
device.
In the case of disks, the softc should also contain one instance
of the disk structure, e.g.:
.Bd -literal
struct foo_softc {
	struct	device sc_dev;		/* generic device information */
	struct	disk sc_dk;		/* generic disk information */
	[ . . . more . . . ]
};
.Ed
.Pp
In order for the system to gather metrics data about a disk, the disk must
be registered with the system.
The
.Fn disk_attach
routine performs all of the functions currently required to register a disk
with the system including allocation of disklabel storage space,
recording of the time since boot that the disk was attached, and insertion
into the disklist.
Note that since this function allocates storage space for the disklabel,
it must be called before the disklabel is read from the media or used in
any other way.
Before
.Fn disk_attach
is called, portions of the disk structure must be initialized with
data specific to that disk.
For example, in the
.Dq foo
disk driver, the following would be performed in the autoconfiguration
.Dq attach
routine:
.Bd -literal
void
fooattach(parent, self, aux)
	struct device *parent, *self;
	void *aux;
{
	struct foo_softc *sc = (struct foo_softc *)self;
	[ . . . ]

	/* Initialize and attach the disk structure. */
	sc-\*[Gt]sc_dk.dk_driver = \*[Am]foodkdriver;
	sc-\*[Gt]sc_dk.dk_name = sc-\*[Gt]sc_dev.dv_xname;
	disk_attach(\*[Am]sc-\*[Gt]sc_dk);

	/* Read geometry and fill in pertinent parts of disklabel. */
	[ . . . ]
	disk_blocksize(\*[Am]sc-\*[Gt]sc_dk, bytes_per_sector);
}
.Ed
.Pp
The
.Nm foodkdriver
above is the disk's
.Dq driver
switch.
This switch currently includes a pointer to the disk's
.Dq strategy
routine.
This switch needs to have global scope and should be initialized as follows:
.Bd -literal
void foostrategy(struct buf *);
struct dkdriver foodkdriver = { foostrategy };
.Ed
.Pp
Once the disk is attached, metrics may be gathered on that disk.
In order to gather metrics data, the driver must tell the framework when
the disk starts and stops operations.
This functionality is provided by the
.Fn disk_busy
and
.Fn disk_unbusy
routines.
The
.Fn disk_busy
routine should be called immediately before a command to the disk is
sent, e.g.:
.Bd -literal
void
foostart(sc)
	struct foo_softc *sc;
{
	[ . . . ]

	/* Get buffer from drive's transfer queue. */
	[ . . . ]

	/* Build command to send to drive. */
	[ . . . ]

	/* Tell the disk framework we're going busy. */
	disk_busy(\*[Am]sc-\*[Gt]sc_dk);

	/* Send command to the drive. */
	[ . . . ]
}
.Ed
.Pp
When
.Fn disk_busy
is called, a timestamp is taken if the disk's busy counter moves from
0 to 1, indicating the disk has gone from an idle to non-idle state.
Note that
.Fn disk_busy
must be called at
.Fn splbio .
At the end of a transaction, the
.Fn disk_unbusy
routine should be called.
This routine performs some consistency checks,
such as ensuring that the calls to
.Fn disk_busy
and
.Fn disk_unbusy
are balanced.
This routine also performs the actual metrics calculation.
A timestamp is taken, and the difference from the timestamp taken in
.Fn disk_busy
is added to the disk's total running time.
The disk's timestamp is then updated in case there is more than one
pending transfer on the disk.
A byte count is also added to the disk's running total, and if greater than
zero, the number of transfers the disk has performed is incremented.
The third argument
.Ar read
specifies the direction of I/O;
if non-zero it means reading from the disk,
otherwise it means writing to the disk.
.Bd -literal
void
foodone(xfer)
	struct foo_xfer *xfer;
{
	struct foo_softc *sc = (struct foo_softc *)xfer-\*[Gt]xf_softc;
	struct buf *bp = xfer-\*[Gt]xf_buf;
	long nbytes;
	[ . . . ]

	/*
	 * Get number of bytes transferred.  If there is no buf
	 * associated with the xfer, we are being called at the
	 * end of a non-I/O command.
	 */
	if (bp == NULL)
		nbytes = 0;
	else
		nbytes = bp-\*[Gt]b_bcount - bp-\*[Gt]b_resid;

	[ . . . ]

	/* Notify the disk framework that we've completed the transfer. */
	disk_unbusy(\*[Am]sc-\*[Gt]sc_dk, nbytes,
	    bp != NULL ? bp-\*[Gt]b_flags \*[Am] B_READ : 0);

	[ . . . ]
}
.Ed
.Pp
Like
.Fn disk_busy ,
.Fn disk_unbusy
must be called at
.Fn splbio .
.Pp
At some point a driver may wish to reset the metrics data gathered on a
particular disk.
The
.Fn disk_resetstat
routine is provided for this purpose.
.Sh CODE REFERENCES
This section describes places within the
.Nx
source tree where actual
code implementing or using the disk framework can be found.
All pathnames are relative to
.Pa /usr/src .
.Pp
The disk framework itself is implemented within the file
.Pa sys/kern/subr_disk.c .
Data structures and function prototypes for the framework are located in
.Pa sys/sys/disk.h .
.Pp
The
.Nx
machine-independent SCSI disk and CD-ROM drivers use the
disk framework.
They are located in
.Pa sys/scsi/sd.c
and
.Pa sys/scsi/cd.c .
.Pp
The
.Nx
.Nm ccd
and
.Nm vnd
drivers use the detachment capability of the framework.
They are located in
.Pa sys/dev/ccd.c
and
.Pa sys/dev/vnd.c .
.Sh SEE ALSO
.Xr ccd 4 ,
.Xr vnd 4 ,
.Xr spl 9
.Sh HISTORY
The
.Nx
generic disk framework appeared in
.Nx 1.2 .
.Sh AUTHORS
The
.Nx
generic disk framework was architected and implemented by
.An Jason R. Thorpe
.Aq thorpej@NetBSD.org .