1************************************************************************** 2* The following are additional notes on the iosnoop program. 3* 4* $Id: iosnoop_notes.txt,v 1.1.1.1 2015/09/30 22:01:09 christos Exp $ 5* 6* COPYRIGHT: Copyright (c) 2007 Brendan Gregg. 7************************************************************************** 8 9 10* What does the output represent? 11 12The output is disk events - I/O operations that cause the disk to physically 13read or write data. The output is not application I/O events which may be 14absorbed by memory caches - many of which will be. The output really is 15physical disk events. 16 17iosnoop uses probes from the "io" provider - which traces the block device 18driver before disk events happen. disk events. The stack goes like this, 19 20 application 21 | 22 V 23 syscall 24 | 25 V 26 vfs 27 | 28 V 29 ufs/zfs/... 30 | 31 V 32 block device driver 33 | 34 V 35 physical device driver 36 | 37 V 38 disk 39 40Due to caching (after vfs) few events will make it to the disk for iosnoop 41to see. If you want to trace all I/O activity, try using syscall provider 42based scripts first. 43 44 45* What do the elapsed and delta times mean? 46 47Glad you asked! 48 49The times may *not* be as useful as they appear. I should also add that 50this quickly becomes a very complex topic, 51 52There are two different delta times reported. -D prints the 53elapsed time from the disk request (strategy) to the disk completion 54iodone); -o prints the time for the disk to complete that event 55since it's last event (time between iodones, or since idle->strategy). 56 57The elapsed time is equivalent to the response time from the application 58request to the application completion. The delta time resembles the 59service time for this request (resembles means it will be generally 60correct, but not 100% accurate). The service time is the the time for the 61disk to complete the request, after it has travelled through any bus or 62queue. 63 64buuuttt.... you need to think carefully about what these times mean before 65jumping to conclusions. For example, 66 67 You troubleshoot an application by running iosnoop and filtering 68 on your application's PID. You notice large times for the disk events 69 (responce, service, for this example it doesn't matter). 70 Does this mean there is a problem with that application? 71 What could be happening is that a different application is also using 72 the disks at the same time, and is causing the disk heads to seek to 73 elsewhere on the disk surface - increasing both service and response time. 74 75hmmm! so you can't just look at one application, one set of numbers, and 76understand fully what is going on. 77 78But it gets worse. Disks implement "tagged queueing", where events in the 79queue are reshuffeled to promote "elevator seeking" of the disk heads (this 80reduces head seeking). So the time for a disk event can be effected not 81just by the previous event (and previous location the heads had seeked to), 82but the surrounding events that enter the queue. 83 84So the good and the bad. The good news is that iosnoop makes it easy to 85fetch disk event data on a live system, the bad news is that understanding 86all the data is not really easy. 87 88For further information on disk measurements see, 89 90 "How do disks really work?" - Adrian Cockcroft, SunWorld Online, June 1996 91 "Sun Performance and Tuning" - Adrian Cockcroft, Richard Pettit 92 "Solaris Internals" - Richard McDougall, Jim Mauro 93 94 95 96* The output appears shuffled? 97 98Read the answer to this in ALLsnoop_notes.txt. 99 100