xref: /netbsd-src/external/cddl/dtracetoolkit/dist/Notes/iosnoop_notes.txt (revision c29d51755812ace2e87aeefdb06cb2b4dac7087a)
1**************************************************************************
2* The following are additional notes on the iosnoop program.
3*
4* $Id: iosnoop_notes.txt,v 1.1.1.1 2015/09/30 22:01:09 christos Exp $
5*
6* COPYRIGHT: Copyright (c) 2007 Brendan Gregg.
7**************************************************************************
8
9
10* What does the output represent?
11
12The output is disk events - I/O operations that cause the disk to physically
13read or write data. The output is not application I/O events which may be
14absorbed by memory caches - many of which will be. The output really is
15physical disk events.
16
17iosnoop uses probes from the "io" provider - which traces the block device
18driver before disk events happen. disk events. The stack goes like this,
19
20       application
21         |
22         V
23       syscall
24         |
25         V
26       vfs
27         |
28         V
29       ufs/zfs/...
30         |
31         V
32       block device driver
33         |
34         V
35       physical device driver
36         |
37         V
38       disk
39
40Due to caching (after vfs) few events will make it to the disk for iosnoop
41to see. If you want to trace all I/O activity, try using syscall provider
42based scripts first.
43
44
45* What do the elapsed and delta times mean?
46
47Glad you asked!
48
49The times may *not* be as useful as they appear. I should also add that
50this quickly becomes a very complex topic,
51
52There are two different delta times reported. -D prints the
53elapsed time from the disk request (strategy) to the disk completion
54iodone); -o prints the time for the disk to complete that event
55since it's last event (time between iodones, or since idle->strategy).
56
57The elapsed time is equivalent to the response time from the application
58request to the application completion. The delta time resembles the
59service time for this request (resembles means it will be generally
60correct, but not 100% accurate). The service time is the the time for the
61disk to complete the request, after it has travelled through any bus or
62queue.
63
64buuuttt.... you need to think carefully about what these times mean before
65jumping to conclusions. For example,
66
67   You troubleshoot an application by running iosnoop and filtering
68   on your application's PID. You notice large times for the disk events
69   (responce, service, for this example it doesn't matter).
70   Does this mean there is a problem with that application?
71   What could be happening is that a different application is also using
72   the disks at the same time, and is causing the disk heads to seek to
73   elsewhere on the disk surface - increasing both service and response time.
74
75hmmm! so you can't just look at one application, one set of numbers, and
76understand fully what is going on.
77
78But it gets worse. Disks implement "tagged queueing", where events in the
79queue are reshuffeled to promote "elevator seeking" of the disk heads (this
80reduces head seeking). So the time for a disk event can be effected not
81just by the previous event (and previous location the heads had seeked to),
82but the surrounding events that enter the queue.
83
84So the good and the bad. The good news is that iosnoop makes it easy to
85fetch disk event data on a live system, the bad news is that understanding
86all the data is not really easy.
87
88For further information on disk measurements see,
89
90   "How do disks really work?" - Adrian Cockcroft, SunWorld Online, June 1996
91   "Sun Performance and Tuning" - Adrian Cockcroft, Richard Pettit
92   "Solaris Internals" - Richard McDougall, Jim Mauro
93
94
95
96* The output appears shuffled?
97
98Read the answer to this in ALLsnoop_notes.txt.
99
100