xref: /netbsd-src/external/cddl/dtracetoolkit/dist/Docs/Examples/threaded_example.txt (revision c29d51755812ace2e87aeefdb06cb2b4dac7087a)
The following is a demonstration of the threaded.d script,


Here we run a test program called "cputhread" that creates 4 busy threads
that run at the same time. We first run it on a server with only 1 CPU,

   # threaded.d

   2005 Jul 26 02:56:37,

     PID: 8516     CMD: cputhread

              value  ------------- Distribution ------------- count
                  1 |                                         0
                  2 |@@@@@@@                                  17
                  3 |@@@@@@@@@@@                              28
                  4 |@@@@@@@@@@@                              27
                  5 |@@@@@@@@@@@                              28
                  6 |                                         0
   [...]

In the above output, we can see that cputhread has four busy threads with
thread IDs 2, 3, 4 and 5. We are sampling at 100 Hertz, and have caught
each of these threads on the CPU between 17 and 28 times.
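
The same reading can be done mechanically. The following is a minimal
Python sketch, not part of threaded.d, that pulls the thread IDs and counts
out of the histogram rows shown above. The ROW pattern and the
parse_counts() helper are hypothetical names introduced for illustration,
and the pattern assumes rows of the form "value |bar count".

```python
import re

# Histogram rows copied from the single-CPU cputhread output above.
OUTPUT = """\
                  1 |                                         0
                  2 |@@@@@@@                                  17
                  3 |@@@@@@@@@@@                              28
                  4 |@@@@@@@@@@@                              27
                  5 |@@@@@@@@@@@                              28
                  6 |                                         0
"""

# Hypothetical pattern: "<thread id> |<bar of @ characters> <count>".
ROW = re.compile(r"^\s*(\d+)\s*\|@*\s*(\d+)\s*$")


def parse_counts(text):
    """Return {thread_id: samples} for every histogram row found in text."""
    counts = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            counts[int(m.group(1))] = int(m.group(2))
    return counts


counts = parse_counts(OUTPUT)
# Keep only the threads that were caught on CPU at least once.
busy = {tid: n for tid, n in counts.items() if n > 0}
print("busy thread IDs:", sorted(busy))
print("fewest/most samples:", min(busy.values()), max(busy.values()))
```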

Since the above counts add to 100, this is either a sign of a single CPU
server (which it is), or a sign that a multithreaded application may be
running "serialised" - only 1 thread at a time. Compare the above output
to a multi CPU server,



Here we run the same test program on a server with 4 CPUs,

   # threaded.d

   2005 Jul 26 02:48:44,

     PID: 5218     CMD: cputhread

              value  ------------- Distribution ------------- count
                  1 |                                         0
                  2 |@@@@@@@@@@@                              80
                  3 |@@@@@@@@@@                               72
                  4 |@@@@@@@@@                                64
                  5 |@@@@@@@@@@@                              78
                  6 |                                         0
   [...]

This time the counts add up to 294, so this program is definitely
running on multiple CPUs at the same time; otherwise this total would
not exceed our sample rate of 100. The distribution of threads on CPU
is fairly even, and the above serves as an example of a multithreaded
application performing well.
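
The arithmetic behind this comparison can be sketched in a few lines of
Python. This is an illustration only, assuming each output block covers one
second of sampling at 100 Hertz; the avg_threads_on_cpu() helper is a
hypothetical name, and the counts are copied from the two cputhread outputs
above.

```python
SAMPLE_HZ = 100  # sample rate stated in the text

# Per-thread counts (thread ID -> samples) from the two cputhread runs.
one_cpu = {2: 17, 3: 28, 4: 27, 5: 28}
four_cpu = {2: 80, 3: 72, 4: 64, 5: 78}


def avg_threads_on_cpu(counts, sample_hz=SAMPLE_HZ):
    """Average number of threads caught on CPU per sample tick.

    Assumes the counts cover exactly one second of sampling.
    """
    return sum(counts.values()) / sample_hz


for name, counts in (("1 CPU", one_cpu), ("4 CPUs", four_cpu)):
    total = sum(counts.values())
    avg = avg_threads_on_cpu(counts)
    print(f"{name}: total={total} average threads on CPU={avg:.2f}")
```

An average near 1 means roughly one thread on CPU at any instant, whatever
the reason; an average well above 1 can only happen when threads really
run at the same time.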



Now we run a test program called "cpuserial", which also creates 4 busy
threads; however, due to a coding problem (poor use of mutex locks), they
only run one at a time,

   # threaded.d

   2005 Jul 26 03:07:21,

     PID: 5238     CMD: cpuserial

              value  ------------- Distribution ------------- count
                  2 |                                         0
                  3 |@@@@@@@@@@@@                             30
                  4 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             70
                  5 |                                         0
   [...]

The above looks like we are back on a single CPU server with 100 samples
in total; however, we are still on our 4 CPU server. Only two threads have
run, and the above distribution is a good indication that they have
run serialised.



Now for more of a fringe case. This version of cpuserial again creates 4
threads that are all busy and hungry for the CPU, and again we run it on a
4 CPU server,

   # threaded.d

   2005 Jul 26 03:25:45,

     PID: 5280     CMD: cpuserial

              value  ------------- Distribution ------------- count
                  1 |                                         0
                  2 |@@@@@@@@@@@@@@@                          42
                  3 |@@@@@@@@@@@@@@@@@@                       50
                  4 |@@@@@@                                   15
                  5 |@                                        2
                  6 |                                         0
   [...]

So all threads are running, good. And with a total of 109, at some point
more than one thread was running at the same time (otherwise this would
not have exceeded 100, bearing in mind a sample rate of 100 Hertz). However,
what is not so good is that with 4 CPUs we have only scored 109 samples -
since all threads are CPU hungry we'd hope that more often they could
run across the CPUs simultaneously; however this wasn't the case. Again,
this fault was created by poor use of mutex locks.
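
The shortfall can be put into numbers with a small Python sketch. It is an
illustration only, and it assumes that each output covers one second and
that the sampler fires on every CPU, which would give a ceiling of
4 x 100 = 400 samples on the 4 CPU server. That ceiling, the summarise()
helper and the run labels are assumptions made for this example; the counts
are copied from the three 4 CPU outputs above.

```python
SAMPLE_HZ = 100  # sample rate stated in the text
NCPUS = 4        # the server used for these three runs

# Per-thread counts (thread ID -> samples) from the 4 CPU outputs.
runs = {
    "cputhread": {2: 80, 3: 72, 4: 64, 5: 78},
    "cpuserial (serialised)": {3: 30, 4: 70},
    "cpuserial (fringe case)": {2: 42, 3: 50, 4: 15, 5: 2},
}


def summarise(counts, ncpus=NCPUS, sample_hz=SAMPLE_HZ):
    """Return (total samples, fraction of the assumed ncpus * sample_hz ceiling)."""
    total = sum(counts.values())
    return total, total / (ncpus * sample_hz)


for name, counts in runs.items():
    total, frac = summarise(counts)
    # A total at or below the sample rate is consistent with one thread
    # running at a time; a higher total needs some real overlap.
    hint = "possibly serialised" if total <= SAMPLE_HZ else "some parallelism"
    print(f"{name}: total={total} ({frac:.2%} of ceiling), {hint}")
```

Under these assumptions the fringe case shows some overlap but stays far
below what four CPU hungry threads could reach, which matches the reading
given above.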