xref: /spdk/doc/tracing.md (revision 2f2acf4eb25cee406c156120cee22721275ca7fd)
1*2f2acf4eSJim Harris# Tracing Framework {#tracepoints}
2*2f2acf4eSJim Harris
3*2f2acf4eSJim Harris## Introduction {#tracepoints_intro}
4*2f2acf4eSJim Harris
5*2f2acf4eSJim HarrisSPDK has a tracing framework for capturing low-level event information at runtime.
6*2f2acf4eSJim HarrisTracepoints provide a high-performance tracing mechanism that is accessible at runtime.
7*2f2acf4eSJim HarrisThey are implemented as a circular buffer in shared memory that is accessible from other
8*2f2acf4eSJim Harrisprocesses. SPDK libraries and modules are instrumented with tracepoints to enable analysis of
9*2f2acf4eSJim Harrisboth performance and application crashes.
10*2f2acf4eSJim Harris
11*2f2acf4eSJim Harris## Enabling Tracepoints {#enable_tracepoints}
12*2f2acf4eSJim Harris
13*2f2acf4eSJim HarrisTracepoints are placed in groups. They are enabled and disabled as a group or individually
14*2f2acf4eSJim Harrisinside a group.
15*2f2acf4eSJim Harris
16*2f2acf4eSJim Harris### Enabling Tracepoints in Groups
17*2f2acf4eSJim Harris
18*2f2acf4eSJim HarrisTo enable the instrumentation of all the tracepoints groups in an SPDK target
19*2f2acf4eSJim Harrisapplication, start the target with `-e` parameter set to `all`:
20*2f2acf4eSJim Harris
21*2f2acf4eSJim Harris~~~bash
22*2f2acf4eSJim Harrisbuild/bin/nvmf_tgt -e all
23*2f2acf4eSJim Harris~~~
24*2f2acf4eSJim Harris
25*2f2acf4eSJim HarrisTo enable the instrumentation of just the `NVMe-oF RDMA` tracepoints in an SPDK target
26*2f2acf4eSJim Harrisapplication, start the target with the `-e` parameter set to `nvmf_rdma`:
27*2f2acf4eSJim Harris
28*2f2acf4eSJim Harris~~~bash
29*2f2acf4eSJim Harrisbuild/bin/nvmf_tgt -e nvmf_rdma
30*2f2acf4eSJim Harris~~~
31*2f2acf4eSJim Harris
32*2f2acf4eSJim Harris### Enabling Individual Tracepoints
33*2f2acf4eSJim Harris
34*2f2acf4eSJim HarrisTo enable individual tracepoints inside a group:
35*2f2acf4eSJim Harris
36*2f2acf4eSJim Harris~~~bash
37*2f2acf4eSJim Harrisbuild/bin/nvmf_tgt -e nvmf_rdma:B
38*2f2acf4eSJim Harris~~~
39*2f2acf4eSJim Harris
40*2f2acf4eSJim Harriswhere `:` is a separator and `B` is the tracepoint mask. This will enable only the first, second and fourth (binary: 1011) tracepoint inside `NVMe-oF RDMA` group.
41*2f2acf4eSJim Harris
42*2f2acf4eSJim Harris### Combining Tracepoint Masks
43*2f2acf4eSJim Harris
44*2f2acf4eSJim HarrisIt is also possible to combine enabling whole groups of tpoints and individual ones:
45*2f2acf4eSJim Harris
46*2f2acf4eSJim Harris~~~bash
47*2f2acf4eSJim Harrisbuild/bin/nvmf_tgt -e nvmf_rdma:2,thread
48*2f2acf4eSJim Harris~~~
49*2f2acf4eSJim Harris
50*2f2acf4eSJim HarrisThis will enable the second tracepoint inside `NVMe-oF RDMA` group (0x10) and all of the tracepoints defined by the `thread` group (0x400).
51*2f2acf4eSJim Harris
52*2f2acf4eSJim Harris### Tracepoint Group Names
53*2f2acf4eSJim Harris
54*2f2acf4eSJim HarrisThe best way to get the current list of group names is from an application's
55*2f2acf4eSJim Harrishelp message. The list of available tracepoint group's for that application
56*2f2acf4eSJim Harriswill be shown next to the `-e` option.
57*2f2acf4eSJim Harris
58*2f2acf4eSJim Harris### Starting the SPDK Target
59*2f2acf4eSJim Harris
60*2f2acf4eSJim HarrisWhen the target starts, a message is logged with the information you need to view
61*2f2acf4eSJim Harristhe tracepoints in a human-readable format using the spdk_trace application. The target
62*2f2acf4eSJim Harriswill also log information about the shared memory file.
63*2f2acf4eSJim Harris
64*2f2acf4eSJim Harris~~~bash
65*2f2acf4eSJim Harrisapp.c: 527:spdk_app_setup_trace: *NOTICE*: Tracepoint Group Mask 0xFFFF specified.
66*2f2acf4eSJim Harrisapp.c: 531:spdk_app_setup_trace: *NOTICE*: Use 'spdk_trace -s nvmf -p 24147' to capture a snapshot of events at runtime.
67*2f2acf4eSJim Harrisapp.c: 533:spdk_app_setup_trace: *NOTICE*: Or copy /dev/shm/nvmf_trace.pid24147 for offline analysis/debug.
68*2f2acf4eSJim Harris~~~
69*2f2acf4eSJim Harris
70*2f2acf4eSJim HarrisNote that when tracepoints are enabled, the shared memory files are not deleted when the application
71*2f2acf4eSJim Harrisexits.  This ensures the file can be used for analysis after the application exits.  On Linux, the
72*2f2acf4eSJim Harrisshared memory files are in /dev/shm, and can be deleted manually to free shm space if needed.  A system
73*2f2acf4eSJim Harrisreboot will also free all of the /dev/shm files.
74*2f2acf4eSJim Harris
75*2f2acf4eSJim Harris## Capturing a snapshot of events {#capture_tracepoints}
76*2f2acf4eSJim Harris
77*2f2acf4eSJim HarrisSend I/Os to the SPDK target application to generate events. The following is
78*2f2acf4eSJim Harrisan example usage of spdk_nvme_perf to send I/Os to the NVMe-oF target over an RDMA network
79*2f2acf4eSJim Harrisinterface for 10 minutes.
80*2f2acf4eSJim Harris
81*2f2acf4eSJim Harris~~~bash
82*2f2acf4eSJim Harrisspdk_nvme_perf -q 128 -o 4096 -w randread -t 600 -r 'trtype:RDMA adrfam:IPv4 traddr:192.168.100.2 trsvcid:4420'
83*2f2acf4eSJim Harris~~~
84*2f2acf4eSJim Harris
85*2f2acf4eSJim HarrisThe spdk_trace program can be found in the app/trace directory.  To analyze the tracepoints on the same
86*2f2acf4eSJim Harrissystem running the NVMe-oF target, simply execute the command line shown in the log:
87*2f2acf4eSJim Harris
88*2f2acf4eSJim Harris~~~bash
89*2f2acf4eSJim Harrisbuild/bin/spdk_trace -s nvmf -p 24147
90*2f2acf4eSJim Harris~~~
91*2f2acf4eSJim Harris
92*2f2acf4eSJim HarrisNote that you can also call `build/bin/spdk_trace` without any parameters on Linux, and it will
93*2f2acf4eSJim Harrisuse the most recent SPDK trace file generated on that system.
94*2f2acf4eSJim Harris
95*2f2acf4eSJim HarrisTo analyze the tracepoints on a different system, first prepare the tracepoint file for transfer.  The
96*2f2acf4eSJim Harristracepoint file can be large, but usually compresses very well.  This step can also be used to prepare
97*2f2acf4eSJim Harrisa tracepoint file to attach to a GitHub issue for debugging NVMe-oF application crashes.
98*2f2acf4eSJim Harris
99*2f2acf4eSJim Harris~~~bash
100*2f2acf4eSJim Harrisbzip2 -c /dev/shm/nvmf_trace.pid24147 > /tmp/trace.bz2
101*2f2acf4eSJim Harris~~~
102*2f2acf4eSJim Harris
103*2f2acf4eSJim HarrisAfter transferring the /tmp/trace.bz2 tracepoint file to a different system:
104*2f2acf4eSJim Harris
105*2f2acf4eSJim Harris~~~bash
106*2f2acf4eSJim Harrisbunzip2 /tmp/trace.bz2
107*2f2acf4eSJim Harrisbuild/bin/spdk_trace -f /tmp/trace
108*2f2acf4eSJim Harris~~~
109*2f2acf4eSJim Harris
110*2f2acf4eSJim HarrisThe following is sample trace capture showing the cumulative time that each
111*2f2acf4eSJim HarrisI/O spends at each RDMA state. All the trace captures with the same id are for
112*2f2acf4eSJim Harristhe same I/O.
113*2f2acf4eSJim Harris
114*2f2acf4eSJim Harris~~~bash
115*2f2acf4eSJim Harris28:   6026.658 ( 12656064)     RDMA_REQ_NEED_BUFFER                                      id:    r3622            time:  0.019
116*2f2acf4eSJim Harris28:   6026.694 ( 12656140)     RDMA_REQ_RDY_TO_EXECUTE                                   id:    r3622            time:  0.055
117*2f2acf4eSJim Harris28:   6026.820 ( 12656406)     RDMA_REQ_EXECUTING                                        id:    r3622            time:  0.182
118*2f2acf4eSJim Harris28:   6026.992 ( 12656766)     RDMA_REQ_EXECUTED                                         id:    r3477            time:  228.510
119*2f2acf4eSJim Harris28:   6027.010 ( 12656804)     RDMA_REQ_TX_PENDING_C_TO_H                                id:    r3477            time:  228.528
120*2f2acf4eSJim Harris28:   6027.022 ( 12656828)     RDMA_REQ_RDY_TO_COMPLETE                                  id:    r3477            time:  228.539
121*2f2acf4eSJim Harris28:   6027.115 ( 12657024)     RDMA_REQ_COMPLETING                                       id:    r3477            time:  228.633
122*2f2acf4eSJim Harris28:   6027.471 ( 12657770)     RDMA_REQ_COMPLETED                                        id:    r3518            time:  171.577
123*2f2acf4eSJim Harris28:   6028.027 ( 12658940)     RDMA_REQ_NEW                                              id:    r3623
124*2f2acf4eSJim Harris28:   6028.057 ( 12659002)     RDMA_REQ_NEED_BUFFER                                      id:    r3623            time:  0.030
125*2f2acf4eSJim Harris28:   6028.095 ( 12659082)     RDMA_REQ_RDY_TO_EXECUTE                                   id:    r3623            time:  0.068
126*2f2acf4eSJim Harris28:   6028.216 ( 12659336)     RDMA_REQ_EXECUTING                                        id:    r3623            time:  0.189
127*2f2acf4eSJim Harris28:   6028.408 ( 12659740)     RDMA_REQ_EXECUTED                                         id:    r3505            time:  190.509
128*2f2acf4eSJim Harris28:   6028.441 ( 12659808)     RDMA_REQ_TX_PENDING_C_TO_H                                id:    r3505            time:  190.542
129*2f2acf4eSJim Harris28:   6028.452 ( 12659832)     RDMA_REQ_RDY_TO_COMPLETE                                  id:    r3505            time:  190.553
130*2f2acf4eSJim Harris28:   6028.536 ( 12660008)     RDMA_REQ_COMPLETING                                       id:    r3505            time:  190.637
131*2f2acf4eSJim Harris28:   6028.854 ( 12660676)     RDMA_REQ_COMPLETED                                        id:    r3465            time:  247.000
132*2f2acf4eSJim Harris28:   6029.433 ( 12661892)     RDMA_REQ_NEW                                              id:    r3624
133*2f2acf4eSJim Harris28:   6029.452 ( 12661932)     RDMA_REQ_NEED_BUFFER                                      id:    r3624            time:  0.019
134*2f2acf4eSJim Harris28:   6029.482 ( 12661996)     RDMA_REQ_RDY_TO_EXECUTE                                   id:    r3624            time:  0.050
135*2f2acf4eSJim Harris28:   6029.591 ( 12662224)     RDMA_REQ_EXECUTING                                        id:    r3624            time:  0.158
136*2f2acf4eSJim Harris28:   6029.782 ( 12662624)     RDMA_REQ_EXECUTED                                         id:    r3564            time:  96.937
137*2f2acf4eSJim Harris28:   6029.798 ( 12662658)     RDMA_REQ_TX_PENDING_C_TO_H                                id:    r3564            time:  96.953
138*2f2acf4eSJim Harris28:   6029.812 ( 12662688)     RDMA_REQ_RDY_TO_COMPLETE                                  id:    r3564            time:  96.967
139*2f2acf4eSJim Harris28:   6029.899 ( 12662870)     RDMA_REQ_COMPLETING                                       id:    r3564            time:  97.054
140*2f2acf4eSJim Harris28:   6030.262 ( 12663634)     RDMA_REQ_COMPLETED                                        id:    r3477            time:  231.780
141*2f2acf4eSJim Harris28:   6030.786 ( 12664734)     RDMA_REQ_NEW                                              id:    r3625
142*2f2acf4eSJim Harris28:   6030.804 ( 12664772)     RDMA_REQ_NEED_BUFFER                                      id:    r3625            time:  0.018
143*2f2acf4eSJim Harris28:   6030.841 ( 12664848)     RDMA_REQ_RDY_TO_EXECUTE                                   id:    r3625            time:  0.054
144*2f2acf4eSJim Harris28:   6030.963 ( 12665104)     RDMA_REQ_EXECUTING                                        id:    r3625            time:  0.176
145*2f2acf4eSJim Harris28:   6031.139 ( 12665474)     RDMA_REQ_EXECUTED                                         id:    r3552            time:  114.906
146*2f2acf4eSJim Harris28:   6031.196 ( 12665594)     RDMA_REQ_TX_PENDING_C_TO_H                                id:    r3552            time:  114.963
147*2f2acf4eSJim Harris28:   6031.210 ( 12665624)     RDMA_REQ_RDY_TO_COMPLETE                                  id:    r3552            time:  114.977
148*2f2acf4eSJim Harris28:   6031.293 ( 12665798)     RDMA_REQ_COMPLETING                                       id:    r3552            time:  115.060
149*2f2acf4eSJim Harris28:   6031.633 ( 12666512)     RDMA_REQ_COMPLETED                                        id:    r3505            time:  193.734
150*2f2acf4eSJim Harris28:   6032.230 ( 12667766)     RDMA_REQ_NEW                                              id:    r3626
151*2f2acf4eSJim Harris28:   6032.248 ( 12667804)     RDMA_REQ_NEED_BUFFER                                      id:    r3626            time:  0.018
152*2f2acf4eSJim Harris28:   6032.288 ( 12667888)     RDMA_REQ_RDY_TO_EXECUTE                                   id:    r3626            time:  0.058
153*2f2acf4eSJim Harris28:   6032.396 ( 12668114)     RDMA_REQ_EXECUTING                                        id:    r3626            time:  0.166
154*2f2acf4eSJim Harris28:   6032.593 ( 12668528)     RDMA_REQ_EXECUTED                                         id:    r3570            time:  90.443
155*2f2acf4eSJim Harris28:   6032.611 ( 12668564)     RDMA_REQ_TX_PENDING_C_TO_H                                id:    r3570            time:  90.460
156*2f2acf4eSJim Harris28:   6032.623 ( 12668590)     RDMA_REQ_RDY_TO_COMPLETE                                  id:    r3570            time:  90.473
157*2f2acf4eSJim Harris28:   6032.707 ( 12668766)     RDMA_REQ_COMPLETING                                       id:    r3570            time:  90.557
158*2f2acf4eSJim Harris28:   6033.056 ( 12669500)     RDMA_REQ_COMPLETED                                        id:    r3564            time:  100.211
159*2f2acf4eSJim Harris~~~
160*2f2acf4eSJim Harris
161*2f2acf4eSJim Harris## Capturing sufficient trace events {#capture_trace_events}
162*2f2acf4eSJim Harris
163*2f2acf4eSJim HarrisSince the tracepoint file generated directly by SPDK application is a circular buffer in shared memory,
164*2f2acf4eSJim Harristhe trace events captured by it may be insufficient for further analysis.
165*2f2acf4eSJim HarrisThe spdk_trace_record program can be found in the app/trace_record directory.
166*2f2acf4eSJim Harrisspdk_trace_record is used to poll the spdk tracepoint shared memory, record new entries from it,
167*2f2acf4eSJim Harrisand store all entries into specified output file at its shutdown on SIGINT or SIGTERM.
168*2f2acf4eSJim HarrisAfter SPDK nvmf target is launched, simply execute the command line shown in the log:
169*2f2acf4eSJim Harris
170*2f2acf4eSJim Harris~~~bash
171*2f2acf4eSJim Harrisbuild/bin/spdk_trace_record -q -s nvmf -p 24147 -f /tmp/spdk_nvmf_record.trace
172*2f2acf4eSJim Harris~~~
173*2f2acf4eSJim Harris
174*2f2acf4eSJim HarrisAlso send I/Os to the SPDK target application to generate events by previous perf example for 10 minutes.
175*2f2acf4eSJim Harris
176*2f2acf4eSJim Harris~~~bash
177*2f2acf4eSJim Harrisspdk_nvme_perf -q 128 -o 4096 -w randread -t 600 -r 'trtype:RDMA adrfam:IPv4 traddr:192.168.100.2 trsvcid:4420'
178*2f2acf4eSJim Harris~~~
179*2f2acf4eSJim Harris
180*2f2acf4eSJim HarrisAfter the completion of perf example, shut down spdk_trace_record by signal SIGINT (Ctrl + C).
181*2f2acf4eSJim HarrisTo analyze the tracepoints output file from spdk_trace_record, simply run spdk_trace program by:
182*2f2acf4eSJim Harris
183*2f2acf4eSJim Harris~~~bash
184*2f2acf4eSJim Harrisbuild/bin/spdk_trace -f /tmp/spdk_nvmf_record.trace
185*2f2acf4eSJim Harris~~~
186*2f2acf4eSJim Harris
187*2f2acf4eSJim Harris## Adding New Tracepoints {#add_tracepoints}
188*2f2acf4eSJim Harris
189*2f2acf4eSJim HarrisSPDK applications and libraries provide several trace points. You can add new
190*2f2acf4eSJim Harristracepoints to the existing trace groups. For example, to add a new tracepoints
191*2f2acf4eSJim Harristo the SPDK RDMA library (lib/nvmf/rdma.c) trace group TRACE_GROUP_NVMF_RDMA,
192*2f2acf4eSJim Harrisdefine the tracepoints and assigning them a unique ID using the SPDK_TPOINT_ID macro:
193*2f2acf4eSJim Harris
194*2f2acf4eSJim Harris~~~c
195*2f2acf4eSJim Harris#define	TRACE_GROUP_NVMF_RDMA	0x4
196*2f2acf4eSJim Harris#define TRACE_RDMA_REQUEST_STATE_NEW	SPDK_TPOINT_ID(TRACE_GROUP_NVMF_RDMA, 0x0)
197*2f2acf4eSJim Harris#define TRACE_RDMA_REQUEST_STATE_NEED_BUFFER	SPDK_TPOINT_ID(TRACE_GROUP_NVMF_RDMA, 0x1)
198*2f2acf4eSJim Harris...
199*2f2acf4eSJim Harris#define NEW_TRACE_POINT_NAME	SPDK_TPOINT_ID(TRACE_GROUP_NVMF_RDMA, UNIQUE_ID)
200*2f2acf4eSJim Harris~~~
201*2f2acf4eSJim Harris
202*2f2acf4eSJim HarrisYou also need to register the new trace points in the SPDK_TRACE_REGISTER_FN macro call
203*2f2acf4eSJim Harriswithin the application/library using the spdk_trace_register_description function
204*2f2acf4eSJim Harrisas shown below:
205*2f2acf4eSJim Harris
206*2f2acf4eSJim Harris~~~c
207*2f2acf4eSJim Harrisstatic void
208*2f2acf4eSJim Harrisnvmf_trace(void)
209*2f2acf4eSJim Harris{
210*2f2acf4eSJim Harris	spdk_trace_register_object(OBJECT_NVMF_RDMA_IO, 'r');
211*2f2acf4eSJim Harris	spdk_trace_register_description("RDMA_REQ_NEW", TRACE_RDMA_REQUEST_STATE_NEW,
212*2f2acf4eSJim Harris					OWNER_TYPE_NONE, OBJECT_NVMF_RDMA_IO, 1,
213*2f2acf4eSJim Harris					SPDK_TRACE_ARG_TYPE_PTR, "qpair");
214*2f2acf4eSJim Harris	spdk_trace_register_description("RDMA_REQ_NEED_BUFFER", TRACE_RDMA_REQUEST_STATE_NEED_BUFFER,
215*2f2acf4eSJim Harris					OWNER_TYPE_NONE, OBJECT_NVMF_RDMA_IO, 0,
216*2f2acf4eSJim Harris					SPDK_TRACE_ARG_TYPE_PTR, "qpair");
217*2f2acf4eSJim Harris	...
218*2f2acf4eSJim Harris}
219*2f2acf4eSJim HarrisSPDK_TRACE_REGISTER_FN(nvmf_trace, "nvmf_rdma", TRACE_GROUP_NVMF_RDMA)
220*2f2acf4eSJim Harris~~~
221*2f2acf4eSJim Harris
222*2f2acf4eSJim HarrisFinally, use the spdk_trace_record function at the appropriate point in the
223*2f2acf4eSJim Harrisapplication/library to record the current trace state for the new trace points.
224*2f2acf4eSJim HarrisThe following example shows the usage of the spdk_trace_record function to
225*2f2acf4eSJim Harrisrecord the current trace state of several tracepoints.
226*2f2acf4eSJim Harris
227*2f2acf4eSJim Harris~~~c
228*2f2acf4eSJim Harris	case RDMA_REQUEST_STATE_NEW:
229*2f2acf4eSJim Harris		spdk_trace_record(TRACE_RDMA_REQUEST_STATE_NEW, 0, 0, (uintptr_t)rdma_req, (uintptr_t)rqpair);
230*2f2acf4eSJim Harris		...
231*2f2acf4eSJim Harris		break;
232*2f2acf4eSJim Harris	case RDMA_REQUEST_STATE_NEED_BUFFER:
233*2f2acf4eSJim Harris		spdk_trace_record(TRACE_RDMA_REQUEST_STATE_NEED_BUFFER, 0, 0, (uintptr_t)rdma_req, (uintptr_t)rqpair);
234*2f2acf4eSJim Harris		...
235*2f2acf4eSJim Harris		break;
236*2f2acf4eSJim Harris	case RDMA_REQUEST_STATE_DATA_TRANSFER_TO_CONTROLLER_PENDING:
237*2f2acf4eSJim Harris		spdk_trace_record(RDMA_REQUEST_STATE_DATA_TRANSFER_TO_CONTROLLER_PENDING, 0, 0,
238*2f2acf4eSJim Harris			(uintptr_t)rdma_req, (uintptr_t)rqpair);
239*2f2acf4eSJim Harris		...
240*2f2acf4eSJim Harris~~~
241*2f2acf4eSJim Harris
242*2f2acf4eSJim HarrisAll the tracing functions are documented in the [Tracepoint library documentation](https://spdk.io/doc/trace_8h.html)
243