1# Tracing Framework {#tracepoints} 2 3## Introduction {#tracepoints_intro} 4 5SPDK has a tracing framework for capturing low-level event information at runtime. 6Tracepoints provide a high-performance tracing mechanism that is accessible at runtime. 7They are implemented as a circular buffer in shared memory that is accessible from other 8processes. SPDK libraries and modules are instrumented with tracepoints to enable analysis of 9both performance and application crashes. 10 11## Enabling Tracepoints {#enable_tracepoints} 12 13Tracepoints are placed in groups. They are enabled and disabled as a group or individually 14inside a group. 15 16### Enabling Tracepoints in Groups 17 18To enable the instrumentation of all the tracepoints groups in an SPDK target 19application, start the target with `-e` parameter set to `all`: 20 21~~~bash 22build/bin/nvmf_tgt -e all 23~~~ 24 25To enable the instrumentation of just the `NVMe-oF RDMA` tracepoints in an SPDK target 26application, start the target with the `-e` parameter set to `nvmf_rdma`: 27 28~~~bash 29build/bin/nvmf_tgt -e nvmf_rdma 30~~~ 31 32### Enabling Individual Tracepoints 33 34To enable individual tracepoints inside a group: 35 36~~~bash 37build/bin/nvmf_tgt -e nvmf_rdma:B 38~~~ 39 40where `:` is a separator and `B` is the tracepoint mask. This will enable only the first, second and fourth (binary: 1011) tracepoint inside `NVMe-oF RDMA` group. 41 42### Combining Tracepoint Masks 43 44It is also possible to combine enabling whole groups of tpoints and individual ones: 45 46~~~bash 47build/bin/nvmf_tgt -e nvmf_rdma:2,thread 48~~~ 49 50This will enable the second tracepoint inside `NVMe-oF RDMA` group (0x10) and all of the tracepoints defined by the `thread` group (0x400). 51 52### Tracepoint Group Names 53 54The best way to get the current list of group names is from an application's 55help message. The list of available tracepoint group's for that application 56will be shown next to the `-e` option. 57 58### Starting the SPDK Target 59 60When the target starts, a message is logged with the information you need to view 61the tracepoints in a human-readable format using the spdk_trace application. The target 62will also log information about the shared memory file. 63 64~~~bash 65app.c: 527:spdk_app_setup_trace: *NOTICE*: Tracepoint Group Mask 0xFFFF specified. 66app.c: 531:spdk_app_setup_trace: *NOTICE*: Use 'spdk_trace -s nvmf -p 24147' to capture a snapshot of events at runtime. 67app.c: 533:spdk_app_setup_trace: *NOTICE*: Or copy /dev/shm/nvmf_trace.pid24147 for offline analysis/debug. 68~~~ 69 70Note that when tracepoints are enabled, the shared memory files are not deleted when the application 71exits. This ensures the file can be used for analysis after the application exits. On Linux, the 72shared memory files are in /dev/shm, and can be deleted manually to free shm space if needed. A system 73reboot will also free all of the /dev/shm files. 74 75## Capturing a snapshot of events {#capture_tracepoints} 76 77Send I/Os to the SPDK target application to generate events. The following is 78an example usage of spdk_nvme_perf to send I/Os to the NVMe-oF target over an RDMA network 79interface for 10 minutes. 80 81~~~bash 82spdk_nvme_perf -q 128 -o 4096 -w randread -t 600 -r 'trtype:RDMA adrfam:IPv4 traddr:192.168.100.2 trsvcid:4420' 83~~~ 84 85The spdk_trace program can be found in the app/trace directory. To analyze the tracepoints on the same 86system running the NVMe-oF target, simply execute the command line shown in the log: 87 88~~~bash 89build/bin/spdk_trace -s nvmf -p 24147 90~~~ 91 92Note that you can also call `build/bin/spdk_trace` without any parameters on Linux, and it will 93use the most recent SPDK trace file generated on that system. 94 95To analyze the tracepoints on a different system, first prepare the tracepoint file for transfer. The 96tracepoint file can be large, but usually compresses very well. This step can also be used to prepare 97a tracepoint file to attach to a GitHub issue for debugging NVMe-oF application crashes. 98 99~~~bash 100bzip2 -c /dev/shm/nvmf_trace.pid24147 > /tmp/trace.bz2 101~~~ 102 103After transferring the /tmp/trace.bz2 tracepoint file to a different system: 104 105~~~bash 106bunzip2 /tmp/trace.bz2 107build/bin/spdk_trace -f /tmp/trace 108~~~ 109 110The following is sample trace capture showing the cumulative time that each 111I/O spends at each RDMA state. All the trace captures with the same id are for 112the same I/O. 113 114~~~bash 11528: 6026.658 ( 12656064) RDMA_REQ_NEED_BUFFER id: r3622 time: 0.019 11628: 6026.694 ( 12656140) RDMA_REQ_RDY_TO_EXECUTE id: r3622 time: 0.055 11728: 6026.820 ( 12656406) RDMA_REQ_EXECUTING id: r3622 time: 0.182 11828: 6026.992 ( 12656766) RDMA_REQ_EXECUTED id: r3477 time: 228.510 11928: 6027.010 ( 12656804) RDMA_REQ_TX_PENDING_C_TO_H id: r3477 time: 228.528 12028: 6027.022 ( 12656828) RDMA_REQ_RDY_TO_COMPLETE id: r3477 time: 228.539 12128: 6027.115 ( 12657024) RDMA_REQ_COMPLETING id: r3477 time: 228.633 12228: 6027.471 ( 12657770) RDMA_REQ_COMPLETED id: r3518 time: 171.577 12328: 6028.027 ( 12658940) RDMA_REQ_NEW id: r3623 12428: 6028.057 ( 12659002) RDMA_REQ_NEED_BUFFER id: r3623 time: 0.030 12528: 6028.095 ( 12659082) RDMA_REQ_RDY_TO_EXECUTE id: r3623 time: 0.068 12628: 6028.216 ( 12659336) RDMA_REQ_EXECUTING id: r3623 time: 0.189 12728: 6028.408 ( 12659740) RDMA_REQ_EXECUTED id: r3505 time: 190.509 12828: 6028.441 ( 12659808) RDMA_REQ_TX_PENDING_C_TO_H id: r3505 time: 190.542 12928: 6028.452 ( 12659832) RDMA_REQ_RDY_TO_COMPLETE id: r3505 time: 190.553 13028: 6028.536 ( 12660008) RDMA_REQ_COMPLETING id: r3505 time: 190.637 13128: 6028.854 ( 12660676) RDMA_REQ_COMPLETED id: r3465 time: 247.000 13228: 6029.433 ( 12661892) RDMA_REQ_NEW id: r3624 13328: 6029.452 ( 12661932) RDMA_REQ_NEED_BUFFER id: r3624 time: 0.019 13428: 6029.482 ( 12661996) RDMA_REQ_RDY_TO_EXECUTE id: r3624 time: 0.050 13528: 6029.591 ( 12662224) RDMA_REQ_EXECUTING id: r3624 time: 0.158 13628: 6029.782 ( 12662624) RDMA_REQ_EXECUTED id: r3564 time: 96.937 13728: 6029.798 ( 12662658) RDMA_REQ_TX_PENDING_C_TO_H id: r3564 time: 96.953 13828: 6029.812 ( 12662688) RDMA_REQ_RDY_TO_COMPLETE id: r3564 time: 96.967 13928: 6029.899 ( 12662870) RDMA_REQ_COMPLETING id: r3564 time: 97.054 14028: 6030.262 ( 12663634) RDMA_REQ_COMPLETED id: r3477 time: 231.780 14128: 6030.786 ( 12664734) RDMA_REQ_NEW id: r3625 14228: 6030.804 ( 12664772) RDMA_REQ_NEED_BUFFER id: r3625 time: 0.018 14328: 6030.841 ( 12664848) RDMA_REQ_RDY_TO_EXECUTE id: r3625 time: 0.054 14428: 6030.963 ( 12665104) RDMA_REQ_EXECUTING id: r3625 time: 0.176 14528: 6031.139 ( 12665474) RDMA_REQ_EXECUTED id: r3552 time: 114.906 14628: 6031.196 ( 12665594) RDMA_REQ_TX_PENDING_C_TO_H id: r3552 time: 114.963 14728: 6031.210 ( 12665624) RDMA_REQ_RDY_TO_COMPLETE id: r3552 time: 114.977 14828: 6031.293 ( 12665798) RDMA_REQ_COMPLETING id: r3552 time: 115.060 14928: 6031.633 ( 12666512) RDMA_REQ_COMPLETED id: r3505 time: 193.734 15028: 6032.230 ( 12667766) RDMA_REQ_NEW id: r3626 15128: 6032.248 ( 12667804) RDMA_REQ_NEED_BUFFER id: r3626 time: 0.018 15228: 6032.288 ( 12667888) RDMA_REQ_RDY_TO_EXECUTE id: r3626 time: 0.058 15328: 6032.396 ( 12668114) RDMA_REQ_EXECUTING id: r3626 time: 0.166 15428: 6032.593 ( 12668528) RDMA_REQ_EXECUTED id: r3570 time: 90.443 15528: 6032.611 ( 12668564) RDMA_REQ_TX_PENDING_C_TO_H id: r3570 time: 90.460 15628: 6032.623 ( 12668590) RDMA_REQ_RDY_TO_COMPLETE id: r3570 time: 90.473 15728: 6032.707 ( 12668766) RDMA_REQ_COMPLETING id: r3570 time: 90.557 15828: 6033.056 ( 12669500) RDMA_REQ_COMPLETED id: r3564 time: 100.211 159~~~ 160 161## Capturing sufficient trace events {#capture_trace_events} 162 163Since the tracepoint file generated directly by SPDK application is a circular buffer in shared memory, 164the trace events captured by it may be insufficient for further analysis. 165The spdk_trace_record program can be found in the app/trace_record directory. 166spdk_trace_record is used to poll the spdk tracepoint shared memory, record new entries from it, 167and store all entries into specified output file at its shutdown on SIGINT or SIGTERM. 168After SPDK nvmf target is launched, simply execute the command line shown in the log: 169 170~~~bash 171build/bin/spdk_trace_record -q -s nvmf -p 24147 -f /tmp/spdk_nvmf_record.trace 172~~~ 173 174Also send I/Os to the SPDK target application to generate events by previous perf example for 10 minutes. 175 176~~~bash 177spdk_nvme_perf -q 128 -o 4096 -w randread -t 600 -r 'trtype:RDMA adrfam:IPv4 traddr:192.168.100.2 trsvcid:4420' 178~~~ 179 180After the completion of perf example, shut down spdk_trace_record by signal SIGINT (Ctrl + C). 181To analyze the tracepoints output file from spdk_trace_record, simply run spdk_trace program by: 182 183~~~bash 184build/bin/spdk_trace -f /tmp/spdk_nvmf_record.trace 185~~~ 186 187## Adding New Tracepoints {#add_tracepoints} 188 189SPDK applications and libraries provide several trace points. You can add new 190tracepoints to the existing trace groups. For example, to add a new tracepoints 191to the SPDK RDMA library (lib/nvmf/rdma.c) trace group TRACE_GROUP_NVMF_RDMA, 192define the tracepoints and assigning them a unique ID using the SPDK_TPOINT_ID macro: 193 194~~~c 195#define TRACE_GROUP_NVMF_RDMA 0x4 196#define TRACE_RDMA_REQUEST_STATE_NEW SPDK_TPOINT_ID(TRACE_GROUP_NVMF_RDMA, 0x0) 197#define TRACE_RDMA_REQUEST_STATE_NEED_BUFFER SPDK_TPOINT_ID(TRACE_GROUP_NVMF_RDMA, 0x1) 198... 199#define NEW_TRACE_POINT_NAME SPDK_TPOINT_ID(TRACE_GROUP_NVMF_RDMA, UNIQUE_ID) 200~~~ 201 202You also need to register the new trace points in the SPDK_TRACE_REGISTER_FN macro call 203within the application/library using the spdk_trace_register_description function 204as shown below: 205 206~~~c 207static void 208nvmf_trace(void) 209{ 210 spdk_trace_register_object(OBJECT_NVMF_RDMA_IO, 'r'); 211 spdk_trace_register_description("RDMA_REQ_NEW", TRACE_RDMA_REQUEST_STATE_NEW, 212 OWNER_TYPE_NONE, OBJECT_NVMF_RDMA_IO, 1, 213 SPDK_TRACE_ARG_TYPE_PTR, "qpair"); 214 spdk_trace_register_description("RDMA_REQ_NEED_BUFFER", TRACE_RDMA_REQUEST_STATE_NEED_BUFFER, 215 OWNER_TYPE_NONE, OBJECT_NVMF_RDMA_IO, 0, 216 SPDK_TRACE_ARG_TYPE_PTR, "qpair"); 217 ... 218} 219SPDK_TRACE_REGISTER_FN(nvmf_trace, "nvmf_rdma", TRACE_GROUP_NVMF_RDMA) 220~~~ 221 222Finally, use the spdk_trace_record function at the appropriate point in the 223application/library to record the current trace state for the new trace points. 224The following example shows the usage of the spdk_trace_record function to 225record the current trace state of several tracepoints. 226 227~~~c 228 case RDMA_REQUEST_STATE_NEW: 229 spdk_trace_record(TRACE_RDMA_REQUEST_STATE_NEW, 0, 0, (uintptr_t)rdma_req, (uintptr_t)rqpair); 230 ... 231 break; 232 case RDMA_REQUEST_STATE_NEED_BUFFER: 233 spdk_trace_record(TRACE_RDMA_REQUEST_STATE_NEED_BUFFER, 0, 0, (uintptr_t)rdma_req, (uintptr_t)rqpair); 234 ... 235 break; 236 case RDMA_REQUEST_STATE_DATA_TRANSFER_TO_CONTROLLER_PENDING: 237 spdk_trace_record(RDMA_REQUEST_STATE_DATA_TRANSFER_TO_CONTROLLER_PENDING, 0, 0, 238 (uintptr_t)rdma_req, (uintptr_t)rqpair); 239 ... 240~~~ 241 242All the tracing functions are documented in the [Tracepoint library documentation](https://spdk.io/doc/trace_8h.html) 243