Tracing with Intel Processor Trace
==================================

Intel PT is a technology available in modern Intel CPUs that allows efficient
tracing of all the instructions executed by a process.
LLDB can collect traces and dump them using its symbolication stack.
You can read more at
https://easyperf.net/blog/2019/08/23/Intel-Processor-Trace.

Prerequisites
-------------

Confirm that your CPU supports Intel PT
(see https://www.intel.com/content/www/us/en/support/articles/000056730/processors.html)
and that your operating system is Linux.

Check that the following file exists on your Linux system:

::

  $ cat /sys/bus/event_source/devices/intel_pt/type

The output should be a number. Otherwise, try upgrading your kernel.


Build Instructions
------------------

Clone and build the low-level Intel PT decoder library,
`LibIPT <https://github.com/intel/libipt>`_.

::

  $ git clone git@github.com:intel/libipt.git
  $ mkdir libipt-build
  $ cmake -S libipt -B libipt-build
  $ cd libipt-build
  $ make

This will generate a few files in the ``<libipt-build>/lib``
and ``<libipt-build>/libipt/include`` directories.

Configure and build LLDB with Intel PT support

::

  $ cmake \
      -DLLDB_BUILD_INTEL_PT=ON \
      -DLIBIPT_INCLUDE_PATH="<libipt-build>/libipt/include" \
      -DLIBIPT_LIBRARY_PATH="<libipt-build>/lib" \
      ... other common configuration parameters

::

  $ cd <lldb-build> && ninja lldb lldb-server # if using Ninja


How to Use
----------

When you are debugging a process, you can turn on intel-pt tracing,
which will “record” all the instructions that the process will execute.
After turning it on, you can continue debugging, and at any breakpoint,
you can inspect the instruction list.

For example:

::

  lldb <target>
  > b main
  > run
  > process trace start # start tracing on all threads, including future ones
  # keep debugging until you hit a breakpoint

  > thread trace dump instructions
  # this should output something like

  thread #2: tid = 2861133, total instructions = 5305673
    libc.so.6`__GI___libc_read + 45 at read.c:25:1
      [4962255] 0x00007fffeb64c63d    subq   $0x10, %rsp
      [4962256] 0x00007fffeb64c641    movq   %rdi, -0x18(%rbp)
    libc.so.6`__GI___libc_read + 53 [inlined] __libc_read at read.c:26:10
      [4962257] 0x00007fffeb64c645    callq  0x7fffeb66b640            ; __libc_enable_asynccancel
    libc.so.6`__libc_enable_asynccancel
      [4962258] 0x00007fffeb66b640    movl   %fs:0x308, %eax
    libc.so.6`__libc_enable_asynccancel + 8
      [4962259] 0x00007fffeb66b648    movl   %eax, %r11d

  # you can keep pressing ENTER to see more and more instructions

The number between brackets is the instruction index,
and by default the current thread will be picked.

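
Because each instruction line in the dump starts with the bracketed
instruction index followed by the address, the output is easy to post-process.
The following is a minimal sketch in Python, assuming the textual layout shown
in the example above (it may differ between LLDB versions). The names
``TraceInstruction`` and ``parse_instruction_line`` are hypothetical helpers,
not part of LLDB.

.. code-block:: python

  import re
  from typing import NamedTuple, Optional


  class TraceInstruction(NamedTuple):
      index: int        # instruction index, the number between brackets
      address: int      # address of the instruction
      disassembly: str  # remaining text: mnemonic, operands, optional comment


  # Matches lines such as "[4962255] 0x00007fffeb64c63d    subq   $0x10, %rsp".
  # This layout is an assumption taken from the sample output above.
  _INSTRUCTION_RE = re.compile(r"^\s*\[(\d+)\]\s+(0x[0-9a-fA-F]+)\s+(.*\S)\s*$")


  def parse_instruction_line(line: str) -> Optional[TraceInstruction]:
      """Return the parsed instruction, or None for symbol and header lines."""
      match = _INSTRUCTION_RE.match(line)
      if match is None:
          return None
      index, address, disassembly = match.groups()
      return TraceInstruction(int(index), int(address, 16), disassembly)

Lines that name a symbol or a thread (for example the ``thread #2: ...``
header) do not start with a bracketed index, so the helper returns ``None``
for them and a caller can simply skip those lines.
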

Configuring the trace size
--------------------------

The CPU stores the instruction list in a compressed format in a ring buffer,
which keeps the latest information.
By default, LLDB uses a buffer of 4KB per thread,
but you can change it with the command below.
The size must be a power of 2 and at least 4KB.

::

  thread trace start all -s <size_in_bytes>

For reference, a 1MB trace buffer can easily store around 5M instructions.

Printing more instructions
--------------------------

If you want to dump more instructions at a time, you can run

::

  thread trace dump instructions -c <count>

Printing the instructions of another thread
-------------------------------------------

By default the current thread will be picked when dumping instructions,
but you can do

::

  thread trace dump instructions <#thread index>
  #e.g.
  thread trace dump instructions 8

to select another thread.

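
The size constraint and the command variants from the last three sections can
be assembled programmatically, for example when driving LLDB from a script.
This is a minimal sketch in Python. All helper names are hypothetical; they
only build command strings of the forms shown above and do not call into LLDB.

.. code-block:: python

  MIN_TRACE_SIZE = 4096  # 4KB, the minimum buffer size stated above


  def is_valid_trace_size(size_in_bytes: int) -> bool:
      """True if the size is a power of 2 and at least 4KB."""
      is_power_of_two = size_in_bytes > 0 and (size_in_bytes & (size_in_bytes - 1)) == 0
      return is_power_of_two and size_in_bytes >= MIN_TRACE_SIZE


  def round_up_trace_size(requested_bytes: int) -> int:
      """Smallest valid buffer size that is >= requested_bytes."""
      size = MIN_TRACE_SIZE
      while size < requested_bytes:
          size *= 2
      return size


  def start_command(requested_bytes: int) -> str:
      """Command that starts tracing with a valid per-thread buffer size."""
      return f"thread trace start all -s {round_up_trace_size(requested_bytes)}"


  def dump_command(count: int) -> str:
      """Command that dumps `count` instructions of the current thread."""
      return f"thread trace dump instructions -c {count}"


  def dump_command_for_thread(thread_index: int) -> str:
      """Command that dumps instructions of the thread with the given index."""
      return f"thread trace dump instructions {thread_index}"

For instance, ``start_command(1_000_000)`` rounds the request up to the next
power of 2 and yields ``thread trace start all -s 1048576``.
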

Crash Analysis
--------------

What if you are debugging and tracing a process that crashes?
Then you can just run

::

  thread trace dump instructions

to inspect how it crashed. There's nothing special that you need to do.
For example:

::

  * thread #1, name = 'a.out', stop reason = signal SIGFPE: integer divide by zero
      frame #0: 0x00000000004009f1 a.out`main at main.cpp:8:14
    6       int x;
    7       cin >> x;
  -> 8      cout << 12 / x << endl;
    9       return 0;
    10  }
  (lldb) thread trace dump instructions -c 5
  thread #1: tid = 604302, total instructions = 8388
    libstdc++.so.6`std::istream::operator>>(int&) + 181
      [8383] 0x00007ffff7b41665    popq   %rbp
      [8384] 0x00007ffff7b41666    retq
    a.out`main + 66 at main.cpp:8:14
      [8385] 0x00000000004009e8    movl   -0x4(%rbp), %ecx
      [8386] 0x00000000004009eb    movl   $0xc, %eax
      [8387] 0x00000000004009f0    cltd

.. note::
  At this moment, we are not including the failed instruction in the trace,
  but in the future we might do it for readability.


Offline Trace Analysis
----------------------

It's also possible to record a trace using a custom Intel PT collector
and then decode and symbolicate the trace using LLDB.
For that, the ``trace load`` command is useful.
In order to use ``trace load``, you need to first create a JSON file with
the definition of the trace session.
For example:

::

  {
    "type": "intel-pt",
    "cpuInfo": {
      "vendor": "GenuineIntel",
      "family": 6,
      "model": 79,
      "stepping": 1
    },
    "processes": [
      {
        "pid": 815455,
        "triple": "x86_64-*-linux",
        "threads": [
          {
            "tid": 815455,
            "iptTrace": "trace.file" # raw thread-specific trace from the AUX buffer
          }
        ],
        "modules": [ # these are all the shared libraries + the main executable
          {
            "file": "a.out", # optional if it's the same as systemPath
            "systemPath": "a.out",
            "loadAddress": 4194304
          },
          {
            "file": "libfoo.so",
            "systemPath": "/usr/lib/libfoo.so",
            "loadAddress": "0x00007ffff7bd9000"
          },
          {
            "systemPath": "libbar.so",
            "loadAddress": "0x00007ffff79d7000"
          }
        ]
      }
    ]
  }

The ``#`` annotations above are explanatory only. JSON has no comment syntax,
so leave them out of a real file.

You can see the full schema by typing

::

  trace schema intel-pt

The JSON file mainly contains all the shared libraries that
were part of the traced process, along with their memory load addresses.
If the analysis is done on the same computer where the traces were obtained,
it's enough to use the ``systemPath`` field.
If the analysis is done on a different machine, these files need to be
copied over and the ``file`` field should point to the
location of the file relative to the JSON file.
Once you have the JSON file and the module files in place, you can simply run

::

  lldb
  > trace load /path/to/json
  > thread trace dump instructions <optional thread index>

From then on, the session behaves like the live session case.

References
----------

- Original RFC document_ for this feature.
24684caf73cSAlisamar Husain- Some details about how Meta is using Intel Processor Trace can be found in this blog_ post. 24784caf73cSAlisamar Husain 24884caf73cSAlisamar Husain.. _document: https://docs.google.com/document/d/1cOVTGp1sL_HBXjP9eB7qjVtDNr5xnuZvUUtv43G5eVI 24984caf73cSAlisamar Husain.. _blog: https://engineering.fb.com/2021/04/27/developer-tools/reverse-debugging/ 250