History log of /llvm-project/lldb/source/Plugins/Trace/intel-pt/ThreadDecoder.cpp (Results 1 – 13 of 13)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 2fe83274 07-Jan-2023 Kazu Hirata <kazu@google.com>

[lldb] Use std::optional instead of llvm::Optional (NFC)

This patch replaces (llvm::|)Optional< with std::optional<. I'll post
a separate patch to clean up the "using" declarations, #include
"llvm/

[lldb] Use std::optional instead of llvm::Optional (NFC)

This patch replaces (llvm::|)Optional< with std::optional<. I'll post
a separate patch to clean up the "using" declarations, #include
"llvm/ADT/Optional.h", etc.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


# f190ce62 07-Jan-2023 Kazu Hirata <kazu@google.com>

[lldb] Add #include <optional> (NFC)

This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optiona

[lldb] Add #include <optional> (NFC)

This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


Revision tags: llvmorg-15.0.6
# c903136f 20-Nov-2022 Kazu Hirata <kazu@google.com>

[lldb] Use Optional::has_value instead of Optional::hasValue (NFC)


Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3
# c4fb631c 11-Aug-2022 Walter Erquinigo <wallace@fb.com>

[NFC][lldb][trace] Fix formatting of tracing files

Pavel Labath taught me that clang-format sorts headers automatically
using llvm's rules, and it's better not to have spaces between

So in this dif

[NFC][lldb][trace] Fix formatting of tracing files

Pavel Labath taught me that clang-format sorts headers automatically
using llvm's rules, and it's better not to have spaces between

So in this diff I'm removing those spaces and formatting them as well.

I used `clang-format -i` to format these files.

show more ...


Revision tags: llvmorg-15.0.0-rc2
# d179ea12 03-Aug-2022 Walter Erquinigo <wallace@fb.com>

[NFC][trace] format source files

Cleanup formatting diff


Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init
# 4f676c25 18-Jul-2022 Walter Erquinigo <wallace@fb.com>

[trace][intel pt] Introduce wall clock time for each trace item

- Decouple TSCs from trace items
- Turn TSCs into events just like CPUs. The new name is HW clock tick, wich could be reused by other

[trace][intel pt] Introduce wall clock time for each trace item

- Decouple TSCs from trace items
- Turn TSCs into events just like CPUs. The new name is HW clock tick, wich could be reused by other vendors.
- Add a GetWallTime that returns the wall time that the trace plug-in can infer for each trace item.
- For intel pt, we are doing the following interpolation: if an instruction takes less than 1 TSC, we use that duration, otherwise, we assume the instruction took 1 TSC. This helps us avoid having to handle context switches, changes to kernel, idle times, decoding errors, etc. We are just trying to show some approximation and not the real data. For the real data, TSCs are the way to go. Besides that, we are making sure that no two trace items will give the same interpolation value. Finally, we are using as time 0 the time at which tracing started.

Sample output:

```
(lldb) r
Process 750047 launched: '/home/wallace/a.out' (x86_64)
Process 750047 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
frame #0: 0x0000000000402479 a.out`main at main.cpp:29:20
26 };
27
28 int main() {
-> 29 std::vector<int> vvv;
30 for (int i = 0; i < 100; i++)
31 vvv.push_back(i);
32
(lldb) process trace start -s 64kb -t --per-cpu
(lldb) b 60
Breakpoint 2: where = a.out`main + 1689 at main.cpp:60:23, address = 0x0000000000402afe
(lldb) c
Process 750047 resuming
Process 750047 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 2.1
frame #0: 0x0000000000402afe a.out`main at main.cpp:60:23
57 map<int, int> m;
58 m[3] = 4;
59
-> 60 map<string, string> m2;
61 m2["5"] = "6";
62
63 std::vector<std::string> vs = {"2", "3"};
(lldb) thread trace dump instructions -t -f -e thread #1: tid = 750047
0: [379567.000 ns] (event) HW clock tick [48599428476224707]
1: [379569.000 ns] (event) CPU core changed [new CPU=2]
2: [390487.000 ns] (event) HW clock tick [48599428476246495]
3: [1602508.000 ns] (event) HW clock tick [48599428478664855]
4: [1662745.000 ns] (event) HW clock tick [48599428478785046]
libc.so.6`malloc
5: [1662746.995 ns] 0x00007ffff7176660 endbr64
6: [1662748.991 ns] 0x00007ffff7176664 movq 0x32387d(%rip), %rax ; + 408
7: [1662750.986 ns] 0x00007ffff717666b pushq %r12
8: [1662752.981 ns] 0x00007ffff717666d pushq %rbp
9: [1662754.977 ns] 0x00007ffff717666e pushq %rbx
10: [1662756.972 ns] 0x00007ffff717666f movq (%rax), %rax
11: [1662758.967 ns] 0x00007ffff7176672 testq %rax, %rax
12: [1662760.963 ns] 0x00007ffff7176675 jne 0x9c7e0 ; <+384>
13: [1662762.958 ns] 0x00007ffff717667b leaq 0x17(%rdi), %rax
14: [1662764.953 ns] 0x00007ffff717667f cmpq $0x1f, %rax
15: [1662766.949 ns] 0x00007ffff7176683 ja 0x9c730 ; <+208>
16: [1662768.944 ns] 0x00007ffff7176730 andq $-0x10, %rax
17: [1662770.939 ns] 0x00007ffff7176734 cmpq $-0x41, %rax
18: [1662772.935 ns] 0x00007ffff7176738 seta %dl
19: [1662774.930 ns] 0x00007ffff717673b jmp 0x9c690 ; <+48>
20: [1662776.925 ns] 0x00007ffff7176690 cmpq %rdi, %rax
21: [1662778.921 ns] 0x00007ffff7176693 jb 0x9c7b0 ; <+336>
22: [1662780.916 ns] 0x00007ffff7176699 testb %dl, %dl
23: [1662782.911 ns] 0x00007ffff717669b jne 0x9c7b0 ; <+336>
24: [1662784.906 ns] 0x00007ffff71766a1 movq 0x3236c0(%rip), %r12 ; + 24
(lldb) thread trace dump instructions -t -f -e -J -c 4
[
{
"id": 0,
"timestamp_ns": "379567.000000",
"event": "HW clock tick",
"hwClock": 48599428476224707
},
{
"id": 1,
"timestamp_ns": "379569.000000",
"event": "CPU core changed",
"cpuId": 2
},
{
"id": 2,
"timestamp_ns": "390487.000000",
"event": "HW clock tick",
"hwClock": 48599428476246495
},
{
"id": 3,
"timestamp_ns": "1602508.000000",
"event": "HW clock tick",
"hwClock": 48599428478664855
},
{
"id": 4,
"timestamp_ns": "1662745.000000",
"event": "HW clock tick",
"hwClock": 48599428478785046
},
{
"id": 5,
"timestamp_ns": "1662746.995324",
"loadAddress": "0x7ffff7176660",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "endbr64"
},
{
"id": 6,
"timestamp_ns": "1662748.990648",
"loadAddress": "0x7ffff7176664",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "movq"
},
{
"id": 7,
"timestamp_ns": "1662750.985972",
"loadAddress": "0x7ffff717666b",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "pushq"
},
{
"id": 8,
"timestamp_ns": "1662752.981296",
"loadAddress": "0x7ffff717666d",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "pushq"
}
]
```

Differential Revision: https://reviews.llvm.org/D130054

show more ...


# 9f9464e0 29-Jun-2022 Peicong Wu <wupeicong@fb.com>

[trace][intel pt] Measure the time it takes to decode a thread in per-cpu mode

This metric was missing. We were only measuring in per-thread mode, and
this completes the work.

For a sample trace I

[trace][intel pt] Measure the time it takes to decode a thread in per-cpu mode

This metric was missing. We were only measuring in per-thread mode, and
this completes the work.

For a sample trace I have, the `dump info` command shows

```
Timing for this thread:
Decoding instructions: 0.12s

```

I also improved a bit the TaskTime function so that callers don't need to
specify the template argument

Differential Revision: https://reviews.llvm.org/D129249

show more ...


# a7d6c3ef 24-Jun-2022 Walter Erquinigo <wallace@fb.com>

[trace] Make events first class items in the trace cursor and rework errors

We want to include events with metadata, like context switches, and this
requires the API to handle events with payloads (

[trace] Make events first class items in the trace cursor and rework errors

We want to include events with metadata, like context switches, and this
requires the API to handle events with payloads (e.g. information about
such context switches). Besides this, we want to support multiple
similar events between two consecutive instructions, like multiple
context switches. However, the current implementation is not good for this because
we are defining events as bitmask enums associated with specific
instructions. Thus, we need to decouple instructions from events and
make events actual items in the trace, just like instructions and
errors.

- Add accessors in the TraceCursor to know if an item is an event or not
- Modify from the TraceDumper all the way to DecodedThread to support
- Renamed the paused event to disabled.
- Improved the tsc handling logic. I was using an API for getting the tsc from libipt, but that was an overkill that should be used when not processing events manually, but as we are already processing events, we can more easily get the tscs.
event items. Fortunately this simplified many things
- As part of this refactor, I also fixed and long stating issue, which is that some non decoding errors were being inserted in the decoded thread. I changed this so that TraceIntelPT::Decode returns an error if the decoder couldn't be set up proplerly. Then, errors within a trace are actual anomalies found in between instrutions.

All test pass

Differential Revision: https://reviews.llvm.org/D128576

show more ...


Revision tags: llvmorg-14.0.6
# 67c24051 14-Jun-2022 Walter Erquinigo <wallace@fb.com>

[trace][intelpt] Support system-wide tracing [19] - Some other minor improvements

This addresses the issues in diffs [13], [14] and [16]

- Add better documentation
- Fix some castings by making the

[trace][intelpt] Support system-wide tracing [19] - Some other minor improvements

This addresses the issues in diffs [13], [14] and [16]

- Add better documentation
- Fix some castings by making them safer
- Simplify CorrelateContextSwitchesAndIntelPtTraces
- Rename some functions

Differential Revision: https://reviews.llvm.org/D127804

show more ...


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2
# 059f39d2 18-Apr-2022 Walter Erquinigo <wallace@fb.com>

[trace][intel pt] Support events

A trace might contain events traced during the target's execution. For
example, a thread might be paused for some period of time due to context
switches or breakpoin

[trace][intel pt] Support events

A trace might contain events traced during the target's execution. For
example, a thread might be paused for some period of time due to context
switches or breakpoints, which actually force a context switch. Not only
that, a trace might be paused because the CPU decides to trace only a
specific part of the target, like the address filtering provided by
intel pt, which will cause pause events. Besides this case, other kinds
of events might exist.

This patch adds the method `TraceCursor::GetEvents()`` that returns the
list of events that happened right before the instruction being pointed
at by the cursor. Some refactors were done to make this change simpler.

Besides this new API, the instruction dumper now supports the -e flag
which shows pause events, like in the following example, where pauses
happened due to breakpoints.

```
thread #1: tid = 2717361
a.out`main + 20 at main.cpp:27:20
0: 0x00000000004023d9 leaq -0x1200(%rbp), %rax
[paused]
1: 0x00000000004023e0 movq %rax, %rdi
[paused]
2: 0x00000000004023e3 callq 0x403a62 ; std::vector<int, std::allocator<int> >::vector at stl_vector.h:391:7
a.out`std::vector<int, std::allocator<int> >::vector() at stl_vector.h:391:7
3: 0x0000000000403a62 pushq %rbp
4: 0x0000000000403a63 movq %rsp, %rbp
```

The `dump info` command has also been updated and now it shows the
number of instructions that have associated events.

Differential Revision: https://reviews.llvm.org/D123982

show more ...


Revision tags: llvmorg-14.0.1
# bdf3e7e5 08-Apr-2022 Walter Erquinigo <wallace@fb.com>

[trace][intelpt] Add task timer classes

I'm adding two new classes that can be used to measure the duration of long
tasks as process and thread level, e.g. decoding, fetching data from
lldb-server,

[trace][intelpt] Add task timer classes

I'm adding two new classes that can be used to measure the duration of long
tasks as process and thread level, e.g. decoding, fetching data from
lldb-server, etc. In this first patch, I'm using it to measure the time it takes
to decode each thread, which is printed out with the `dump info` command. In a
later patch I'll start adding process-level tasks and I might move these
classes to the upper Trace level, instead of having them in the intel-pt
plugin. I might need to do that anyway in the future when we have to
measure HTR. For now, I want to keep the impact of this change minimal.

With it, I was able to generate the following info of a very big trace:

```
(lldb) thread trace dump info Trace technology: intel-pt

thread #1: tid = 616081
Total number of instructions: 9729366

Memory usage:
Raw trace size: 1024 KiB
Total approximate memory usage (excluding raw trace): 123517.34 KiB
Average memory usage per instruction (excluding raw trace): 13.00 bytes

Timing:
Decoding instructions: 1.62s

Errors:
Number of TSC decoding errors: 0
```

As seen above, it took 1.62 seconds to decode 9.7M instructions. This is great
news, as we don't need to do any optimization work in this area.

Differential Revision: https://reviews.llvm.org/D123357

show more ...


# e0cfe20a 06-Apr-2022 Walter Erquinigo <wallace@fb.com>

[trace][intel pt] Create a common accessor for live and postmortem data

Some parts of the code have to distinguish between live and postmortem threads
to figure out how to get some data, e.g. thread

[trace][intel pt] Create a common accessor for live and postmortem data

Some parts of the code have to distinguish between live and postmortem threads
to figure out how to get some data, e.g. thread trace buffers. This makes the
code less generic and more error prone. An example of that is that we have
two different decoders: LiveThreadDecoder and PostMortemThreadDecoder. They
exist because getting the trace bufer is different for each case.

The problem doesn't stop there. Soon we'll have even more kinds of data, like
the context switch trace, whose fetching will be different for live and post-
mortem processes.

As a way to fix this, I'm creating a common API for accessing thread data,
which is able to figure out how to handle the postmortem and live cases on
behalf of the caller. As a result of that, I was able to eliminate the two
decoders and unify them into a simpler one. Not only that, our TraceSave
functionality only worked for live threads, but now it can also work for
postmortem processes, which might be useful now, but it might in the future.

This common API is OnThreadBinaryDataRead. More information in the inline
documentation.

Differential Revision: https://reviews.llvm.org/D123281

show more ...


# 6423b502 01-Apr-2022 Walter Erquinigo <wallace@fb.com>

[trace][intel pt] Create a class for the libipt decoder wrapper

As we soon will need to decode multiple raw traces for the same thread,
having a class that encapsulates the decoding of a single raw

[trace][intel pt] Create a class for the libipt decoder wrapper

As we soon will need to decode multiple raw traces for the same thread,
having a class that encapsulates the decoding of a single raw trace is
a stepping stone that will make the coming features easier to implement.

So, I'm creating a LibiptDecoder class with that purpose. I refactored
the code and it's now much more readable. Besides that, more comments
were added. With this new structure, it's also easier to implement unit
tests.

Differential Revision: https://reviews.llvm.org/D123106

show more ...