History log of /llvm-project/llvm/lib/IR/MDBuilder.cpp (Results 1 – 25 of 54)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# e03f4271 19-Sep-2024 Jay Foad <jay.foad@amd.com>

[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)

It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all oc

[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)

It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.

show more ...


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8
# 294f3ce5 12-Jun-2024 Paul Kirth <paulkirth@google.com>

Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance o… (#95281)

…f weights" #95136

Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.

T

Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance o… (#95281)

…f weights" #95136

Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.

show more ...


# 607afa0b 11-Jun-2024 Paul Kirth <paulkirth@google.com>

Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" (#95060)

Reverts llvm/llvm-project#86609

This change causes compile-time regressions for stage2 builds
(https://llv

Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" (#95060)

Reverts llvm/llvm-project#86609

This change causes compile-time regressions for stage2 builds
(https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u).
It also introduced unintended changes to `.text` which should be
addressed before relanding.

show more ...


# c5978f1e 10-Jun-2024 Paul Kirth <paulkirth@google.com>

[llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609)

This patch implements the changes to LLVM IR discussed in

https://discourse.llvm.org/t/rfc-update-branch-weights-met

[llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609)

This patch implements the changes to LLVM IR discussed in

https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for
branch weights, which can be used to distinguish weights added from
`llvm.expect*` intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.

show more ...


Revision tags: llvmorg-18.1.7
# 0170bd5d 22-May-2024 Joachim Meyer <5982050+fodinabor@users.noreply.github.com>

[MDBuilder] `mergeCallbackEncodings` fails due to inspecting the wrong node (#92466)

Given the following metadata as, with `!6` as `ExistingCallbacks` and
`!8` as `NewCB`:
```
!6 = !{!7}
!7 = !{

[MDBuilder] `mergeCallbackEncodings` fails due to inspecting the wrong node (#92466)

Given the following metadata as, with `!6` as `ExistingCallbacks` and
`!8` as `NewCB`:
```
!6 = !{!7}
!7 = !{i64 0, i1 false}
!8 = !{i64 2, i64 3, i1 false}
```
The merge function should add `!8` to the list of `!6`, i.e. `!6 =
!{!7,!8}`. However, at the moment the check if this is legal, tries to
interpret `!7` as integer instead of it's operand.

show more ...


Revision tags: llvmorg-18.1.6, llvmorg-18.1.5
# c60aa430 20-Apr-2024 Vitaly Buka <vitalybuka@google.com>

[NFCI][sanitizers][metadata] Exctract create{Unlikely,Likely}BranchWeights (#89464)

We have a lot of repeated code with random constants.
Particular values are not important, the one just needs to

[NFCI][sanitizers][metadata] Exctract create{Unlikely,Likely}BranchWeights (#89464)

We have a lot of repeated code with random constants.
Particular values are not important, the one just needs to be
bigger then another.

UR_NONTAKEN_WEIGHT is selected as it's the most common one.

show more ...


Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# fd29a4d2 22-Mar-2023 wlei <wlei@fb.com>

[Pseudo Probe] Use the name from debug info to compute GUID in probe desc

This is to fix a GUID mismatch while decoding pseudo probe, a GUID from the inline tree is not in the GUID2FuncDescMap. It t

[Pseudo Probe] Use the name from debug info to compute GUID in probe desc

This is to fix a GUID mismatch while decoding pseudo probe, a GUID from the inline tree is not in the GUID2FuncDescMap. It turned out that frontend could change the function name making it different from the one in debug info(https://reviews.llvm.org/D111009). Here change to use the function name from debug info to be consistent with the probe name from the inline stack.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D146657

show more ...


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# e842c06c 03-Dec-2022 Kazu Hirata <kazu@google.com>

[IR] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manu

[IR] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4
# 47b07580 25-Oct-2022 wlei <wlei@fb.com>

[SampleFDO] Persist profile staleness metrics into binary

With https://reviews.llvm.org/D136627, now we have the metrics for profile staleness based on profile statistics, monitoring the profile sta

[SampleFDO] Persist profile staleness metrics into binary

With https://reviews.llvm.org/D136627, now we have the metrics for profile staleness based on profile statistics, monitoring the profile staleness in real-time can help user quickly identify performance issues. For a production scenario, the build is usually incremental and if we want the real-time metrics, we should store/cache all the old object's metrics somewhere and pull them in a post-build time. To make it more convenient, this patch add an option to persist them into the object binary, the metrics can be reported right away by decoding the binary rather than polling the previous stdout/stderrs from a cache system.

For implementation, it writes the statistics first into a new metadata section(llvm.stats) then encode into a special ELF `.llvm_stats` section. The section data is formatted as a list of key/value pair so that future statistics can be easily extended. This is also under a new switch(`-persist-profile-staleness`)

In terms of size overhead, the metrics are computed at module level, so the size overhead should be small, measured on one of our internal service, it costs less than < 1MB for a 10GB+ binary.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D136698

show more ...


Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1
# c70f6e13 06-Sep-2022 Marco Elver <elver@google.com>

[Metadata] Introduce MD_pcsections

Introduces MD_pcsections metadata kind. See added documentation for
more details.

Subsequent patches enable propagating PC sections metadata through code
generati

[Metadata] Introduce MD_pcsections

Introduces MD_pcsections metadata kind. See added documentation for
more details.

Subsequent patches enable propagating PC sections metadata through code
generation to the AsmPrinter.

RFC: https://discourse.llvm.org/t/rfc-pc-keyed-metadata-at-runtime/64191

Reviewed By: dvyukov, vitalybuka

Differential Revision: https://reviews.llvm.org/D130875

show more ...


Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 6678f8e5 27-Jun-2022 Yuanfang Chen <yuanfang.chen@sony.com>

[ubsan] Using metadata instead of prologue data for function sanitizer

Information in the function `Prologue Data` is intentionally opaque.
When a function with `Prologue Data` is duplicated. The se

[ubsan] Using metadata instead of prologue data for function sanitizer

Information in the function `Prologue Data` is intentionally opaque.
When a function with `Prologue Data` is duplicated. The self (global
value) references inside `Prologue Data` is still pointing to the
original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`.

This patch detaches the information from function `Prologue Data`
and attaches it to a function metadata node.

This and D116130 fix https://github.com/llvm/llvm-project/issues/49689.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D115844

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 705a4c14 08-Dec-2020 Hongtao Yu <hoy@fb.com>

[CSSPGO] Pseudo probe encoding and emission.

This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdY

[CSSPGO] Pseudo probe encoding and emission.

This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s

Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. 

**ELF object emission**

The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.

Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.

The format of `.pseudo_probe_desc` section looks like:

```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```

For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :

```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```

To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.

**Assembling**

Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.

A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.

A example assembly looks like:

```
foo2: # @foo2
# %bb.0: # %bb0
pushq %rax
testl %edi, %edi
.pseudoprobe 837061429793323041 1 0 0
je .LBB1_1
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 6 2 0
callq foo
.pseudoprobe 837061429793323041 3 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
.LBB1_1: # %bb1
.pseudoprobe 837061429793323041 5 1 0
callq *%rsi
.pseudoprobe 837061429793323041 2 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
# -- End function
.section .pseudo_probe_desc,"",@progbits
.quad 6699318081062747564
.quad 72617220756
.byte 3
.ascii "foo"
.quad 837061429793323041
.quad 281547593931412
.byte 4
.ascii "foo2"
```

With inlining turned on, the assembly may look different around %bb2 with an inlined probe:

```
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 3 0
.pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6
.pseudoprobe 837061429793323041 4 0
popq %rax
retq
```

**Disassembling**

We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.

An example disassembly looks like:

```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91878

show more ...


# 7ead5f5a 10-Dec-2020 Mitch Phillips <31459023+hctim@users.noreply.github.com>

Revert "[CSSPGO] Pseudo probe encoding and emission."

This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1.

Reason: Broke the ASan buildbots:
http://lab.llvm.org:8011/#/builders/5/builds/

Revert "[CSSPGO] Pseudo probe encoding and emission."

This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1.

Reason: Broke the ASan buildbots:
http://lab.llvm.org:8011/#/builders/5/builds/2269

show more ...


# b035513c 08-Dec-2020 Hongtao Yu <hoy@fb.com>

[CSSPGO] Pseudo probe encoding and emission.

This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdY

[CSSPGO] Pseudo probe encoding and emission.

This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s

Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. 

**ELF object emission**

The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.

Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.

The format of `.pseudo_probe_desc` section looks like:

```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```

For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :

```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```

To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.

**Assembling**

Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.

A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.

A example assembly looks like:

```
foo2: # @foo2
# %bb.0: # %bb0
pushq %rax
testl %edi, %edi
.pseudoprobe 837061429793323041 1 0 0
je .LBB1_1
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 6 2 0
callq foo
.pseudoprobe 837061429793323041 3 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
.LBB1_1: # %bb1
.pseudoprobe 837061429793323041 5 1 0
callq *%rsi
.pseudoprobe 837061429793323041 2 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
# -- End function
.section .pseudo_probe_desc,"",@progbits
.quad 6699318081062747564
.quad 72617220756
.byte 3
.ascii "foo"
.quad 837061429793323041
.quad 281547593931412
.byte 4
.ascii "foo2"
```

With inlining turned on, the assembly may look different around %bb2 with an inlined probe:

```
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 3 0
.pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6
.pseudoprobe 837061429793323041 4 0
popq %rax
retq
```

**Disassembling**

We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.

An example disassembly looks like:

```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91878

show more ...


Revision tags: llvmorg-11.0.1-rc1
# 6861d938 14-Nov-2020 Roman Lebedev <lebedev.ri@gmail.com>

Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"

See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485
the imp

Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"

See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485
the implementation is known-broken for certain inputs,
the bugreport was up for a significant amount of timer,
and there has been no activity to address it.
Therefore, just completely rip out all of misexpect handling.

I suspect, fixing it requires redesigning the internals of MD_misexpect.
Should anyone commit to fixing the implementation problem,
starting from clean slate may be better anyways.

This reverts commit 7bdad08429411e7d0ecd58cd696b1efe3cff309e,
and some of it's follow-ups, that don't stand on their own.

show more ...


# 5c31b8b9 31-Oct-2020 Arthur Eubanks <aeubanks@google.com>

Revert "Use uint64_t for branch weights instead of uint32_t"

This reverts commit 10f2a0d662d8d72eaac48d3e9b31ca8dc90df5a4.

More uint64_t overflows.


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6
# 10f2a0d6 30-Sep-2020 Arthur Eubanks <aeubanks@google.com>

Use uint64_t for branch weights instead of uint32_t

CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64

Use uint64_t for branch weights instead of uint32_t

CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609

show more ...


# 2a4e704c 27-Oct-2020 Nico Weber <thakis@chromium.org>

Revert "Use uint64_t for branch weights instead of uint32_t"

This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b.
Makes clang assert when building Chromium, see https://crbug.com/1142813
fo

Revert "Use uint64_t for branch weights instead of uint32_t"

This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b.
Makes clang assert when building Chromium, see https://crbug.com/1142813
for a repro.

show more ...


# e5766f25 30-Sep-2020 Arthur Eubanks <aeubanks@google.com>

Use uint64_t for branch weights instead of uint32_t

CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64

Use uint64_t for branch weights instead of uint32_t

CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609

show more ...


# d4c667c9 23-Oct-2020 Duncan P. N. Exon Smith <dexonsmith@apple.com>

Avoid unnecessary uses of `MDNode::getTemporary`, NFC

This is a long-delayed follow-up to
5e5b85098dbeaea2cfa5d01695b5d2982634d7dd.

`TempMDNode` includes a bunch of machinery for RAUW, and should o

Avoid unnecessary uses of `MDNode::getTemporary`, NFC

This is a long-delayed follow-up to
5e5b85098dbeaea2cfa5d01695b5d2982634d7dd.

`TempMDNode` includes a bunch of machinery for RAUW, and should only be
used when necessary. RAUW wasn't being used in any of these cases... it
was just a placeholder for a self-reference.

Where the real node was using `MDNode::getDistinct`, just replace the
temporary argument with `nullptr`.

Where the real node was using `MDNode::get`, the `replaceOperandWith`
call was "promoting" the node to a distinct one implicitly due to
self-reference detection in `MDNode::handleChangedOperand`. The
`TempMDNode` was serving a purpose by delaying uniquing, but it's way
simpler to just call `MDNode::getDistinct` in the first place.

Note that using a self-reference at all in these places is a hold-over
from before `distinct` metadata existed. It was an old trick to create
distinct nodes. It would be intrusive to change, including bitcode
upgrades, etc., and it's harmless so I'm not sure there's much value in
removing it from existing schemas. After this commit it still has a tiny
memory cost (in the extra metadata operand) but no more overhead in
construction.

Differential Revision: https://reviews.llvm.org/D90079

show more ...


Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# ba2e72c5 28-Mar-2020 Benjamin Kramer <benny.kra@googlemail.com>

[MDBuilder] Don't use stable sort for sorting integers.


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5
# 7bdad084 11-Sep-2019 Petr Hosek <phosek@chromium.org>

Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"

This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by

Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"

This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300

We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.

We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.

A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.

In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect

Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324

llvm-svn: 371635

show more ...


# 57256af3 11-Sep-2019 Dmitri Gribenko <gribozavr@gmail.com>

Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"

This reverts commit r371584. It introduced a dependency from compiler-rt
to llvm/include/ADT, which is problema

Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"

This reverts commit r371584. It introduced a dependency from compiler-rt
to llvm/include/ADT, which is problematic for multiple reasons.

One is that it is a novel dependency edge, which needs cross-compliation
machinery for llvm/include/ADT (yes, it is true that right now
compiler-rt included only header-only libraries, however, if we allow
compiler-rt to depend on anything from ADT, other libraries will
eventually get used).

Secondly, depending on ADT from compiler-rt exposes ADT symbols from
compiler-rt, which would cause ODR violations when Clang is built with
the profile library.

llvm-svn: 371598

show more ...


# 394a8ed8 11-Sep-2019 Petr Hosek <phosek@chromium.org>

clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM

This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing

clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM

This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300

We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.

We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.

A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.

In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect

Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324

llvm-svn: 371584

show more ...


Revision tags: llvmorg-9.0.0-rc4
# 7d1757ab 10-Sep-2019 Petr Hosek <phosek@chromium.org>

Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"

This reverts commit r371484: this broke sanitizer-x86_64-linux-fast bot.

llvm-svn: 371488


123