#
ff5e2bab |
| 06-Oct-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Improve handling of relocations targeting specific instructions (#66395)
On RISC-V, there are certain relocations that target a specific
instruction instead of a more abstract location like
[BOLT] Improve handling of relocations targeting specific instructions (#66395)
On RISC-V, there are certain relocations that target a specific
instruction instead of a more abstract location like a function or basic
block. Take the following example that loads a value from symbol `foo`:
```
nop
1: auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(1b)(t0)
```
This results in two relocation:
- auipc: `R_RISCV_PCREL_HI20` referencing `foo`;
- ld: `R_RISCV_PCREL_LO12_I` referencing to local label `1` which points
to the auipc instruction.
It is of utmost importance that the `R_RISCV_PCREL_LO12_I` keeps
referring to the auipc instruction; if not, the program will fail to
assemble. However, BOLT currently does not guarantee this.
BOLT currently assumes that all local symbols are jump targets and
always starts a new basic block at symbol locations. The example above
results in a CFG the looks like this:
```
.BB0:
nop
.BB1:
auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(.BB1)(t0)
```
While this currently works (i.e., the `R_RISCV_PCREL_LO12_I` relocation
points to the correct instruction), it has two downsides:
- Too many basic blocks are created (the example above is logically only
one yet two are created);
- If instructions are inserted in `.BB1` (e.g., by instrumentation),
things will break since the label will not point to the auipc anymore.
This patch proposes to fix this issue by teaching BOLT to track labels
that should always point to a specific instruction. This is implemented
as follows:
- Add a new annotation type (`kLabel`) that allows us to annotate
instructions with an `MCSymbol *`;
- Whenever we encounter a relocation type that is used to refer to a
specific instruction (`Relocation::isInstructionReference`), we
register it without a symbol;
- During disassembly, whenever we encounter an instruction with such a
relocation, create a symbol for its target and store it in an offset
to symbol map (to ensure multiple relocations referencing the same
instruction use the same label);
- After disassembly, iterate this map to attach labels to instructions
via the new annotation type;
- During emission, emit these labels right before the instruction.
I believe the use of annotations works quite well for this use case as
it allows us to reliably track instruction labels. If we were to store
them as offsets in basic blocks, it would be error prone to keep them
updated whenever instructions are inserted or removed.
I have chosen to add labels as first-class annotations (as opposed to a
generic one) because the documentation of `MCAnnotation` suggests that
generic annotations are to be used for optional metadata that can be
discarded without affecting correctness. As this is not the case for
labels, a first-class annotation seemed more appropriate.
show more ...
|
#
31e8a9f4 |
| 07-Jul-2023 |
spupyrev <spupyrev@fb.com> |
[BOLT] Add stale-related logging
Adding some logs related to stale profile matching. The new data can be helpful to understand how "stale" the input profile is and how well the inference is able to
[BOLT] Add stale-related logging
Adding some logs related to stale profile matching. The new data can be helpful to understand how "stale" the input profile is and how well the inference is able to utilize the stale data.
Example of outputs on clang-10 built with LTO (profile collected on a year-old release): ``` BOLT-INFO: inferred profile for 2101 (18.52% of profiled, 100.00% of stale) functions responsible for 30.95% samples (14754697 out of 47670654) BOLT-INFO: stale inference matched 89.42% of basic blocks (79052 out of 88402 stale) responsible for 76.99% samples (645737 out of 838719 stale) ```
LTO+AutoFDO: ``` BOLT-INFO: inferred profile for 6146 (57.57% of profiled, 100.00% of stale) functions responsible for 90.34% samples (50891403 out of 56330313) BOLT-INFO: stale inference matched 74.55% of basic blocks (191295 out of 256589 stale) responsible for 57.30% samples (1288632 out of 2248799 stale) ```
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D154737
show more ...
|
#
deb53102 |
| 20-Jun-2023 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Remove unnecessary diagnostics
When optimizations passes do not change anything, skip their diagnostics output. NFC otherwise.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.
[BOLT] Remove unnecessary diagnostics
When optimizations passes do not change anything, skip their diagnostics output. NFC otherwise.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D153386
show more ...
|
#
44268271 |
| 16-Feb-2023 |
spupyrev <spupyrev@fb.com> |
[BOLT] stale profile matching [part 1 out of 2]
BOLT often has to deal with profiles collected on binaries built from several revisions behind release. As a result, a certain percentage of functions
[BOLT] stale profile matching [part 1 out of 2]
BOLT often has to deal with profiles collected on binaries built from several revisions behind release. As a result, a certain percentage of functions is considered stale and not optimized. This diff adds an ability to match profile to functions that are not 100% binary identical, which increases the optimization coverage and boosts the performance of applications.
The algorithm consists of two phases: matching and inference: - At the matching phase, we try to "guess" as many block and jump counts from the stale profile as possible. To this end, the content of each basic block is hashed and stored in the (yaml) profile. When BOLT optimizes a binary, it computes block hashes and identifies the corresponding entries in the stale profile. It yields a partial profile for every CFG in the binary. - At the inference phase, we employ a network flow-based algorithm (profi) to reconstruct "realistic" block and jump counts from the partial profile generated at the first stage. In practice, we don't always produce proper profile data but the majority (e.g., >90%) of CFGs get the correct counts.
This is a first part of the change; the next stacked diff extends the block hashing and provides perf evaluation numbers.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D144500
show more ...
|
#
62a2feff |
| 16-May-2023 |
Rafael Auler <rafaelauler@fb.com> |
[BOLT] Fix state of MCSymbols in lowering pass
We have mostly harmless data races when running BinaryContext::calculateEmittedSize() in parallel, while performing split function pass. However, it i
[BOLT] Fix state of MCSymbols in lowering pass
We have mostly harmless data races when running BinaryContext::calculateEmittedSize() in parallel, while performing split function pass. However, it is possible to end up in a state where some MCSymbols are still registered and our clean up failed. This happens rarely but it does happen, and when it happens, it is a difficult to diagnose heisenbug. To avoid this, add a new clean pass to perform a last check on MCSymbols, before they undergo our final emission pass, to verify that they are in a sane state. If we fail to do this, we might resolve some symbols to zero and crash the output binary.
Reviewed By: #bolt, Amir
Differential Revision: https://reviews.llvm.org/D137984
show more ...
|
#
be2f67c4 |
| 07-Feb-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Replace anonymous namespace functions with static
Follow LLVM Coding Standards guideline on using anonymous namespaces (https://llvm.org/docs/CodingStandards.html#anonymous-namespaces) a
[BOLT][NFC] Replace anonymous namespace functions with static
Follow LLVM Coding Standards guideline on using anonymous namespaces (https://llvm.org/docs/CodingStandards.html#anonymous-namespaces) and use `static` modifier for function definitions.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D143124
show more ...
|
#
69a9bbf1 |
| 18-Jan-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Replace ambiguous BinarySection::isReadOnly with isWritable
Address feedback in https://reviews.llvm.org/D102284#2755060
Reviewed By: yota9
Differential Revision: https://reviews.llvm.
[BOLT][NFC] Replace ambiguous BinarySection::isReadOnly with isWritable
Address feedback in https://reviews.llvm.org/D102284#2755060
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D141733
show more ...
|
#
27cf96c4 |
| 10-Jan-2023 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Minor refactoring for -print-sorted-by option
Only display used values for -print-sorted-by option when printing help.
Differential Revision: https://reviews.llvm.org/D141209
|
#
e3b47d31 |
| 10-Jan-2023 |
hezuoqiang <hezuoqiang2@huawei.com> |
[BOLT] Modify the print option to a meaningful value
Using the option `-print-sorted-by=.` cause to core dump, so change to a legal value.
Reviewed By: maksfb
Differential Revision: https://review
[BOLT] Modify the print option to a meaningful value
Using the option `-print-sorted-by=.` cause to core dump, so change to a legal value.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D140847
show more ...
|
#
9966b3e7 |
| 30-Sep-2022 |
Gabriel Ravier <gabravier@gmail.com> |
[BOLT] Fixed some typos
I went over the output of the following mess of a command:
`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --igno
[BOLT] Fixed some typos
I went over the output of the following mess of a command:
`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`
and proceeded to spend a few days looking at it to find probable typos and fixed a few hundred of them in all of the llvm project (note, the ones I found are not anywhere near all of them, but it seems like a good start).
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D130824
show more ...
|
#
90d87dbf |
| 28-Sep-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Report BB reordering %-age vs profiled and total number of functions
Reviewed By: spupyrev
Differential Revision: https://reviews.llvm.org/D134819
|
#
873942e1 |
| 08-Sep-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Change reorder-blocks deprecated option warning output
Revert to using `BOLT-WARNING`
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D132778
|
#
07f63b0a |
| 25-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy object to access basic block ranges to a heap-allocated object that can store fragment-s
[BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy object to access basic block ranges to a heap-allocated object that can store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
show more ...
|
#
d5c03def |
| 24-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Towards FunctionLayout const-correctness
A const-qualified reference to function layout allows accessing non-const qualified basic blocks on a const-qualified function. This patch adds or rem
[BOLT] Towards FunctionLayout const-correctness
A const-qualified reference to function layout allows accessing non-const qualified basic blocks on a const-qualified function. This patch adds or removes const-qualifiers where necessary to indicate where basic blocks are used in a non-const manner.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132049
show more ...
|
#
f24c299e |
| 24-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
Revert "[BOLT] Towards FunctionLayout const-correctness"
This reverts commit 587d2653420d75ef10f30bd612d86f1e08fe9ea7.
|
#
5065134a |
| 24-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
Revert "[BOLT] Allocate FunctionFragment on heap"
This reverts commit 101344af1af82d1633c773b718788eaa813d7f79.
|
#
101344af |
| 24-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy object to access basic block ranges to a heap-allocated object that can store fragment-s
[BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy object to access basic block ranges to a heap-allocated object that can store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
show more ...
|
#
587d2653 |
| 24-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Towards FunctionLayout const-correctness
A const-qualified reference to function layout allows accessing non-const qualified basic blocks on a const-qualified function. This patch adds or rem
[BOLT] Towards FunctionLayout const-correctness
A const-qualified reference to function layout allows accessing non-const qualified basic blocks on a const-qualified function. This patch adds or removes const-qualifiers where necessary to indicate where basic blocks are used in a non-const manner.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132049
show more ...
|
#
a191ea7d |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Make exception handling fragment aware
This adds basic fragment awareness in the exception handling passes and generates the necessary symbols for fragments.
Reviewed By: rafauler
Different
[BOLT] Make exception handling fragment aware
This adds basic fragment awareness in the exception handling passes and generates the necessary symbols for fragments.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D130520
show more ...
|
#
aed75748 |
| 17-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Remove old layout from function layout
To track whether a function's new layout is different from its old layout when updating it, the old layout would be kept around in memory indefinitely (
[BOLT] Remove old layout from function layout
To track whether a function's new layout is different from its old layout when updating it, the old layout would be kept around in memory indefinitely (if the new layout is different). This was used only for debugging/logging purposes. This patch forces the caller of function layout's update method to copy the old layout into a temporary if they need it by removing the old layout fields.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D131413
show more ...
|
#
8477bc67 |
| 17-Jul-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to
[BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to a strict hot/cold split).
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129518
show more ...
|
#
d55dfeaf |
| 14-Jul-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Replace uses of layout with basic block list
As we are moving towards support for multiple fragments, loops that iterate over all basic blocks of a function, but do not depend on the order of
[BOLT] Replace uses of layout with basic block list
As we are moving towards support for multiple fragments, loops that iterate over all basic blocks of a function, but do not depend on the order of basic blocks in the final layout, should iterate over binary functions directly, rather than the layout.
Eventually, all loops using the layout list should either iterate over the function, or be aware of multiple layouts. This patch replaces references to binary function's block layout with the binary function itself where only little code changes are necessary.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129585
show more ...
|
#
35efe1d8 |
| 06-Jul-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT.
Vladislav Khmelevsky, Advanced Software Techn
[BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT.
Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D129260
show more ...
|
#
fc2d96c3 |
| 29-Jun-2022 |
Rafael Auler <rafaelauler@fb.com> |
Revert "[BOLT][AArch64] Handle gold linker veneers"
This reverts commit 425dda76e9fac93117289fd68a2abdfb1e4a0ba5.
This commit is currently causing BOLT to crash in one of our binaries and needs a b
Revert "[BOLT][AArch64] Handle gold linker veneers"
This reverts commit 425dda76e9fac93117289fd68a2abdfb1e4a0ba5.
This commit is currently causing BOLT to crash in one of our binaries and needs a bit more checking to make sure it is safe to land.
show more ...
|
#
425dda76 |
| 15-Jun-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT.
Vladislav Khmelevsky, Advanced Software Techn
[BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT.
Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D128082
show more ...
|