#
2bb511e2 |
| 11-Jan-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Print BAT section size (#76897)
Test Plan: Updated bolt/test/X86/bolt-address-translation.test
|
#
ad8fd5b1 |
| 14-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[BOLT] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.
|
#
9596676e |
| 09-Dec-2023 |
Nathan Sidwell <nathan@acm.org> |
[BOLT] Determine address size from binary (#74870)
Query the executable for address size.
|
#
c43d0432 |
| 30-Nov-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] Create .text.warm for 3-way splitting (#73863)
This commit explicitly adds a warm code section, .text.warm, when
-split-functions -split-strategy=cdsplit is used. This replaces the
previous approach of using .text.cold.0 as warm and .text.cold.1 as cold
in 3-way function splitting. NFC.
|
#
c5a306f0 |
| 15-Nov-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Fix LSDA section handling (#71821)
Currently BOLT finds the LSDA section by its name, .gcc_except_table.
But sometimes the name has a suffix, e.g. .gcc_except_table.main. Find
the LSDA section by its address rather than by its name.
Fixes #71804
|
#
2db9b6a9 |
| 13-Nov-2023 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Make instruction size a first-class annotation (#72167)
When NOP instructions are used to reserve space in the code, e.g. for
patching, it becomes critical to preserve their original size while
emitting the code. On x86, we rely on "Size" annotation for NOP
instructions size, as the original instruction size is lost in the
disassembly/assembly process.
This change makes instruction size a first-class annotation and is
effectively NFCI. A follow-up diff will use the annotation for code
emission.
|
#
cf18f142 |
| 10-Nov-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Read .rela.dyn in static non-pie binary (#71635)
A static non-PIE binary doesn't have a DYNAMIC segment, so BOLT skips
reading the .rela.dyn section. But such binaries might have
this section, for example to store IFUNC relocations that are resolved
by linked-in startup files, so force reading this section for static
executables.
|
#
1a2f8336 |
| 09-Nov-2023 |
spaette <111918424+spaette@users.noreply.github.com> |
[BOLT] Fix typos (#68121)
Closes https://github.com/llvm/llvm-project/issues/63097
Before merging please make sure the change to
bolt/include/bolt/Passes/StokeInfo.h is correct.
bolt/include/bolt/Passes/StokeInfo.h
```diff
// This Pass solves the two major problems to use the Stoke program without
- // proting its code:
+ // probing its code:
```
I'm still not happy about the awkward wording in this comment.
bolt/include/bolt/Passes/FixRelaxationPass.h
```
$ ed -s bolt/include/bolt/Passes/FixRelaxationPass.h <<<'9,12p'
// This file declares the FixRelaxations class, which locates instructions with
// wrong targets and fixes them. Such problems usually occures when linker
// relaxes (changes) instructions, but doesn't fix relocations types properly
// for them.
$
```
bolt/docs/doxygen.cfg.in
bolt/include/bolt/Core/BinaryContext.h
bolt/include/bolt/Core/BinaryFunction.h
bolt/include/bolt/Core/BinarySection.h
bolt/include/bolt/Core/DebugData.h
bolt/include/bolt/Core/DynoStats.h
bolt/include/bolt/Core/Exceptions.h
bolt/include/bolt/Core/MCPlusBuilder.h
bolt/include/bolt/Core/Relocation.h
bolt/include/bolt/Passes/FixRelaxationPass.h
bolt/include/bolt/Passes/InstrumentationSummary.h
bolt/include/bolt/Passes/ReorderAlgorithm.h
bolt/include/bolt/Passes/StackReachingUses.h
bolt/include/bolt/Passes/StokeInfo.h
bolt/include/bolt/Passes/TailDuplication.h
bolt/include/bolt/Profile/DataAggregator.h
bolt/include/bolt/Profile/DataReader.h
bolt/lib/Core/BinaryContext.cpp
bolt/lib/Core/BinarySection.cpp
bolt/lib/Core/DebugData.cpp
bolt/lib/Core/DynoStats.cpp
bolt/lib/Core/Relocation.cpp
bolt/lib/Passes/Instrumentation.cpp
bolt/lib/Passes/JTFootprintReduction.cpp
bolt/lib/Passes/ReorderData.cpp
bolt/lib/Passes/RetpolineInsertion.cpp
bolt/lib/Passes/ShrinkWrapping.cpp
bolt/lib/Passes/TailDuplication.cpp
bolt/lib/Rewrite/BoltDiff.cpp
bolt/lib/Rewrite/DWARFRewriter.cpp
bolt/lib/Rewrite/RewriteInstance.cpp
bolt/lib/Utils/CommandLineOpts.cpp
bolt/runtime/instr.cpp
bolt/test/AArch64/got-ld64-relaxation.test
bolt/test/AArch64/unmarked-data.test
bolt/test/X86/Inputs/dwarf5-cu-no-debug-addr-helper.s
bolt/test/X86/Inputs/linenumber.cpp
bolt/test/X86/double-jump.test
bolt/test/X86/dwarf5-call-pc-function-null-check.test
bolt/test/X86/dwarf5-split-dwarf4-monolithic.test
bolt/test/X86/dynrelocs.s
bolt/test/X86/fallthrough-to-noop.test
bolt/test/X86/tail-duplication-cache.s
bolt/test/runtime/X86/instrumentation-ind-calls.s
|
#
96b5e092 |
| 08-Nov-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Support instrumentation hook via DT_FINI_ARRAY (#67348)
BOLT currently hooks its instrumentation finalization function via
`DT_FINI`. However, this method of calling finalization routines is not
supported anymore on newer ABIs like RISC-V. `DT_FINI_ARRAY` is
preferred there.
This patch adds support for hooking into `DT_FINI_ARRAY` instead if the
binary does not have a `DT_FINI` entry. If it does, `DT_FINI` takes
precedence so this patch should not change how the currently supported
instrumentation targets behave.
`DT_FINI_ARRAY` points to an array in memory of `DT_FINI_ARRAYSZ` bytes.
It consists of pointer-length entries that contain the addresses of
finalization functions. However, the addresses are only filled-in by the
dynamic linker at load time using relative relocations. This makes
hooking via `DT_FINI_ARRAY` a bit more complicated than via `DT_FINI`.
The implementation works as follows:
- While scanning the binary: find the section where `DT_FINI_ARRAY`
points to, read its first dynamic relocation and use its addend to find
the address of the fini function we will use to hook;
- While writing the output file: overwrite the addend of the dynamic
relocation with the address of the runtime library's fini function.
Updating the dynamic relocation required a bit of boilerplate: since
dynamic relocations are stored in a `std::multiset`, which doesn't
support getting mutable references to its items, functions were added to
`BinarySection` to take an existing relocation and insert a new one.
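
The take-and-reinsert pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not BOLT's code: `Reloc` and `replaceAddend` are hypothetical names, and the helpers actually added to `BinarySection` may look different.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical, simplified stand-in for a dynamic relocation record.
struct Reloc {
  uint64_t Offset;
  uint64_t Addend;
  // Ordering (and therefore lookup) depends on the offset only.
  bool operator<(const Reloc &Other) const { return Offset < Other.Offset; }
};

// Elements of a std::multiset are const, so an update has to be done as
// "copy out, erase, modify, insert". Returns false if no relocation is
// stored at the given offset.
bool replaceAddend(std::multiset<Reloc> &Relocs, uint64_t Offset,
                   uint64_t NewAddend) {
  auto It = Relocs.find(Reloc{Offset, 0});
  if (It == Relocs.end())
    return false;
  Reloc Updated = *It; // copy the existing entry
  Relocs.erase(It);    // erases only this one element
  Updated.Addend = NewAddend;
  Relocs.insert(Updated);
  assert(Relocs.count(Updated) >= 1 && "the entry must be back in the set");
  return true;
}
```

Erasing by iterator rather than by value matters in a multiset: erasing by value would drop every element that compares equal.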
|
#
e2f1a95f |
| 08-Nov-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][AArch64] Handle IFUNCS properly (#71104)
Until now we were testing only binaries compiled with O0, which
results in an indirect call to the IFUNC trampoline, and the trampoline has
an IFUNC symbol associated with it. Compiling with O3 results in a direct
call to the IFUNC trampoline with no symbol associated with it; the
IFUNC symbol address becomes the same as the IFUNC resolver address. Since
no symbol was associated, the BF was not created before PLT analysis, and
by the algorithm we go on to analyze the target relocation. As we're
expecting a JUMP relocation, we're also expecting the symbol associated
with it to be present. But for IFUNC an IRELATIVE
relocation is used and no symbol is associated with it; the addend value
points to the target symbol, so we need to find the BF using it and use
its symbol in this situation. Currently this is checked only for the
AArch64 platform, so I've limited the code to use this logic only for
this platform, although I wouldn't be surprised if other platforms need
to activate this logic too.
|
#
8244ff67 |
| 24-Oct-2023 |
maksfb <maks@fb.com> |
[BOLT] Fix incorrect basic block output addresses (#70000)
Some optimization passes may duplicate basic blocks and assign the same
input offset to a number of different blocks in a function. This is done
e.g. to correctly map debugging ranges for duplicated code.
However, duplicate input offsets present a problem when we use
AddressMap to generate new addresses for basic blocks. The output
address is calculated based on the input offset and will be the same for
blocks with identical offsets. The result is potentially incorrect debug
info and BAT records.
To address the issue, we have to eliminate the dependency on input
offsets while generating output addresses for a basic block. Each block
has a unique label, hence we extend AddressMap to include address lookup
based on MCSymbol and use the new functionality to update block
addresses.
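
The ambiguity described above, and the label-based lookup that avoids it, can be illustrated with a small sketch. Everything here is hypothetical: `AddressMapSketch` and its members are invented for the example, labels are plain strings rather than `MCSymbol` pointers, and the real `AddressMap` interface may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical sketch, not BOLT's AddressMap.
class AddressMapSketch {
  // Duplicated blocks share an input offset, so keys may repeat here.
  std::multimap<uint64_t, uint64_t> ByInputOffset;
  // Every block has its own label, so keys are unique here.
  std::unordered_map<std::string, uint64_t> ByLabel;

public:
  void add(uint64_t InputOffset, const std::string &Label,
           uint64_t OutputAddress) {
    ByInputOffset.emplace(InputOffset, OutputAddress);
    const bool Inserted = ByLabel.emplace(Label, OutputAddress).second;
    assert(Inserted && "block labels must be unique");
    (void)Inserted;
  }

  // Ambiguous for duplicated blocks: only the first entry is returned.
  bool lookupByOffset(uint64_t InputOffset, uint64_t &Out) const {
    auto It = ByInputOffset.lower_bound(InputOffset);
    if (It == ByInputOffset.end() || It->first != InputOffset)
      return false;
    Out = It->second;
    return true;
  }

  // Unambiguous: each label maps to exactly one output address.
  bool lookupByLabel(const std::string &Label, uint64_t &Out) const {
    auto It = ByLabel.find(Label);
    if (It == ByLabel.end())
      return false;
    Out = It->second;
    return true;
  }
};
```

With two blocks registered at the same input offset, the offset lookup yields the same address for both, while the label lookup distinguishes them.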
|
#
c67b8628 |
| 16-Oct-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT][RISCV] Don't create function entry points for unnamed symbols (#68977)
Unnamed symbols are used, for example, for debug info related
relocations on RISC-V.
|
#
c6731d38 |
| 10-Oct-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][runtime] Add start & fini symbols (#68505)
Add the missing start & fini symbols, currently set by BOLT for runtime
libraries at DT_INIT and DT_FINI. Proper tests will be added by the
https://github.com/llvm/llvm-project/pull/67348 PR.
|
#
ff5e2bab |
| 06-Oct-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Improve handling of relocations targeting specific instructions (#66395)
On RISC-V, there are certain relocations that target a specific
instruction instead of a more abstract location like a function or basic
block. Take the following example that loads a value from symbol `foo`:
```
nop
1: auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(1b)(t0)
```
This results in two relocations:
- auipc: `R_RISCV_PCREL_HI20` referencing `foo`;
- ld: `R_RISCV_PCREL_LO12_I` referencing local label `1`, which points
to the auipc instruction.
It is of utmost importance that the `R_RISCV_PCREL_LO12_I` keeps
referring to the auipc instruction; if not, the program will fail to
assemble. However, BOLT currently does not guarantee this.
BOLT currently assumes that all local symbols are jump targets and
always starts a new basic block at symbol locations. The example above
results in a CFG that looks like this:
```
.BB0:
nop
.BB1:
auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(.BB1)(t0)
```
While this currently works (i.e., the `R_RISCV_PCREL_LO12_I` relocation
points to the correct instruction), it has two downsides:
- Too many basic blocks are created (the example above is logically only
one yet two are created);
- If instructions are inserted in `.BB1` (e.g., by instrumentation),
things will break since the label will not point to the auipc anymore.
This patch proposes to fix this issue by teaching BOLT to track labels
that should always point to a specific instruction. This is implemented
as follows:
- Add a new annotation type (`kLabel`) that allows us to annotate
instructions with an `MCSymbol *`;
- Whenever we encounter a relocation type that is used to refer to a
specific instruction (`Relocation::isInstructionReference`), we
register it without a symbol;
- During disassembly, whenever we encounter an instruction with such a
relocation, create a symbol for its target and store it in an offset
to symbol map (to ensure multiple relocations referencing the same
instruction use the same label);
- After disassembly, iterate this map to attach labels to instructions
via the new annotation type;
- During emission, emit these labels right before the instruction.
I believe the use of annotations works quite well for this use case as
it allows us to reliably track instruction labels. If we were to store
them as offsets in basic blocks, it would be error prone to keep them
updated whenever instructions are inserted or removed.
I have chosen to add labels as first-class annotations (as opposed to a
generic one) because the documentation of `MCAnnotation` suggests that
generic annotations are to be used for optional metadata that can be
discarded without affecting correctness. As this is not the case for
labels, a first-class annotation seemed more appropriate.
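
The label-tracking steps listed in this message (one label per referenced offset, attachment after disassembly, emission right before the instruction) can be sketched roughly as below. This is an illustration only: `Inst`, `InstructionLabels`, `emit`, and the `.Linst` naming scheme are hypothetical, and strings stand in for `MCSymbol *` and for the `kLabel` annotation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical instruction record; Label plays the role of the annotation.
struct Inst {
  uint64_t Offset;
  std::string Text;
  std::string Label; // empty when no relocation refers to this instruction
};

class InstructionLabels {
  std::map<uint64_t, std::string> OffsetToLabel;

public:
  // During disassembly: all relocations that reference the same offset
  // receive the same label.
  const std::string &getOrCreate(uint64_t Offset) {
    auto It = OffsetToLabel.find(Offset);
    if (It == OffsetToLabel.end())
      It = OffsetToLabel
               .emplace(Offset, ".Linst" + std::to_string(OffsetToLabel.size()))
               .first;
    return It->second;
  }

  // After disassembly: attach each label to the instruction at its offset.
  void attach(std::vector<Inst> &Insts) const {
    std::size_t Attached = 0;
    for (Inst &I : Insts) {
      auto It = OffsetToLabel.find(I.Offset);
      if (It == OffsetToLabel.end())
        continue;
      I.Label = It->second;
      ++Attached;
    }
    assert(Attached == OffsetToLabel.size() && "label without an instruction");
    (void)Attached;
  }
};

// During emission: a label is printed right before its instruction, so it
// keeps pointing at that instruction even if others are inserted around it.
std::string emit(const std::vector<Inst> &Insts) {
  std::string Out;
  for (const Inst &I : Insts) {
    if (!I.Label.empty())
      Out += I.Label + ":\n";
    Out += "  " + I.Text + "\n";
  }
  return Out;
}
```

Because the label travels with the instruction rather than with a block offset, inserting an instruction before the `auipc` does not change what the label refers to.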
|
#
8fb83bf5 |
| 06-Oct-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT][NFC] Add MCSubtargetInfo to MCPlusBuilder (#68223)
On RISC-V, it's helpful to have access to `MCSubtargetInfo` while
generating instructions in `MCPlusBuilder`. For example, a return
instruction might be generated differently based on if the target
supports compressed instructions (`c.jr ra`) or not (`jalr ra`).
|
#
c7d6d622 |
| 05-Oct-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT][RISCV] Implement TLS le/ie relocations (#67112)
Handle the following relocations related to TLS local-exec and
initial-exec:
- R_RISCV_TLS_GOT_HI20
- R_RISCV_TPREL_HI20
- R_RISCV_TPREL_ADD
- R_RISCV_TPREL_LO12_I
- R_RISCV_TPREL_LO12_S
In addition, GNU ld has a quirk where after TLS le relaxation, two
unofficial relocation types may be emitted:
- R_RISCV_TPREL_I
- R_RISCV_TPREL_S
Since they are unofficial (defined in the reserved range of relocation
types), LLVM does not define them. Hence, I've defined them locally in
BOLT in a private namespace.
|
#
8fd02d54 |
| 05-Oct-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Fix 32-bit overflow in checkOffsets/checkVMA (#68274)
|
#
853e126c |
| 18-Aug-2023 |
Rafael Auler <rafaelauler@fb.com> |
[BOLT] Support input binaries that use R_X86_GOTPC64
In large code model, the address of GOT is calculated by the static linker via R_X86_GOTPC64 reloc applied against a MOVABSQ instruction. In the final binary, it can be disassembled as a regular immediate, but because such immediate is the result of PC-relative pointer arithmetic, we need to parse this relocation and update this calculation whenever we move code, otherwise we break the code trying to read GOT.
A test case showing how GOT is accessed was provided.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D158911
|
#
0053cb8e |
| 27-Sep-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Fix .relr section addend patching
Previously, the new relocation offset used in .relr section patching was calculated incorrectly. Pass the new file offset to the lambda instead of recalculating it there. The test removes a relocation from the mytext section, so with a wrong offset calculation the right addend value would not be emitted in the expected place, i.e. at the new relocation offset.
Differential Revision: https://reviews.llvm.org/D159543
|
#
9b4328fb |
| 17-Sep-2023 |
Maksim Panchenko <maks@fb.com> |
[BOLT][NFC] Refactor RI::discoverFileObjects()
Minor refactoring to delete redundant code.
Reviewed By: jobnoorman
Differential Revision: https://reviews.llvm.org/D159525
|
#
1e9b006a |
| 17-Sep-2023 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Speedup symbol table sort
Memoize SymbolRef::getAddress() for sorting symbol table entries by their address. Saves about 10 seconds of processing time on large binaries with over 2 million symbols. NFCI.
Reviewed By: jobnoorman, Amir
Differential Revision: https://reviews.llvm.org/D159524
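
The memoization idea can be sketched generically: compute each sort key once up front, then sort on the cached keys so the comparator never calls the expensive query. This is a hypothetical helper (`sortByMemoizedKey` is not a BOLT or LLVM function), and it assumes keys fit in `uint64_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Sorts Items by Key(Item), calling Key exactly once per item instead of
// O(N log N) times from inside the comparator.
template <typename T, typename KeyFn>
void sortByMemoizedKey(std::vector<T> &Items, KeyFn Key) {
  std::vector<std::pair<uint64_t, T>> Keyed;
  Keyed.reserve(Items.size());
  for (T &Item : Items) {
    const uint64_t K = Key(Item); // the only call to Key for this item
    Keyed.emplace_back(K, std::move(Item));
  }
  // Compare cached keys only; stable_sort keeps equal keys in input order.
  std::stable_sort(Keyed.begin(), Keyed.end(),
                   [](const std::pair<uint64_t, T> &A,
                      const std::pair<uint64_t, T> &B) {
                     return A.first < B.first;
                   });
  assert(Keyed.size() == Items.size());
  for (std::size_t I = 0; I < Items.size(); ++I)
    Items[I] = std::move(Keyed[I].second);
}
```

The trade-off is one extra vector of keys in exchange for avoiding repeated key computation, which pays off when the key query is costly and the input is large.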
|
#
473b9dd4 |
| 16-Sep-2023 |
zhoujiapeng <zjpzhoujiapeng@163.com> |
[BOLT] Incorporate umask into the output file permission
Fix https://github.com/llvm/llvm-project/issues/65061
Reviewed By: maksfb, Amir
Differential Revision: https://reviews.llvm.org/D159407
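
As a rough illustration of what incorporating the umask means, the sketch below masks requested permission bits with the process umask on a POSIX system. The function names are hypothetical and this is not the code from the patch, which may obtain and apply the mask differently.

```cpp
#include <cassert>
#include <sys/stat.h>
#include <sys/types.h>

// Permission bits that remain after a umask is applied.
mode_t applyUmask(mode_t Requested, mode_t Mask) {
  return Requested & ~Mask & 0777;
}

// POSIX has no read-only getter for the umask, so set it and restore it
// immediately. Note that this is not safe if other threads create files
// at the same time.
mode_t currentUmask() {
  const mode_t Mask = umask(0);
  const mode_t Previous = umask(Mask); // restore the original mask
  assert(Previous == 0 && "the temporary mask should have been 0");
  (void)Previous;
  return Mask;
}
```

For example, a requested mode of 0777 under a umask of 022 yields 0755, which matches what a file created by an ordinary tool would get.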
|
#
4a6426a8 |
| 14-Sep-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Simplify RI::selectFunctionsToProcess
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D159516
|
#
1cf2599a |
| 12-Sep-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Prevent adding secondary entry points for BB labels
When linker relaxation is enabled on RISC-V, every branch has a relocation and a corresponding symbol in the symbol table. BOLT currently registers all these symbols as secondary entry points causing almost every function to be marked as multi entry on RISC-V.
This patch modifies `adjustFunctionBoundaries` to ignore these symbols. Note that I currently try to detect them by checking if a symbol's name starts with the private label prefix as defined by `MCAsmInfo`. Since I'm not entirely sure what multi-entry functions look like on different targets, please check if this condition is correct. Maybe it could make sense to only check this on RISC-V?
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D159285
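
The prefix check described in this message can be sketched as a simple filter. The names below are hypothetical, and the default prefix `.L` is an assumption for illustration (it is the usual private label prefix on ELF targets); the patch itself asks `MCAsmInfo` for the prefix instead of hard-coding it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True if Name starts with the private label prefix.
// The ".L" default is an assumption made for this sketch.
bool isPrivateLabel(const std::string &Name, const std::string &Prefix = ".L") {
  assert(!Prefix.empty() && "an empty prefix would match every symbol");
  return Name.compare(0, Prefix.size(), Prefix) == 0;
}

// Keeps only the symbols that may still be treated as secondary entry
// points, dropping basic-block labels such as ".LBB0_1".
std::vector<std::string>
entryPointCandidates(const std::vector<std::string> &Symbols) {
  std::vector<std::string> Result;
  for (const std::string &S : Symbols)
    if (!isPrivateLabel(S))
      Result.push_back(S);
  return Result;
}
```

A name-based check like this is cheap, but, as the message notes, whether it is correct on every target is an open question.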
|
#
475a93a0 |
| 28-Aug-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Calculate output values using BOLTLinker
BOLT uses `MCAsmLayout` to calculate the output values of functions and basic blocks. This means output values are calculated based on a pre-linking state and any changes to symbol values during linking will cause incorrect values to be used.
This issue can be triggered by enabling linker relaxation on RISC-V. Since linker relaxation can remove instructions, symbol values may change. This causes, among other things, the symbol table created by BOLT in the output executable to be incorrect.
This patch solves this issue by using `BOLTLinker` to get symbol values instead of `MCAsmLayout`. This way, output values are calculated based on a post-linking state. To make sure the linker can update all necessary symbols, this patch also makes sure these symbols are not marked as temporary, so that they end up in the object file's symbol table.
Note that this patch only deals with symbols of binary functions (`BinaryFunction::updateOutputValues`). The technique described above turned out to be too expensive for basic block symbols so those are handled differently in D155604.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D154604
|