#
05634f73 |
| 15-Jun-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Move from RuntimeDyld to JITLink
RuntimeDyld has been deprecated in favor of JITLink. [1] This patch replaces all uses of RuntimeDyld in BOLT with JITLink.
Care has been taken to minimize the impact on the code structure in order to ease the inspection of this (rather large) changeset. Since BOLT relied on the RuntimeDyld API in multiple places, this wasn't always possible though and I'll explain the changes in code structure first.
Design note: BOLT uses a JIT linker to perform what essentially is static linking. No linked code is ever executed; the result of linking is simply written back to an executable file. For this reason, I restricted myself to the use of the core JITLink library and avoided ORC as much as possible.
RuntimeDyld contains methods for loading objects (loadObject) and symbol lookup (getSymbol). Since JITLink doesn't provide a class with a similar interface, the BOLTLinker abstract class was added to implement it. It was added to Core since both the Rewrite and RuntimeLibs libraries make use of it. Wherever a RuntimeDyld object was used before, it was replaced with a BOLTLinker object.
There is one major difference between the RuntimeDyld and BOLTLinker interfaces: in JITLink, section allocation and the application of fixups (relocation) happen in a single call (jitlink::link). That is, there is no separate method like finalizeWithMemoryManagerLocking in RuntimeDyld. BOLT used to remap sections between allocating them (loadObject) and linking them (finalizeWithMemoryManagerLocking). This no longer works with JITLink. Instead, BOLTLinker::loadObject accepts a callback, called before fixups are applied, that is used to remap sections.
The actual implementation of the BOLTLinker interface lives in the JITLinkLinker class in the Rewrite library. It's the only part of the BOLT code that should directly interact with the JITLink API.
To load an object, JITLinkLinker first creates a LinkGraph (jitlink::createLinkGraphFromObject) and then links it (jitlink::link). For the latter, it uses a custom JITLinkContext with the following properties:
- Use BOLT's ExecutableFileMemoryManager. This one was updated to implement the JITLinkMemoryManager interface. Since BOLT never executes code, its finalization step is a no-op.
- Pass config: don't use the default target passes since they modify DWARF sections in a way that seems incompatible with BOLT. Also run a custom pre-prune pass that makes sure sections without symbols are not pruned by JITLink.
- Implement symbol lookup. This used to be implemented by BOLTSymbolResolver.
- Call the section mapper callback before the final linking step.
- Copy symbol values when the LinkGraph is resolved. Symbols are stored inside JITLinkLinker to ensure that later objects (i.e., instrumentation libraries) can find them. This functionality used to be provided by RuntimeDyld but I did not find a way to use JITLink directly for this.
Some more minor points of interest:
- BinarySection::SectionID: JITLink doesn't have something equivalent to RuntimeDyld's Section IDs. Instead, sections can only be referred to by name. Hence, SectionID was updated to a string.
- There seem to be no tests for Mach-O. I've tested a small hello-world style binary but not more than that.
- On Mach-O, JITLink "normalizes" section names to include the segment name. I had to parse the section name back from this manually which feels slightly hacky.
[1] https://reviews.llvm.org/D145686#4222642
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D147544
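The callback-based remapping described in this message can be illustrated with a minimal, self-contained sketch. It does not use the real BOLT or JITLink headers: `SketchLinker`, `RecordingLinker`, and the `SectionsMapper` alias are hypothetical names, and the actual `BOLTLinker::loadObject` signature is not shown in this log. The only point is the ordering: allocation, then the section-mapping callback, then fixups, all inside one call.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the linker interface described in the message.
// The real BOLTLinker signatures may differ.
class SketchLinker {
public:
  using SectionsMapper = std::function<void()>;
  virtual ~SketchLinker() = default;

  // Loads one object. MapSections runs after sections are allocated and
  // before fixups (relocations) are applied.
  virtual void loadObject(const std::string &Object,
                          SectionsMapper MapSections) = 0;
};

// Toy implementation that only records the order of the phases.
class RecordingLinker : public SketchLinker {
public:
  std::vector<std::string> Events;

  void loadObject(const std::string &Object,
                  SectionsMapper MapSections) override {
    Events.push_back("allocate:" + Object); // section allocation
    if (MapSections)
      MapSections();                        // caller remaps sections here
    Events.push_back("fixups:" + Object);   // relocations applied last
  }
};
```

A caller would pass a lambda that assigns output addresses, so that remapping still happens between allocation and relocation even though both now occur within a single call.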
|
#
702126ae |
| 24-Nov-2022 |
Guillaume Chatelet <gchatelet@google.com> |
[NFC] Add helper method to ensure min alignment on MCSection
Follow up on D138653.
Differential Revision: https://reviews.llvm.org/D138686
|
#
6c09ea3f |
| 24-Nov-2022 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Use Align in MCStreamer::emitValueToAlignment
Differential Revision: https://reviews.llvm.org/D138674
|
#
4f177341 |
| 24-Nov-2022 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Use Align in MCStreamer::emitCodeAlignment
This patch makes code less readable but it will clean itself after all functions are converted.
Differential Revision: https://reviews.llvm.org/D138665
|
#
e647b4f5 |
| 24-Nov-2022 |
Guillaume Chatelet <gchatelet@google.com> |
[reland][Alignment][NFC] Use the Align type in MCSection
Differential Revision: https://reviews.llvm.org/D138653
|
#
36989944 |
| 29-Oct-2022 |
Rafael Auler <rafaelauler@fb.com> |
[BOLT] Always move JTs in jump-table=move
We should always move jump tables when requested. Previously, we were not moving jump tables of non-simple functions in relocation mode. That caused a bug detailed in the attached test case: in PIC jump tables, we force jump tables to be moved, but if they are not moved because the function is not simple, we could incorrectly update original entries in .rodata, corrupting it under special circumstances (see testcase).
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D137357
|
#
4d3a0cad |
| 22-Sep-2022 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Section-handling refactoring/overhaul
Simplify the logic of handling sections in BOLT. This change brings more direct and predictable mapping of BinarySection instances to sections in the input and output files.
* Only sections from the input binary will have a non-null SectionRef. When a new section is created as a copy of the input section, its SectionRef is reset to null.
* RewriteInstance::getOutputSectionName() is removed as the section name in the output file is now defined by BinarySection::getOutputName().
* Querying BinaryContext for sections by name uses their original name. E.g., getUniqueSectionByName(".rodata") will return the original section even if the new .rodata section was created.
* Input file sections (with relocations applied) are emitted via MC with ".bolt.org" prefix. However, their name in the output binary is unchanged unless a new section with the same name is created.
* New sections are emitted internally with ".bolt.new" prefix if there's a name conflict with an input file section. Their original name is preserved in the output file.
* Section header string table is properly populated with section names that are actually used. Previously we used to include discarded section names as well.
* Fix the problem when dynamic relocations were propagated to a new section with a name that matched a section in the input binary. E.g., the new .rodata with jump tables had dynamic relocations from the original .rodata.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135494
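The naming rules listed in this message can be sketched as two small helpers. This is only an illustration under stated assumptions: the function names are hypothetical, and joining the prefix to the section name by plain concatenation is a guess at how the ".bolt.org" and ".bolt.new" prefixes are applied, not something the message spells out.

```cpp
#include <set>
#include <string>

// Internal (emission-time) name for a section that comes from the input
// binary. Assumption: the prefix is simply prepended to the original name.
std::string emittedNameForInputSection(const std::string &Name) {
  return ".bolt.org" + Name;
}

// Internal (emission-time) name for a newly created section. The prefix is
// only applied when an input section with the same name already exists.
std::string emittedNameForNewSection(const std::string &Name,
                                     const std::set<std::string> &InputNames) {
  return InputNames.count(Name) ? ".bolt.new" + Name : Name;
}
```

In both cases the name written to the output file is meant to stay the original one; the prefixes only disambiguate sections while they are being emitted.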
|
#
4f158995 |
| 27-Aug-2022 |
Rafael Auler <rafaelauler@fb.com> |
[BOLT] Add pass to fix ambiguous memory references
This adds a round of checks to memory references, looking for incorrect references to jump table objects. Fix them by replacing the jump table reference with another object reference + offset.
This solves bugs related to regular data references in code accidentally being bound to a jump table, and this reference being updated to a new (incorrect) location because we moved this jump table.
Fixes #55004
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D134098
|
#
c683e281 |
| 03-Oct-2022 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Properly set _end symbol
To properly set the "_end" symbol, we need to track the last allocatable address. Simply emitting "_end" at the end of some section is not sufficient since the order of section allocation is unknown during the emission step.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135121
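One way to read "track the last allocatable address" is as a maximum over the end addresses of all allocatable sections, computed once their final addresses are known. The sketch below assumes exactly that; `SketchSection` and `lastAllocatableAddress` are hypothetical names, not BOLT's own.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of an output section.
struct SketchSection {
  uint64_t Address;
  uint64_t Size;
  bool IsAllocatable;
};

// Returns the highest end address among allocatable sections, independent of
// the order in which the sections were emitted or allocated.
uint64_t lastAllocatableAddress(const std::vector<SketchSection> &Sections) {
  uint64_t End = 0;
  for (const SketchSection &S : Sections)
    if (S.IsAllocatable)
      End = std::max(End, S.Address + S.Size);
  return End;
}
```

Because the result does not depend on iteration order, it avoids the problem named in the message, where the section that happens to be emitted last is not necessarily the one placed last in memory.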
|
#
c9696322 |
| 18-Sep-2022 |
Kazu Hirata <kazu@google.com> |
[BOLT] Use x.empty() instead of llvm::empty(x) (NFC)
I'm planning to deprecate and eventually remove llvm::empty.
Note that no use of llvm::empty requires the ability of llvm::empty to determine the emptiness from begin/end only.
|
#
9742c25b |
| 15-Sep-2022 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Fix empty function emission in non-relocation mode
In non-relocation mode, every function is emitted in its own section. If a function is empty, RuntimeDyld will still allocate a 1-byte section for the function and initialize it with zero. As a result, we will overwrite the first byte of the original function contents with zero. Such a scenario can happen when the input function had only NOP instructions, which BOLT removes by default. Even though such functions likely cause undefined behavior, it's better to preserve their contents.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133978
|
#
553c2389 |
| 14-Sep-2022 |
revunov.denis@huawei.com <revunov.denis@huawei-partners.com> |
[BOLT] Preserve original LSDA type encoding
In non-PIE binaries, BOLT unconditionally converted the type encoding from indirect to absptr, which broke std exceptions since pointers to their typeinfo were only assigned at runtime in the .data section. In this patch we preserve the original encoding so that indirect remains indirect and can be resolved at runtime, and absolute remains absolute.
Reviewed By: rafauler, maksfb
Differential Revision: https://reviews.llvm.org/D132484
|
#
3ac46f37 |
| 09-Sep-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Emit LSDA call sites for all fragments
For exception handling, LSDA call sites have to be emitted for each fragment individually. With this patch, call sites and respective LSDA symbols are generated and associated with each fragment of their function, such that they can be used by the emitter.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132052
|
#
07f63b0a |
| 25-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy object to access basic block ranges to a heap-allocated object that can store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
|
#
5065134a |
| 24-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
Revert "[BOLT] Allocate FunctionFragment on heap"
This reverts commit 101344af1af82d1633c773b718788eaa813d7f79.
|
#
101344af |
| 24-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy object to access basic block ranges to a heap-allocated object that can store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
|
#
0f74d191 |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Generate sections for multiple fragments
This patch adds support to generate any number of sections that are assigned to fragments of functions that are split more than two-way. With this, a function's *nth* split fragment goes into section `.text.cold.n`.
This also changes `FunctionLayout::erase` to make sure that there are no empty fragments at the end of the function. This sometimes happens when blocks are erased from the function. To avoid creating symbols pointing to these fragments, they need to be removed.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D130521
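The trailing-fragment cleanup described for `FunctionLayout::erase` can be sketched on a simplified data model. Here a layout is just a vector of fragments and a fragment is a vector of block IDs; `SketchFragment` and `trimTrailingEmptyFragments` are hypothetical names. Keeping the first fragment even when it is empty is an assumption, based on the main-fragment guarantee described in another commit in this history.

```cpp
#include <vector>

// Hypothetical model: a fragment is a list of basic block IDs.
using SketchFragment = std::vector<int>;

// Drops empty fragments from the end of the layout so that no symbol is
// created for a fragment without blocks. The first (main) fragment is kept
// even if it is empty.
void trimTrailingEmptyFragments(std::vector<SketchFragment> &Fragments) {
  while (Fragments.size() > 1 && Fragments.back().empty())
    Fragments.pop_back();
}
```

Note that only trailing fragments are removed; an empty fragment between two non-empty ones is left alone, which matches the wording "at the end of the function".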
|
#
a191ea7d |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Make exception handling fragment aware
This adds basic fragment awareness in the exception handling passes and generates the necessary symbols for fragments.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D130520
|
#
275e075c |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Support passing fragments to code emission
This changes code emission such that it can emit specific function fragments instead of scanning all basic blocks of a function and just emitting those that are hot or cold.
To implement this, `FunctionLayout` explicitly distinguishes the "main" fragment (i.e. the one that contains the entry block and is associated with the original symbol) from "split" fragments. Additionally, `BinaryFunction` receives support for multiple cold symbols - one for each split fragment.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D130052
|
#
fd159c23 |
| 17-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Fix ignored LP at fragment start
If the first block of a fragment is also a landing pad, the landing pad is not used if an exception is thrown. This is because the landing pad is at the same start address that the corresponding LSDA describes. In that case, the offset in the call site records that refers to that landing pad is zero, and a zero offset is interpreted by the personality function as "no handler" and ignored.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D132053
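The failure mode described in this message comes down to simple arithmetic, sketched below. The helper names are hypothetical, and the code only models the problem (a landing pad offset of zero being read as "no handler"); how the patch avoids the zero offset is not stated in this log and is not shown here.

```cpp
#include <cstdint>

// Offset stored in a call site record: the landing pad address relative to
// the start address that the LSDA describes.
uint64_t landingPadOffset(uint64_t LPStart, uint64_t LandingPad) {
  return LandingPad - LPStart;
}

// As the message describes it, the personality function treats a zero offset
// as "no handler".
bool hasHandler(uint64_t Offset) { return Offset != 0; }
```

With a fragment starting at 0x2000 whose first block is a landing pad, `landingPadOffset(0x2000, 0x2000)` is 0, so `hasHandler` reports false even though a landing pad exists.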
|
#
0f8412c1 |
| 17-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Add main fragment to function layout
Functions that do not contain any code still have to be emitted. This occurs on AArch64, where functions can consist only of a constant island. To support fragment semantics in code emission, this commit adds a guaranteed main fragment to function layout. This fragment might be empty, but allows us to omit checks whether the function is empty in most places.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D130051
|
#
8477bc67 |
| 17-Jul-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to a strict hot/cold split).
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129518
|
#
eecd41aa |
| 11-Jul-2022 |
spupyrev <spupyrev@fb.com> |
Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type"
This reverts commit 6d0528636ae54fba75938a79ae7a98dfcc949f72.
|
#
6d052863 |
| 05-Aug-2021 |
Rafael Auler <rafaelauler@fb.com> |
Rebase: [Facebook] [MC] Introduce NeverAlign fragment type
Summary: Introduce NeverAlign fragment type.
The intended usage of this fragment is to insert it before a pair of macro-op fusion eligible instructions. NeverAlign fragment ensures that the next fragment (first instruction in the pair) does not end at a given alignment boundary by emitting a minimal size nop if necessary.
In effect, it ensures that a pair of macro-fusible instructions is not split by a given alignment boundary, which is a precondition for macro-op fusion in modern Intel Cores (64B = cache line size, see Intel Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode Pipeline: Macro-Fusion).
This patch introduces functionality used by BOLT when emitting code with MacroFusion alignment already in place.
The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the first instruction in the macro-op fusion pair (D101817). However, this approach has higher overhead due to reliance on relaxation as BoundaryAlign requires in the general case - see https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of NeverAlign fragment is to prevent the first instruction in a pair ending at a given alignment boundary, by inserting at most one minimum size nop. It's OK if either instruction crosses the cache line. Padding both instructions using bundles to not cross the alignment boundary would result in excessive padding. There's no straightforward way to request instruction bundling to avoid a given end alignment for the first instruction in the bundle.
LLVM: https://reviews.llvm.org/D97982
Manual rebase conflict history: https://phabricator.intern.facebook.com/D30142613
Test Plan: sandcastle
Reviewers: #llvm-bolt
Subscribers: phabricatorlinter
Differential Revision: https://phabricator.intern.facebook.com/D31361547
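The padding decision described for the NeverAlign fragment can be sketched as a small function. This is not the MC layer implementation: `neverAlignPadding` is a hypothetical name, and the default minimum nop size of one byte is an assumption that fits x86.

```cpp
#include <cassert>
#include <cstdint>

// Returns how many nop bytes to emit before the first instruction of a
// macro-fusible pair so that this instruction does not end exactly on an
// alignment boundary. At most one minimum-size nop is emitted.
uint64_t neverAlignPadding(uint64_t Offset, uint64_t FirstInstSize,
                           uint64_t Alignment, uint64_t MinNopSize = 1) {
  // A single nop can only move the end off the boundary if its size is not a
  // multiple of the alignment.
  assert(Alignment > 1 && MinNopSize % Alignment != 0);
  uint64_t End = Offset + FirstInstSize;
  return (End % Alignment == 0) ? MinNopSize : 0;
}
```

For a 3-byte instruction at offset 61 with a 64-byte boundary, the instruction would end at 64, so one nop byte is requested and the instruction then ends at 65. At offset 60 no padding is needed. Crossing the boundary is allowed; only ending on it is avoided.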
|
#
d2c87699 |
| 24-Jun-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use range-based STL wrappers
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts accepting ranges.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128154
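The kind of change this commit makes can be illustrated without the LLVM headers. The `sketch` namespace below is a hand-rolled stand-in for range-based wrappers such as `llvm::sort` and `llvm::is_contained` from LLVM's `STLExtras.h`; the real wrappers are more general, and this version only shows the shape of the call sites before and after.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

namespace sketch {
// Range-based sort: replaces std::sort(R.begin(), R.end()).
template <typename Range> void sort(Range &R) {
  std::sort(std::begin(R), std::end(R));
}

// Range-based membership test: replaces
// std::find(R.begin(), R.end(), V) != R.end().
template <typename Range, typename T>
bool is_contained(const Range &R, const T &V) {
  return std::find(std::begin(R), std::end(R), V) != std::end(R);
}
} // namespace sketch
```

A call such as `std::sort(V.begin(), V.end())` becomes `sketch::sort(V)`, which removes the chance of passing mismatched iterators.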
|