#
66e943b1 |
| 06-Jul-2023 |
Alexander Yermolovich <ayermolo@meta.com> |
[BOLT][DWARF] Fix for .debug_line with DWARF5
There was a bug in the code that pre-populated the line string for the case where parts of .debug_line are not processed by BOLT but copied as raw data. We were not switching sections, which resulted in parts of the binary being overwritten with debug data.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D154544
|
#
05634f73 |
| 15-Jun-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Move from RuntimeDyld to JITLink
RuntimeDyld has been deprecated in favor of JITLink. [1] This patch replaces all uses of RuntimeDyld in BOLT with JITLink.
Care has been taken to minimize the impact on the code structure in order to ease the inspection of this (rather large) changeset. Since BOLT relied on the RuntimeDyld API in multiple places, this wasn't always possible though and I'll explain the changes in code structure first.
Design note: BOLT uses a JIT linker to perform what essentially is static linking. No linked code is ever executed; the result of linking is simply written back to an executable file. For this reason, I restricted myself to the use of the core JITLink library and avoided ORC as much as possible.
RuntimeDyld contains methods for loading objects (loadObject) and symbol lookup (getSymbol). Since JITLink doesn't provide a class with a similar interface, the BOLTLinker abstract class was added to implement it. It was added to Core since both the Rewrite and RuntimeLibs libraries make use of it. Wherever a RuntimeDyld object was used before, it was replaced with a BOLTLinker object.
There is one major difference between the RuntimeDyld and BOLTLinker interfaces: in JITLink, section allocation and the application of fixups (relocation) happen in a single call (jitlink::link). That is, there is no separate method like finalizeWithMemoryManagerLocking in RuntimeDyld. BOLT used to remap sections between allocating (loadObject) and linking them (finalizeWithMemoryManagerLocking). This doesn't work anymore with JITLink. Instead, BOLTLinker::loadObject accepts a callback that is called before fixups are applied and is used to remap sections.
The actual implementation of the BOLTLinker interface lives in the JITLinkLinker class in the Rewrite library. It's the only part of the BOLT code that should directly interact with the JITLink API.
For loading an object, JITLinkLinker first creates a LinkGraph (jitlink::createLinkGraphFromObject) and then links it (jitlink::link). For the latter, it uses a custom JITLinkContext with the following properties:
- Use BOLT's ExecutableFileMemoryManager. This one was updated to implement the JITLinkMemoryManager interface. Since BOLT never executes code, its finalization step is a no-op.
- Pass config: don't use the default target passes since they modify DWARF sections in a way that seems incompatible with BOLT. Also run a custom pre-prune pass that makes sure sections without symbols are not pruned by JITLink.
- Implement symbol lookup. This used to be implemented by BOLTSymbolResolver.
- Call the section mapper callback before the final linking step.
- Copy symbol values when the LinkGraph is resolved. Symbols are stored inside JITLinkLinker to ensure that later objects (i.e., instrumentation libraries) can find them. This functionality used to be provided by RuntimeDyld but I did not find a way to use JITLink directly for this.
Some more minor points of interest:
- BinarySection::SectionID: JITLink doesn't have something equivalent to RuntimeDyld's Section IDs. Instead, sections can only be referred to by name. Hence, SectionID was updated to a string.
- There seem to be no tests for Mach-O. I've tested a small hello-world style binary but not more than that.
- On Mach-O, JITLink "normalizes" section names to include the segment name. I had to parse the section name back from this manually which feels slightly hacky.
[1] https://reviews.llvm.org/D145686#4222642
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D147544
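The callback-before-fixups design described in this message can be illustrated with a toy model. This is only a sketch: ToySection, ToyFixup, ToyObject, SectionMapper and this loadObject are hypothetical names invented for illustration and do not mirror the real BOLTLinker or JITLink declarations. It shows why a single-call linker needs a hook between allocation and relocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not BOLT or JITLink
// declarations.
struct ToySection {
  std::string Name;
  uint64_t Address; // Final address, assigned by the mapper callback.
};

struct ToyFixup {
  size_t TargetSection; // Index of the section the fixup refers to.
  uint64_t Addend;      // Offset added to the section address.
  uint64_t Resolved;    // Filled in when fixups are applied.
};

struct ToyObject {
  std::vector<ToySection> Sections;
  std::vector<ToyFixup> Fixups;
};

using SectionMapper = std::function<void(std::vector<ToySection> &)>;

// Allocation and fixup application happen inside one call, so the only place
// where the caller can remap sections is the callback between the two steps.
inline void loadObject(ToyObject &Obj, const SectionMapper &MapSections) {
  // Step 1: "allocate" sections (nothing to do in this toy model).
  // Step 2: let the caller assign the final addresses.
  MapSections(Obj.Sections);
  // Step 3: apply fixups against the remapped addresses.
  for (ToyFixup &F : Obj.Fixups) {
    assert(F.TargetSection < Obj.Sections.size() && "fixup out of range");
    F.Resolved = Obj.Sections[F.TargetSection].Address + F.Addend;
  }
}
```

A caller would pass a lambda that assigns output addresses, for example `loadObject(Obj, [](std::vector<ToySection> &S) { S[0].Address = 0x400000; });`, and every fixup is then resolved against the remapped address.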
|
#
93ce0965 |
| 04-May-2023 |
Alexander Yermolovich <ayermolo@meta.com> |
[BOLT][DWARF] Fix handling of loclists_base without location accesses
There are CUs that have DW_AT_loclists_base, but no DW_AT_location in children DIEs. Pre-BOLT it points to a valid offset. We were not updating it, so it ended up pointing into the middle of a list and caused LLDB to print out errors. Changed it to point to the first location list. I don't think it should matter since there are no accesses to it anyway.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D149798
|
#
8bbfac7b |
| 13-Apr-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT][NFC] Fix UB due to unaligned load in DebugStrOffsetsWriter
The following tests fail when enabling UBSan due to an unaligned memory load:
> runtime error: load of misaligned address 0x620000000643 for type
> 'const uint32_t' (aka 'const unsigned int'), which requires 4 byte
> alignment
BOLT :: AArch64/asm-func-debug.test
BOLT :: AArch64/update-debug-reloc.test
BOLT :: X86/asm-func-debug.test
BOLT :: X86/dwarf5-df-dualcu.test
BOLT :: X86/dwarf5-df-mono-dualcu.test
BOLT :: X86/dwarf5-ftypes-dwp-input-dwo-output.test
BOLT :: X86/dwarf5-locaddrx.test
BOLT :: X86/dwarf5-split-dwarf4-monolithic.test
BOLT :: X86/inlined-function-mixed.test
BOLT :: non-empty-debug-line.test
This patch fixes this by using read32le for the load.
Reviewed By: ayermolo
Differential Revision: https://reviews.llvm.org/D148217
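The kind of fix described here can be shown with a minimal, LLVM-free sketch. readLE32 below is a hypothetical stand-in for llvm::support::endian::read32le, and readEntry is an illustrative helper that is not part of BOLT. Because the value is assembled byte by byte, the source address needs no particular alignment, whereas casting the pointer to `const uint32_t *` and dereferencing it is undefined behavior at an odd address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for llvm::support::endian::read32le: reads a
// little-endian 32-bit value from a pointer of any alignment.
inline uint32_t readLE32(const void *P) {
  const unsigned char *B = static_cast<const unsigned char *>(P);
  return static_cast<uint32_t>(B[0]) |
         (static_cast<uint32_t>(B[1]) << 8) |
         (static_cast<uint32_t>(B[2]) << 16) |
         (static_cast<uint32_t>(B[3]) << 24);
}

// Illustrative helper: reads a 32-bit entry at an arbitrary (possibly odd)
// byte offset inside a buffer of Size bytes.
inline uint32_t readEntry(const unsigned char *Buffer, size_t Size,
                          size_t Offset) {
  assert(Offset + 4 <= Size && "read past the end of the buffer");
  return readLE32(Buffer + Offset);
}
```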
|
#
be2f67c4 |
| 07-Feb-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Replace anonymous namespace functions with static
Follow LLVM Coding Standards guideline on using anonymous namespaces (https://llvm.org/docs/CodingStandards.html#anonymous-namespaces) and use `static` modifier for function definitions.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D143124
|
#
f230099c |
| 24-Jan-2023 |
Alexander Yermolovich <ayermolo@fb.com> |
[BOLT][DWARF] Reuse entries in .debug_addr when not modified
In some binaries produced with ThinLTO, there are CUs that share an entry in .debug_addr. Previously we generated a new entry for each, which led to a binary size increase. This changes the behavior so that we reuse entries in .debug_addr.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D142425
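The general idea of reusing a table entry instead of appending a new one can be sketched as follows. AddrPool is a hypothetical class written for illustration; it is not BOLT's address writer and makes no claim about how the patch decides that an entry is unmodified. It only shows the lookup-before-append pattern that keeps a shared address from being emitted once per CU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical address pool: hands out one index per distinct address, so
// users that share an address also share the table entry.
class AddrPool {
public:
  // Returns the index of Address, adding a new entry only on first use.
  uint32_t getOrCreateIndex(uint64_t Address) {
    auto It = IndexByAddress.find(Address);
    if (It != IndexByAddress.end())
      return It->second; // Reuse the existing entry.
    uint32_t Index = static_cast<uint32_t>(Entries.size());
    Entries.push_back(Address);
    IndexByAddress.emplace(Address, Index);
    return Index;
  }

  // Returns the address stored at Index.
  uint64_t addressAt(uint32_t Index) const {
    assert(Index < Entries.size() && "index out of range");
    return Entries[Index];
  }

  // Number of entries that would be written to the table.
  size_t size() const { return Entries.size(); }

private:
  std::vector<uint64_t> Entries;
  std::unordered_map<uint64_t, uint32_t> IndexByAddress;
};
```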
|
#
124ca880 |
| 20-Jan-2023 |
Alexander Yermolovich <ayermolo@meta.com> |
[BOLT][DWARF] Change loclist encoding to use base_addrx
This does the same thing as for rangelists: loclists now use base_addrx. It slightly increases .debug_loclists but reduces the .debug_addr section.
| section         | clang-16.bolt.base | clang-16.bolt | raw    | %     |
| .debug_loclists | 198208             | 203398        | 5190   | 102%  |
| .debug_addr     | 14415808           | 14351448      | -64360 | 99.5% |
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D141969
|
#
7fc79340 |
| 11-Jan-2023 |
Alexander Yermolovich <ayermolo@fb.com> |
[llvm][dwwarf] Change CU/TU index to 64-bit
Changed the contribution data structure to 64 bit. I added the 32-bit and 64-bit accessors to make it explicit where we use 32 bit and where we use 64 bit, and to make sure we catch all the cases where this data structure is used.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D139379
|
#
6a4a697e |
| 11-Jan-2023 |
Alexander Yermolovich <ayermolo@fb.com> |
Revert "[llvm][dwwarf] Change CU/TU index to 64-bit"
This reverts commit fa3fa4d0d42326005dfd5887bf047b86904d3be6.
|
#
fa3fa4d0 |
| 10-Jan-2023 |
Alexander Yermolovich <ayermolo@fb.com> |
[llvm][dwwarf] Change CU/TU index to 64-bit
Changed the contribution data structure to 64 bit. I added the 32-bit and 64-bit accessors to make it explicit where we use 32 bit and where we use 64 bit, and to make sure we catch all the cases where this data structure is used.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D139379
|
#
e22ff52c |
| 06-Jan-2023 |
Alexander Yermolovich <ayermolo@fb.com> |
[BOLT][DWARF] Change rangelists to use DW_RLE_offset_pair
Before, we always used DW_RLE_startx_length. This is not very efficient and leads to a bigger .debug_addr section. Changed it to use DW_RLE_base_addressx/DW_RLE_offset_pair.
clang-16 build in debug mode
llvm-bolt ran on it with --update-debug-sections
| section         | before   | after    | diff    | % decrease |
| .debug_rnglists | 32732292 | 31986051 | -746241 | 2.3%       |
| .debug_addr     | 14415808 | 14184128 | -231680 | 1.6%       |
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D140439
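A rough cost model helps explain why the two encodings differ in size. The sketch below is not BOLT's writer: Range, EncodingCost and both cost functions are hypothetical, the ranges are assumed to be sorted with the first range's start used as the base address, and the terminating DW_RLE_end_of_list byte is ignored because it is the same in both cases. It relies only on the DWARF 5 operand layout: DW_RLE_startx_length takes a ULEB128 address index and a ULEB128 length, DW_RLE_base_addressx takes one ULEB128 address index, and DW_RLE_offset_pair takes two ULEB128 offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Range {
  uint64_t Low;  // Inclusive start address.
  uint64_t High; // Exclusive end address.
};

struct EncodingCost {
  uint64_t ListBytes;   // Bytes emitted into the range list.
  uint64_t AddrEntries; // Address-table entries that are required.
};

// Number of bytes needed to encode Value as ULEB128.
inline unsigned ulebSize(uint64_t Value) {
  unsigned Size = 0;
  do {
    Value >>= 7;
    ++Size;
  } while (Value != 0);
  return Size;
}

// One DW_RLE_startx_length per range: each range needs its own address index.
inline EncodingCost costStartxLength(const std::vector<Range> &Ranges,
                                     uint64_t FirstIndex) {
  EncodingCost C{0, 0};
  for (size_t I = 0; I < Ranges.size(); ++I) {
    assert(Ranges[I].Low <= Ranges[I].High && "malformed range");
    C.ListBytes += 1 + ulebSize(FirstIndex + I) +
                   ulebSize(Ranges[I].High - Ranges[I].Low);
    ++C.AddrEntries;
  }
  return C;
}

// One DW_RLE_base_addressx followed by one DW_RLE_offset_pair per range:
// only the base address needs an address index.
inline EncodingCost costOffsetPair(const std::vector<Range> &Ranges,
                                   uint64_t BaseIndex) {
  EncodingCost C{0, 0};
  if (Ranges.empty())
    return C;
  const uint64_t Base = Ranges.front().Low; // Assumes sorted ranges.
  C.ListBytes += 1 + ulebSize(BaseIndex);
  C.AddrEntries = 1;
  for (const Range &R : Ranges) {
    assert(R.Low >= Base && R.Low <= R.High && "ranges must be sorted");
    C.ListBytes += 1 + ulebSize(R.Low - Base) + ulebSize(R.High - Base);
  }
  return C;
}
```

In this model the saving comes from two places: only one address-table entry is needed per list, and small offsets from the base usually fit in fewer ULEB128 bytes than a large address index.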
|
#
f40d25dd |
| 04-Jan-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use llvm::reverse
Use llvm::reverse instead of `for (auto I = rbegin(), E = rend(); I != E; ++I)`
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D140516
|
#
f2f8f709 |
| 07-Dec-2022 |
Alexander Yermolovich <ayermolo@fb.com> |
Revert "[llvm][dwwarf] Change CU/TU index to 64-bit"
This reverts commit 5ebd28f3e56c00a739fda46c72c9e0f6528add87.
|
#
f7a21317 |
| 07-Dec-2022 |
Alexander Yermolovich <ayermolo@fb.com> |
[BOLT][DWARF] Don't create extra .debug_str_offsets contributions
With ThinLTO, multiple CUs can share the same .debug_str_offsets contribution. We were creating a new one for each CU. This led to a binary size increase.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D139214
|
#
5ebd28f3 |
| 06-Dec-2022 |
Alexander Yermolovich <ayermolo@fb.com> |
[llvm][dwwarf] Change CU/TU index to 64-bit
Summary:
Changed the contribution data structure to 64 bit. I added the 32-bit and 64-bit accessors to make it explicit where we use 32 bit and where we use 64 bit, and to make sure we catch all the cases where this data structure is used.
|
#
370e4761 |
| 06-Dec-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use std::optional for findAttributeInfo
LLVM started switching from `llvm::Optional` to `std::optional`: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716/11
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D139259
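The shape of a lookup function that returns std::optional can be sketched as below. ToyAttr, findAttr, valueOr and requireAttr are hypothetical names for illustration and are not BOLT's findAttributeInfo or its AttrInfo type. The sketch assumes C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified attribute record; not BOLT's real type.
struct ToyAttr {
  uint16_t Name;
  uint64_t Value;
};

// Returns the first attribute with the given name, or std::nullopt if the
// attribute is absent.
inline std::optional<ToyAttr> findAttr(const std::vector<ToyAttr> &Attrs,
                                       uint16_t Name) {
  for (const ToyAttr &A : Attrs)
    if (A.Name == Name)
      return A;
  return std::nullopt;
}

// Callers test the optional directly and access the value with -> or *.
inline uint64_t valueOr(const std::vector<ToyAttr> &Attrs, uint16_t Name,
                        uint64_t Default) {
  if (std::optional<ToyAttr> A = findAttr(Attrs, Name))
    return A->Value;
  return Default;
}

// For attributes that must exist, assert before dereferencing.
inline uint64_t requireAttr(const std::vector<ToyAttr> &Attrs, uint16_t Name) {
  std::optional<ToyAttr> A = findAttr(Attrs, Name);
  assert(A && "attribute must be present");
  return A->Value;
}
```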
|
#
89fab98e |
| 05-Dec-2022 |
Fangrui Song <i@maskray.me> |
[DebugInfo] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
#
f4c16c44 |
| 04-Dec-2022 |
Fangrui Song <i@maskray.me> |
[MC] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
#
e324a80f |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[BOLT] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
#
a0c7ca8a |
| 03-Sep-2022 |
Kazu Hirata <kazu@google.com> |
[BOLT] Use range-based for loops (NFC)
LLVM Coding Standards discourage for_each unless callable objects already exist.
|
#
53113515 |
| 12-Aug-2022 |
Fangrui Song <i@maskray.me> |
[BOLT] Use Optional::emplace to avoid move assignment. NFC
|
#
c4302e4f |
| 27-Jun-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use llvm::less_first
Follow the case of https://reviews.llvm.org/D126068 and simplify call sites with `llvm::less_first`.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128242
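The simplification at a call site can be sketched without LLVM headers. LessFirst below is a stand-in written for illustration that compares the `first` members of two pairs, which is the behavior `llvm::less_first` is understood to provide. Entry and sortByAddress are hypothetical names and do not come from BOLT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in for llvm::less_first: orders pairs by their first member only.
struct LessFirst {
  template <typename T> bool operator()(const T &L, const T &R) const {
    return L.first < R.first;
  }
};

using Entry = std::pair<uint64_t, std::string>;

// Sorts entries by address while keeping equal addresses in input order.
inline void sortByAddress(std::vector<Entry> &Entries) {
  // The named functor replaces a hand-written lambda such as
  //   [](const Entry &A, const Entry &B) { return A.first < B.first; }
  std::stable_sort(Entries.begin(), Entries.end(), LessFirst());
  assert(std::is_sorted(Entries.begin(), Entries.end(), LessFirst()));
}
```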
|
#
d2c87699 |
| 24-Jun-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use range-based STL wrappers
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts accepting ranges.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128154
|
#
adf4142f |
| 11-Jun-2022 |
Fangrui Song <i@maskray.me> |
[MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
|
#
1c6dc43d |
| 08-Jun-2022 |
Alexander Yermolovich <ayermolo@fb.com> |
[BOLT]DWARF] Eagerly write out loclists
Taking advantage of our ability to rewrite .debug_info to reduce the memory footprint of loclists. Loclists are now written out as they are added, similar to how we handle ranges.
Collected on clang-14
trunk
4:41.20 real, 389.50 user, 59.50 sys, 0 amem, 38412532 mmem
4:30.08 real, 376.10 user, 63.75 sys, 0 amem, 38477844 mmem
4:25.58 real, 373.76 user, 54.71 sys, 0 amem, 38439660 mmem
diff
4:34.66 real, 392.83 user, 57.73 sys, 0 amem, 38382560 mmem
4:35.96 real, 377.70 user, 58.62 sys, 0 amem, 38255840 mmem
4:27.61 real, 390.18 user, 57.02 sys, 0 amem, 38223224 mmem
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D126999
|