#
3c357a49 |
| 17-Dec-2024 |
Alexander Yermolovich <43973793+ayermolo@users.noreply.github.com> |
[BOLT] Add support for safe-icf (#116275)
Identical Code Folding (ICF) folds functions that are identical into one
function, and updates symbol addresses to the new address. This reduces
the size of a binary, but can lead to problems, for example when
function pointers are compared. Such comparisons can appear either
explicitly in the code or in IR generated by optimization passes like
Indirect Call Promotion (ICP). After ICF, what used to be two different
addresses become the same address. This can lead to a different code
path being taken.
This is where safe ICF comes in. The linker (LLD) does it using the
address-significance section generated by clang. If a symbol is in it, or
an object doesn't have this section, symbols are not folded.
BOLT does not have the information regarding which objects do not have
this section, so it can't reuse this mechanism.
This implementation scans the code section and conservatively marks
function symbols as unsafe. It treats symbols as unsafe if they are
used in a non-control-flow instruction. It also scans through the data
relocation sections and does the same for relocations that reference a
function symbol. The latter handles the case when a function pointer is
stored in a local or global variable, etc. If a relocation address
points within a vtable, these symbols are skipped.
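The marking rule described above can be sketched roughly as follows. This is a simplified model and not BOLT's implementation: `InstRef`, `DataReloc`, and `collectUnsafeSymbols` are hypothetical names, and the real pass works on disassembled instructions and ELF relocations rather than on these toy records.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical record: one instruction that references a function symbol.
struct InstRef {
  std::string Symbol;  // referenced function symbol
  bool IsControlFlow;  // true for calls/branches, false for e.g. lea/mov
};

// Hypothetical record: one data relocation that targets a function symbol.
struct DataReloc {
  std::string Symbol;  // function symbol the relocation points to
  bool InVTable;       // true if the relocated address lies inside a vtable
};

// Conservatively collect the function symbols that must not be folded.
std::set<std::string> collectUnsafeSymbols(const std::vector<InstRef> &Code,
                                           const std::vector<DataReloc> &Data) {
  std::set<std::string> Unsafe;
  // A symbol used by a non-control-flow instruction may have its address
  // taken and compared, so it is treated as unsafe.
  for (const InstRef &I : Code)
    if (!I.IsControlFlow)
      Unsafe.insert(I.Symbol);
  // A function pointer stored in data is unsafe too, except when the
  // relocation lies within a vtable, which is skipped.
  for (const DataReloc &R : Data)
    if (!R.InVTable)
      Unsafe.insert(R.Symbol);
  return Unsafe;
}
```

Any function whose symbol ends up in the returned set would be excluded from folding; everything else remains a candidate for ICF.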
|
#
b560b87b |
| 13-Dec-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Clean up jump table handling in non-reloc mode. NFCI (#119614)
This change affects non-relocation mode only. Prior to having the
CheckLargeFunctions pass, we could have emitted code for functions that
was discarded at the end due to size limitations. Since we didn't know
at the time of emission if the code would be discarded or not, we had to
emit jump tables in separate sections and handle them separately.
However, now we always run CheckLargeFunctions and make sure all emitted
code is used. Thus, we can get rid of the special jump table handling.
|
#
ceb7214b |
| 12-Dec-2024 |
Kristof Beyls <kristof.beyls@arm.com> |
[BOLT] Introduce binary analysis tool based on BOLT (#115330)
This initial commit does not add any specific binary analyses yet, it
merely contains the boilerplate to introduce a new BOLT-based tool.
This basically combines the first 4 patches from the prototype pac-ret
and stack-clash binary analyzer discussed in RFC
https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and published at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype
The introduction of such a BOLT-based binary analysis tool was proposed
and discussed in at least the following places:
- The RFC pointed to above
- EuroLLVM 2024 round table
https://discourse.llvm.org/t/summary-of-bolt-as-a-binary-analysis-tool-round-table-at-eurollvm/78441
The round table showed quite a few people interested in being able to
build a custom binary analysis quickly with a tool like this.
- Also at the US LLVM dev meeting a few weeks ago, I heard interest from
a few people, asking when the tool would be available upstream.
- The presentation "Adding Pointer Authentication ABI support for your
ELF platform"
(https://llvm.swoogo.com/2024devmtg/session/2512720/adding-pointer-authentication-abi-support-for-your-elf-platform)
explicitly mentioned interest in extending the prototype tool to verify
correct implementation of pauthabi.
|
#
2ccf7ed2 |
| 05-Dec-2024 |
Jared Wyles <jared.wyles@gmail.com> |
[JITLink] Switch to SymbolStringPtr for Symbol names (#115796)
Use SymbolStringPtr for Symbol names in LinkGraph. This reduces string interning
on the boundary between JITLink and ORC, and allows pointer comparisons (rather
than string comparisons) between Symbol names. This should improve the
performance and readability of code that bridges between JITLink and ORC (e.g.
ObjectLinkingLayer and ObjectLinkingLayer::Plugins).
To enable use of SymbolStringPtr a std::shared_ptr<SymbolStringPool> is added to
LinkGraph and threaded through to its construction sites in LLVM and BOLT. All
LinkGraphs that are to have symbol names compared by pointer equality must point
to the same SymbolStringPool instance, which in ORC sessions should be the pool
attached to the ExecutionSession.
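To illustrate why all graphs must share one pool, here is a toy interning pool. It is only a sketch of the general technique and is not LLVM's SymbolStringPool or SymbolStringPtr; `ToyStringPool` and `sameName` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Toy interning pool, for illustration only (not LLVM's SymbolStringPool).
class ToyStringPool {
public:
  // Returns a stable pointer to the pool's single copy of Name.
  const std::string *intern(const std::string &Name) {
    return &*Pool.insert(Name).first;
  }

private:
  // Element addresses in std::unordered_set stay valid across rehashing,
  // so the returned pointers can be kept and compared later.
  std::unordered_set<std::string> Pool;
};

// With interned names, equality is a single pointer comparison.
inline bool sameName(const std::string *A, const std::string *B) {
  return A == B;
}
```

Interning the same name twice in one pool yields the same pointer, while two different pools yield different pointers for equal strings, which is why pointer equality is only meaningful when every graph points to the same pool instance.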
---------
Co-authored-by: Lang Hames <lhames@gmail.com>
|
#
b5ed375f |
| 28-Nov-2024 |
Peter Waller <peter.waller@arm.com> |
[BOLT] Skip _init; avoiding GOT breakage for static binaries (#117751)
_init is used during startup of binaries. Unfortunately, its
address can be shared (at least on AArch64 glibc static binaries) with a
data reference that lives in the GOT. The GOT rewriting is currently unable
to distinguish between data addresses and function addresses. This leads
to the data address being incorrectly rewritten, causing a crash on
startup of the binary:
Unexpected reloc type in static binary.
To avoid this, skip _init so that it is never considered for moving.
~We could add further conditions to narrow the skipped case for known
crashes, but as a straw man I thought it'd be best to keep the condition
as simple as possible and see if there are any objections to this.~
(Edit: this broke the test
bolt/test/runtime/X86/retpoline-synthetic.test,
because _init was skipped from the retpoline pass and it has an indirect
call in it, so I include a check for static binaries now, which avoids
the test failure,
but perhaps this could/should be narrowed further?)
For now, skip _init for static binaries on any architecture; we could
add further conditions to narrow the skipped case for known crashes, but
as a straw man I thought it'd be best to keep the condition as simple as
possible and see if there are any objections to this.
Updates #100096.
|
#
99655322 |
| 19-Nov-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Overwrite .eh_frame and .gcc_except_table (#116755)
Under --use-old-text or --strict, we completely rewrite the contents of
the EH frame and exception table sections. If the new contents of either
section do not exceed the size of the original section, rewrite the
section in-place.
|
#
08ef9396 |
| 19-Nov-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Overwrite .eh_frame_hdr in-place (#116730)
If the new EH frame header can fit into the original .eh_frame_hdr
section, overwrite it in-place and pad with zeroes.
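The fit check and zero padding described above could look roughly like this. It is a minimal sketch over plain byte vectors, not BOLT's code; `overwriteInPlace` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Overwrites Section in place if NewContents fits, padding the remainder
// with zeroes. Returns false and leaves Section untouched otherwise.
bool overwriteInPlace(std::vector<uint8_t> &Section,
                      const std::vector<uint8_t> &NewContents) {
  if (NewContents.size() > Section.size())
    return false;  // does not fit: the caller must place it elsewhere
  auto Tail =
      std::copy(NewContents.begin(), NewContents.end(), Section.begin());
  std::fill(Tail, Section.end(), uint8_t{0});  // pad the rest with zeroes
  return true;
}
```

The section keeps its original size in both cases; only its contents change when the new data fits.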
|
#
1b8e0cf0 |
| 13-Nov-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Never emit "large" functions (#115974)
"Large" functions are functions that are too big to fit into their
original slots after code modifications. The CheckLargeFunctions pass is
designed to prevent such functions from being emitted. Extend this pass to
work with functions with constant islands.
Now that CheckLargeFunctions covers all functions, it guarantees that we
will never see such functions after code emission on all platforms
(previously it was guaranteed on x86 only). Hence, we can get rid of
RewriteInstance extensions that were meant to support "large" functions.
|
#
16cd5cdf |
| 07-Nov-2024 |
Jacob Bramley <jacob.bramley@arm.com> |
[BOLT] Ignore AArch64 markers outside their sections. (#74106)
AArch64 uses $d and $x symbols to delimit data embedded in code.
However, sometimes we see $d symbols, typically in .eh_frame, with
addresses that belong to different sections. These occasionally fall
inside .text functions and cause BOLT to stop disassembling, which in
turn causes DWARF CFA processing to fail.
As a workaround, we just ignore symbols with addresses outside the
section they belong to. This behaviour is consistent with objdump and
similar tools.
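A minimal sketch of such a filter is shown below. The types and the function name are hypothetical, and treating the section as the half-open range [Address, Address + Size) is an assumption, not something stated in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical section description: start address and size in bytes.
struct SectionRange {
  uint64_t Address;
  uint64_t Size;
};

// Hypothetical marker symbol ($d or $x) with the section it claims to be in.
struct MarkerSymbol {
  uint64_t Address;
  std::size_t SectionIndex;
};

// Returns true if the marker's address lies inside its own section.
// Markers for which this returns false would simply be ignored.
bool isInsideOwnSection(const MarkerSymbol &Sym,
                        const std::vector<SectionRange> &Sections) {
  assert(Sym.SectionIndex < Sections.size() && "invalid section index");
  const SectionRange &S = Sections[Sym.SectionIndex];
  // Assumed half-open range check, written to avoid overflow.
  return Sym.Address >= S.Address && Sym.Address - S.Address < S.Size;
}
```

A `$d` marker that claims to belong to one section but carries an address inside another section's range fails this check and is skipped, so it cannot interrupt disassembly of the function it happens to land in.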
|
#
0a5edb4d |
| 23-Sep-2024 |
Youngsuk Kim <youngsuk.kim@hpe.com> |
[bolt] Don't call llvm::raw_string_ostream::flush() (NFC)
Don't call raw_string_ostream::flush(), which is essentially a no-op. As specified in the docs, raw_string_ostream is always unbuffered (see 65b13610a5226b84889b923bae884ba395ad084d for further reference).
|
#
6d216fb7 |
| 23-Sep-2024 |
Kristof Beyls <kristof.beyls@arm.com> |
[perf2bolt] Improve heuristic to map in-process addresses to specific… (#109397)
… segments in Elf binary.
The heuristic is improved by also taking into account that only
executable segments should contain instructions.
Fixes #109384.
|
#
31ac3d09 |
| 23-Sep-2024 |
sinan <sinan.lin@linux.alibaba.com> |
[BOLT] Add .iplt support to x86 (#106513)
Add X86 support for parsing .iplt section and symbols.
|
#
e49549ff |
| 08-Aug-2024 |
Davide Italiano <davidino@fb.com> |
Revert "[BOLT] Abort on out-of-section symbols in GOT (#100801)"
This reverts commit a4900f0d936f0e86bbd04bd9de4291e1795f1768.
|
#
62e894e0 |
| 07-Aug-2024 |
Sayhaan Siddiqui <49014204+sayhaan@users.noreply.github.com> |
[BOLT][DWARF][NFC] Move Arch assignment out of createBinaryContext (#102054)
Moves the assignment of Arch out of createBinaryContext to prevent data
races when parallelized.
|
#
a4900f0d |
| 07-Aug-2024 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Abort on out-of-section symbols in GOT (#100801)
This patch aborts BOLT execution if it finds an out-of-section (section
end) symbol in the GOT. In order to handle such situations properly in
the future, we would need to have an arch-dependent way to analyze
relocations or their sequences, e.g., for ARM it would probably be ADRP +
LDR analysis in order to get the GOT entry address. Currently, it is also
challenging because GOT-related relocation symbols are replaced with
__BOLT_got_zero. Anyway, it seems to be quite a rare case, which seems
to be related only to static binaries. For the most part, it seems that
it should be handled at the linker stage, since a static binary should not
have a GOT at all. The LLD linker with relaxations enabled would replace
instruction addresses from the GOT directly with target symbols, which
eliminates the problem.
Anyway, in order to achieve detection of such cases, this patch fixes a
few things in BOLT:
1. For the end symbols, we're now using the section provided by the ELF
binary. Previously it would be tied to a wrong section found by symbol
address.
2. The end symbols would have limited registration: we would only
add them to the name->data GlobalSymbols map, since using the
address->data BinaryDataMap map would likely be impossible due to the
address duality of such symbols.
3. The outdated BD->getSection (currently returning a reference, not a
pointer) check in postProcessSymbolTable is replaced by a getSize check in
order to allow zero-sized top-level symbols if they are located in
zero-sized sections. For the most part, such things could only be found
in tests, but I don't see a reason not to handle such cases.
4. Updated the section-end-sym test and removed the x86_64 requirement
since there is no reason for it (tested on aarch64 linux).
The test was provided by peterwaller-arm (thank you) in #100096 and
slightly modified by me.
|
#
097ddd35 |
| 07-Aug-2024 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Fix relocations handling (#100890)
After porting BOLT to RISC-V, some of the relocations were broken on both
AArch64 and X86.
On AArch64, an example of broken relocations is GOT relocations: when
handling them, we should replace the symbol with __BOLT_got_zero in order
to address the GOT entry, not the symbol that addresses this entry. This is
done further in the code, so it is too early to add the relocation here.
On X86 it is a mistake to add relocations without an addend. This is the
exact problem that is raised in #97937. Due to different code generation
I had to use a gcc-generated yaml test, since with clang I wasn't able to
reproduce the problem.
Added tests for both architectures and made the problematic condition
RISC-V-specific.
|
#
6c8933e1 |
| 07-Aug-2024 |
sinan <sinan.lin@linux.alibaba.com> |
[BOLT] Skip PLT search for zero-value weak reference symbols (#69136)
Take a common weak reference pattern, for example:
```
__attribute__((weak)) void undef_weak_fun();
if (&undef_weak_fun)
  undef_weak_fun();
```
In this case, an undefined weak symbol `undef_weak_fun` has an address
of zero, and BOLT incorrectly changes the relocation for the
corresponding symbol to symbol@PLT, leading to incorrect runtime
behavior.
|
#
734c0488 |
| 07-Aug-2024 |
sinan <sinan.lin@linux.alibaba.com> |
[BOLT] Support map other function entry address (#101466)
Allow BOLT to map the old address to a new binary address if the old
address is the entry of the function.
|
#
3f51bec4 |
| 01-Aug-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Print timers in perf2bolt invocation
When BOLT is run in AggregateOnly mode (perf2bolt), it exits with code zero, so destructors are not run and TimerGroup never prints the timers.
Add explicit printing just before the exit to honor the options requesting timers (`--time-rewrite`, `--time-aggr`).
Test Plan: updated bolt/test/timers.c
Reviewers: ayermolo, maksfb, rafaelauler, dcci
Reviewed By: dcci
Pull Request: https://github.com/llvm/llvm-project/pull/101270
|
#
fb97b4f9 |
| 01-Aug-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Add timers for MetadataManager invocations
Test Plan: added bolt/test/timers.c
Reviewers: ayermolo, maksfb, rafaelauler, dcci
Reviewed By: dcci
Pull Request: https://github.com/llvm/llvm-project/pull/101267
show more ...
|
#
9b007a19 |
| 19-Jul-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Expose pseudo probe function checksum and GUID (#99389)
Add a BinaryFunction field for pseudo probe function GUID.
Populate it during pseudo probe section parsing, and emit it in YAML
profile (both regular and BAT), along with function checksum.
To be used for stale function matching.
Test Plan: update pseudoprobe-decoding-inline.test
|
#
51122fb4 |
| 17-Jul-2024 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][NFC] Fix build (#99361)
On clang 14 the build is failing with:
reference to local binding 'ParentName' declared in enclosing function
'llvm::bolt::RewriteInstance::registerFragments'
|
#
3fe50b6d |
| 17-Jul-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Store FileSymRefs in a multimap
With aggressive ICF, it's possible to have different local symbols (under different FILE symbols) to be mapped to the same address.
FileSymRefs only keeps a single SymbolRef per address, which prevents fragment matching from finding the correct symbol to perform parent function lookup.
Work around this issue by switching FileSymRefs to a multimap. In future, uses of FileSymRefs can be replaced with SortedSymbols which keeps essentially the same information.
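The difference between a map and a multimap here can be sketched as follows. This is an illustration with hypothetical types (`LocalSym`, `FileSymMap`, `symbolsAt`), not the actual FileSymRefs declaration, which stores symbol references rather than strings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a local symbol and the FILE symbol it is under.
struct LocalSym {
  std::string Name;
  std::string File;
};

// Address -> symbols. A multimap keeps every symbol at a given address,
// whereas a plain map would keep only one of them.
using FileSymMap = std::multimap<uint64_t, LocalSym>;

// Returns all local symbols registered at Address, in insertion order.
std::vector<LocalSym> symbolsAt(const FileSymMap &Map, uint64_t Address) {
  std::vector<LocalSym> Result;
  auto Range = Map.equal_range(Address);
  for (auto It = Range.first; It != Range.second; ++It)
    Result.push_back(It->second);
  return Result;
}
```

With all candidates at an address available, a caller can pick the symbol that belongs to the right file instead of being stuck with whichever one was stored first.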
Test Plan: added ambiguous_fragment.test
Reviewers: dcci, ayermolo, maksfb, rafaelauler
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/98992
|
#
344228eb |
| 02-Jul-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Drop macro-fusion alignment (#97358)
9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for
optimal macro-fusion alignment in BOLT. Remove the support in BOLT as
performance measurements with large binaries didn't show a significant
improvement.
Test Plan:
macro-fusion alignment was never upstreamed, so no upstream tests are
affected.
|
#
e3e0df39 |
| 02-Jul-2024 |
Fangrui Song <i@maskray.me> |
[BOLT] Replace the MCAsmLayout parameter with MCAssembler
Continue the MCAsmLayout removal work started by 67957a45ee1ec42ae1671cdbfa0d73127346cc95.
|