#
671095b4 |
| 16-Dec-2024 |
Nicholas <45984215+liusy58@users.noreply.github.com> |
[BOLT][AArch64] Check Last Element Instead of Returning `nullptr` in `lookupStubFromGroup` (#114015)
The current implementation of `lookupStubFromGroup` is incorrect. The
function is intended to fi
[BOLT][AArch64] Check Last Element Instead of Returning `nullptr` in `lookupStubFromGroup` (#114015)
The current implementation of `lookupStubFromGroup` is incorrect. The
function is intended to find and return the closest stub using
`lower_bound`, which identifies the first element in a sorted range that
is not less than a specified value. However, if such an element is not
found within `Candidates` and the list is not empty, the function
returns `nullptr`. Instead, it should check whether the last element
satisfies the condition.
show more ...
|
#
accd8f98 |
| 07-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[BOLT] Fix a warning
This patch:
bolt/lib/Passes/LongJmp.cpp:830:14: error: variable 'NumIterations' set but not used [-Werror,-Wunused-but-set-variable]
|
#
49ee6069 |
| 07-Nov-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT][AArch64] Add support for compact code model (#112110)
Add `--compact-code-model` option that executes alternative branch
relaxation with an assumption that the resulting binary has less than
[BOLT][AArch64] Add support for compact code model (#112110)
Add `--compact-code-model` option that executes alternative branch
relaxation with an assumption that the resulting binary has less than
128MB of code. The relaxation is done in `relaxLocalBranches()`, which
operates on a function level and executes on multiple functions in
parallel.
Running the new option on AArch64 Clang binary produces slightly smaller
code and the relaxation finishes in about 1/10th of the time.
Note that the new `.text` has to be smaller than 128MB, *and* `.plt` has
to be closer than 128MB to `.text`.
show more ...
|
#
cb9bacf5 |
| 17-Oct-2024 |
Paschalis Mpeis <paschalis.mpeis@arm.com> |
[AArch64][BOLT] Ensure tentative code layout for cold BBs runs. (#96609)
When split functions is used, BOLT may skip tentative code layout
estimation in some cases, like:
- when there is no profil
[AArch64][BOLT] Ensure tentative code layout for cold BBs runs. (#96609)
When split functions is used, BOLT may skip tentative code layout
estimation in some cases, like:
- when there is no profile data for some blocks (ie cold blocks)
- when there are cold functions in lite mode
- when skip functions is used
However, when rewriting the binary we still need to compute PC-relative
distances between hot and cold basic blocks. Without cold layout
estimation, BOLT uses '0x0' as the address of the first cold block,
leading to incorrect estimations of any PC-relative addresses.
This affects large binaries as the relaxStub method expands more
branches than necessary using the short-jump sequence, at it wrongly
believes it has exceeded the branch distance boundary.
This increases code size with both a larger and slower sequence;
however,
performance regression is expected to be minimal since this only affects
any called cold code.
Example of such an unnecessary relaxation:
from:
```armasm
b .Ltmp1234
```
to:
```armasm
adrp x16, .Ltmp1234
add x16, x16, :lo12:.Ltmp1234
br x16
```
show more ...
|
#
52cf0711 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augme
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.
In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.
Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
show more ...
|
#
13d60ce2 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch cont
[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch continue the migration
on libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.
Test Plan: NFC
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
show more ...
|
#
a5f3d1a8 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
in
[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
interface to `BinaryFunctionPass` to return an Error on
`runOnFunctions()`. This gives passes the ability to report a
serious problem to the caller (RewriteInstance class), so the
caller may decide how to best handle the exceptional situation.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
show more ...
|
#
fdb13cf5 |
| 11-Dec-2023 |
sinan <sinan.lin@linux.alibaba.com> |
[BOLT] Fix local out-of-range stub issue in LongJmp (#73918)
If a local stub is out-of-range, at LongJmp we will try to find another
local stub first. However, The original implementation do not wo
[BOLT] Fix local out-of-range stub issue in LongJmp (#73918)
If a local stub is out-of-range, at LongJmp we will try to find another
local stub first. However, The original implementation do not work as
expected and it leads to an infinite loop between replaceTargetWithStub
and fixBranches.
After this patch, we first convert the target of BB back to the target
of the local stub, and then look up for other valid local stubs and so
on.
show more ...
|
#
b7944f7c |
| 12-Oct-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Return proper minimal alignment from BF (#67707)
Currently minimal alignment of function is hardcoded to 2 bytes.
Add 2 more cases:
1. In case BF is data in code return the alignment of CI
[BOLT] Return proper minimal alignment from BF (#67707)
Currently minimal alignment of function is hardcoded to 2 bytes.
Add 2 more cases:
1. In case BF is data in code return the alignment of CI as minimal
alignment
2. For aarch64 and riscv platforms return the minimal value of 4 (added
test for aarch64)
Otherwise fallback to returning the 2 as it previously was.
show more ...
|
#
bae41ff5 |
| 07-Oct-2023 |
qijitao <119509978+qijitao@users.noreply.github.com> |
[BOLT] Fix long jump negative offset issue. (#67132)
In instruction encoding, the relative offset address of the PC is
signed, that is, the number of positive offset bits and the number of
negativ
[BOLT] Fix long jump negative offset issue. (#67132)
In instruction encoding, the relative offset address of the PC is
signed, that is, the number of positive offset bits and the number of
negative offset bits is asymmetric. Therefore, the maximum and minimum
values are used to replace Mask to determine the boundary.
Co-authored-by: qijitao <qijitao@hisilicon.com>
show more ...
|
#
be2f67c4 |
| 07-Feb-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Replace anonymous namespace functions with static
Follow LLVM Coding Standards guideline on using anonymous namespaces (https://llvm.org/docs/CodingStandards.html#anonymous-namespaces) a
[BOLT][NFC] Replace anonymous namespace functions with static
Follow LLVM Coding Standards guideline on using anonymous namespaces (https://llvm.org/docs/CodingStandards.html#anonymous-namespaces) and use `static` modifier for function definitions.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D143124
show more ...
|
#
8477bc67 |
| 17-Jul-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to
[BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to a strict hot/cold split).
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129518
show more ...
|
#
c4302e4f |
| 27-Jun-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use llvm::less_first
Follow the case of https://reviews.llvm.org/D126068 and simplify call sites with `llvm::less_first`.
Reviewed By: rafauler
Differential Revision: https://reviews.l
[BOLT][NFC] Use llvm::less_first
Follow the case of https://reviews.llvm.org/D126068 and simplify call sites with `llvm::less_first`.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128242
show more ...
|
#
d2c87699 |
| 24-Jun-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use range-based STL wrappers
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts accepting ranges.
Reviewed By: rafauler
Differential Revision: https://rev
[BOLT][NFC] Use range-based STL wrappers
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts accepting ranges.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128154
show more ...
|
#
8228c703 |
| 16-Jun-2022 |
Maksim Panchenko <maks@fb.com> |
[BOLT][NFCI] Refactor interface for adding basic blocks
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D127935
|
#
b92436ef |
| 05-Jun-2022 |
Fangrui Song <i@maskray.me> |
[bolt] Remove unneeded cl::ZeroOrMore for cl::opt options
|
#
a73b50ad |
| 27-May-2022 |
Balazs Benics <balazs.benics@sigmatechnology.se> |
Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable"
This reverts commit 3988bd13988aad72ec979beb2361e8738584926b.
Did not build on this bot: https://lab.llvm.org/buildbot#build
Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable"
This reverts commit 3988bd13988aad72ec979beb2361e8738584926b.
Did not build on this bot: https://lab.llvm.org/buildbot#builders/215/builds/6372
/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to ‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’ 177 | { return bool(_M_comp(*__it, __val)); }
show more ...
|
#
3988bd13 |
| 27-May-2022 |
Balazs Benics <balazs.benics@sigmatechnology.se> |
[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable
One could reuse this functor instead of rolling out your own version. There were a couple other cases where the code was similar, but
[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable
One could reuse this functor instead of rolling out your own version. There were a couple other cases where the code was similar, but not quite the same, such as it might have an assertion in the lambda or other constructs. Thus, I've not touched any of those, as it might change the behavior in some way.
As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal Chris Lattner > LLVM intentionally has a “yes, you can apply common sense judgement to > things” policy when it comes to code review. If you are doing mechanical > patches (e.g. adopting less_first) that apply to the entire monorepo, > then you don’t need everyone in the monorepo to sign off on it. Having > some +1 validation from someone is useful, but you don’t need everyone > whose code you touch to weigh in.
Differential Revision: https://reviews.llvm.org/D126068
show more ...
|
#
4c14519e |
| 20-Mar-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] LongJmp: Check for shouldEmit
Check that the function will be emitted in the final binary. Preserving old function address is needed in case it is PLT trampiline, that is currently not moved
[BOLT] LongJmp: Check for shouldEmit
Check that the function will be emitted in the final binary. Preserving old function address is needed in case it is PLT trampiline, that is currently not moved by the BOLT.
Differential Revision: https://reviews.llvm.org/D122098
show more ...
|
#
af9bdcfc |
| 19-Mar-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Align constant islands to 8 bytes
AArch64 requires CI to be aligned to 8 bytes due to access instructions restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes.
Different
[BOLT] Align constant islands to 8 bytes
AArch64 requires CI to be aligned to 8 bytes due to access instructions restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes.
Differential Revision: https://reviews.llvm.org/D122065
show more ...
|
#
5be5d0f5 |
| 16-Mar-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] LongJmp speedup refactoring
Run tentativeLayoutRelocMode twice only if UseOldText option was passed. Refactor BF loop to break on condtition met.
Differential Revision: https://reviews.llvm.
[BOLT] LongJmp speedup refactoring
Run tentativeLayoutRelocMode twice only if UseOldText option was passed. Refactor BF loop to break on condtition met.
Differential Revision: https://reviews.llvm.org/D121825
show more ...
|
#
62a289d8 |
| 15-Mar-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] LongJmp: Fix hot text section alignment
The BinaryEmitter uses opts::AlignText value to align the hot text section. Also check that the opts::AlignText is at least equal opts::AlignFunctions
[BOLT] LongJmp: Fix hot text section alignment
The BinaryEmitter uses opts::AlignText value to align the hot text section. Also check that the opts::AlignText is at least equal opts::AlignFunctions for the same reason, as described in D121392.
Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D121728
show more ...
|
#
8ab69baa |
| 10-Mar-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Set cold sections alignment explicitly
The cold text section alignment is set using the maximum alignment value passed to the emitCodeAlignment. In order to calculate tentetive layout right w
[BOLT] Set cold sections alignment explicitly
The cold text section alignment is set using the maximum alignment value passed to the emitCodeAlignment. In order to calculate tentetive layout right we will set the minimum alignment of such sections to the maximum possible function alignment explicitly.
Differential Revision: https://reviews.llvm.org/D121392
show more ...
|
#
04b87cf0 |
| 09-Mar-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] LongJmp: Use per-function alignment values
The per-function alignment values must be used in order to create tentative layout.
Differential Revision: https://reviews.llvm.org/D121298
|
#
f92ab6af |
| 29-Dec-2021 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Fix braces usage in Passes
Summary: Refactor bolt/*/Passes to follow the braces rule for if/else/loop from [LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry p
[BOLT][NFC] Fix braces usage in Passes
Summary: Refactor bolt/*/Passes to follow the braces rule for if/else/loop from [LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry picked from FBD33344642)
show more ...
|