#
92301180 |
| 22-Nov-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Use compact EH format for fixed-address executables (#117274)
Use ULEB128 format for emitting LSDAs for fixed-address executables,
similar to what we use for PIEs/DSOs. Main difference is that we don't
use landing pad trampolines when landing pads are not contained in a
single fragment. Instead, we fall back to emitting larger fixed-address
LSDAs, which is still better than adding trampoline instructions.
|
#
105ecd8b |
| 22-Nov-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Avoid EH trampolines for PIEs/DSOs (#117106)
We used to emit EH trampolines for PIE/DSO whenever a function fragment
contained a landing pad outside of it. However, it is common to have all
landing pads in a cold fragment even when their throwers are in a hot
one.
To reduce the number of trampolines, analyze landing pads for any given
function fragment, and if they all belong to the same (possibly
different) fragment, designate that fragment as a landing pad fragment
for the "thrower" fragment. Later, emit the landing pad fragment symbol as
the LPStart for the thrower LSDA.
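
The landing pad analysis described above can be sketched as follows. This is a minimal illustration, not BOLT's implementation: the `Block` struct and `findLandingPadFragment` are hypothetical names, and a function is modeled as a flat list of blocks tagged with fragment indices.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified model: each basic block records the fragment it was
// placed in and the indices of the blocks that act as its landing pads.
struct Block {
  unsigned Fragment;
  std::vector<unsigned> LandingPads;
};

// Returns the single fragment holding every landing pad used by blocks of the
// `Thrower` fragment, or std::nullopt if there are no landing pads or they are
// spread over more than one fragment.
std::optional<unsigned>
findLandingPadFragment(const std::vector<Block> &Blocks, unsigned Thrower) {
  std::optional<unsigned> LPFragment;
  for (const Block &BB : Blocks) {
    if (BB.Fragment != Thrower)
      continue;
    for (unsigned LP : BB.LandingPads) {
      assert(LP < Blocks.size() && "landing pad index out of range");
      unsigned F = Blocks[LP].Fragment;
      if (LPFragment && *LPFragment != F)
        return std::nullopt; // landing pads live in several fragments
      LPFragment = F;
    }
  }
  return LPFragment;
}
```

When the function returns a fragment, that fragment's symbol would play the role of LPStart for the thrower; when it returns `std::nullopt`, a trampoline or another fallback is still needed.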
|
#
dd09a7db |
| 02-May-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Add split function support for the Linux kernel (#90541)
While rewriting the Linux kernel, we try to fit optimized functions into
their original boundaries. When a function becomes larger, we skip it
during the rewrite and end up with less than optimal code layout. To
overcome that issue, add support for the --split-functions option so that the
hot part of the function can fit into the original space. The cold part
should go to reserved space in the binary.
|
#
fd38366e |
| 01-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Clean includes, add license headers (#87200)
|
#
52cf0711 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are printed to the screen once the `--log-file` option is
used.
In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.
Because this is a significant change by itself, not all code has
been ported yet. Code from the Profiler libs (DataAggregator and friends)
still prints errors directly to the screen.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
|
#
a5f3d1a8 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
interface to `BinaryFunctionPass` to return an Error on
`runOnFunctions()`. This gives passes the ability to report a
serious problem to the caller (RewriteInstance class), so the
caller may decide how to best handle the exceptional situation.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
|
#
15774834 |
| 21-Dec-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] Don't split likely fallthrough in CDSplit (#76164)
This diff speeds up CDSplit by not considering any hot-warm splitting
point that could break a fall-through branch from a basic block to its
most likely successor.
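
A minimal sketch of the filtering idea, under assumptions that are not taken from the patch: blocks are indexed in layout order, a split index `I` means blocks `[0, I]` stay hot and the rest become warm, and `MostLikelySucc` (a hypothetical input) gives each block's most likely successor, if any.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Returns the split indices that remain candidates after dropping every index
// whose split would separate a block from its most likely successor when that
// successor is the next block in layout (a likely fall-through).
std::vector<std::size_t> candidateSplitPoints(
    const std::vector<std::optional<std::size_t>> &MostLikelySucc) {
  std::vector<std::size_t> Candidates;
  for (std::size_t I = 0; I + 1 < MostLikelySucc.size(); ++I) {
    bool BreaksFallThrough =
        MostLikelySucc[I].has_value() && *MostLikelySucc[I] == I + 1;
    if (!BreaksFallThrough)
      Candidates.push_back(I);
  }
  return Candidates;
}
```

Fewer candidates means fewer split scores to evaluate, which is where the speedup in this sketch comes from.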
Co-authored-by: spupyrev <spupyrev@fb.com>
|
#
296088bd |
| 01-Dec-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT][NFC] Remove unused code for CDSplit (#74136)
This diff removes JumpInfo related code that is no longer needed by
CDSplit from SplitFunctions.cpp.
|
#
4483cf2d |
| 01-Dec-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] CDSplit main logic part 2/2 (#74032)
This diff implements the main splitting logic of CDSplit. CDSplit
processes functions in a binary in parallel. For each function BF, it
assumes that all other functions are hot-cold split. For each possible
hot-warm split point of BF, it computes its corresponding SplitScore,
and chooses the split point with the best SplitScore. The SplitScore of
each split point is computed in the following way: each call edge or
jump edge has an edge score that is proportional to its execution count,
and inversely proportional to its distance. The SplitScore of a split
point is a sum of edge scores over a fixed set of edges whose distance
can change due to hot-warm splitting BF. This set contains all cover
calls in the form of X->Y or Y->X given function order [... X ... BF ...
Y ...]; we refer to the sum of edge scores over the set of cover calls
as CoverCallScore. This set also contains all jump edges (branches)
within BF as well as all call edges originated from BF; we refer to the
sum of edge scores over this set of edges as LocalScore. CDSplit finds
the split index maximizing CoverCallScore + LocalScore.
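
The scoring loop can be illustrated with a toy model. Everything below is an assumption made for the sketch, not BOLT's actual code: the exact score formula (`Count / (Distance + 1)`), the flat address model in which hot blocks start at address 0 and warm blocks start at `WarmBase`, the omission of call edges originating from BF in the local score, and the names `Edge`, `CoverCall`, and `bestSplitIndex`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Edge {            // jump edge between two blocks of BF
  std::size_t Src, Dst;
  std::uint64_t Count;   // execution count
};
struct CoverCall {       // call X->Y or Y->X that spans BF
  std::uint64_t Count;
  std::uint64_t BaseDistance; // distance excluding BF's hot fragment
};

// Score grows with the execution count and shrinks with the distance.
double edgeScore(std::uint64_t Count, std::uint64_t Distance) {
  return static_cast<double>(Count) / static_cast<double>(Distance + 1);
}

// Tries every split index (blocks [0, Split] hot, the rest warm) and returns
// the one with the highest CoverCallScore + LocalScore in this toy model.
std::size_t bestSplitIndex(const std::vector<std::uint64_t> &BlockSizes,
                           const std::vector<Edge> &Jumps,
                           const std::vector<CoverCall> &CoverCalls,
                           std::uint64_t WarmBase) {
  assert(!BlockSizes.empty() && "function must have at least one block");
  std::size_t Best = 0;
  double BestScore = -1.0;
  for (std::size_t Split = 0; Split < BlockSizes.size(); ++Split) {
    // Lay out hot blocks from address 0 and warm blocks from WarmBase.
    std::vector<std::uint64_t> Addr(BlockSizes.size());
    std::uint64_t HotSize = 0, WarmEnd = WarmBase;
    for (std::size_t I = 0; I < BlockSizes.size(); ++I) {
      if (I <= Split) {
        Addr[I] = HotSize;
        HotSize += BlockSizes[I];
      } else {
        Addr[I] = WarmEnd;
        WarmEnd += BlockSizes[I];
      }
    }
    double LocalScore = 0.0;
    for (const Edge &E : Jumps) {
      std::uint64_t A = Addr[E.Src], B = Addr[E.Dst];
      LocalScore += edgeScore(E.Count, A > B ? A - B : B - A);
    }
    double CoverCallScore = 0.0;
    for (const CoverCall &C : CoverCalls)
      CoverCallScore += edgeScore(C.Count, C.BaseDistance + HotSize);
    double Score = CoverCallScore + LocalScore;
    if (Score > BestScore) {
      BestScore = Score;
      Best = Split;
    }
  }
  return Best;
}
```

The two terms pull in opposite directions: a smaller hot fragment shortens cover calls, while moving blocks to the warm fragment lengthens the jumps that cross the split.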
|
#
56bbf813 |
| 01-Dec-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] CDSplit main logic part 1/2 (#73895)
This diff defines and initializes auxiliary variables used by CDSplit
and implements two important helper functions. The first helper function
approximates the block level size increase if a function is hot-warm
split at a given split index (X86 specific). The second helper function
finds all calls in the form of X->Y or Y->X for each BF given function
order [... X ... BF ... Y ...]. These calls are referred to as "cover
calls". Their distance will decrease if BF's hot fragment size is
further reduced by hot-warm splitting. NFC.
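
As a rough illustration of the second helper, the sketch below collects cover calls for one function. It assumes, hypothetically, that functions are identified by their position in the final function order; `Call` and `findCoverCalls` are illustrative names, not BOLT's.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A call between two functions, each identified by its position in the
// function order (hypothetical representation).
struct Call {
  std::size_t Caller, Callee;
};

// Returns the calls X->Y or Y->X with X placed before BF and Y placed after
// it, i.e. the calls whose distance depends on BF's hot fragment size.
std::vector<Call> findCoverCalls(const std::vector<Call> &Calls,
                                 std::size_t BFPos) {
  std::vector<Call> Cover;
  for (const Call &C : Calls) {
    std::size_t Lo = std::min(C.Caller, C.Callee);
    std::size_t Hi = std::max(C.Caller, C.Callee);
    if (Lo < BFPos && BFPos < Hi)
      Cover.push_back(C);
  }
  return Cover;
}
```

Calls that start or end in BF itself are excluded here because neither endpoint lies strictly on the other side of BF.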
|
#
c43d0432 |
| 30-Nov-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] Create .text.warm for 3-way splitting (#73863)
This commit explicitly adds a warm code section, .text.warm, when
-split-functions -split-strategy=cdsplit is used. This replaces the
previous approach of using .text.cold.0 as warm and .text.cold.1 as cold
in 3-way function splitting. NFC.
|
#
076bd22f |
| 29-Nov-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] Add structure of CDSplit to SplitFunctions (#73430)
This commit establishes the general structure of the CDSplit strategy in
SplitFunctions without incorporating the exact splitting logic. With
-split-functions -split-strategy=cdsplit, the SplitFunctions pass will
run twice: the first time is before function reordering and functions
are hot-cold split; the second time is after function reordering and
functions are hot-warm-cold split based on the fixed function ordering.
Currently, all functions are hot-warm split after the entry block in the
second splitting pass. Subsequent commits will introduce the precise
splitting logic. NFC.
|
#
c30ab9dc |
| 27-Feb-2023 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Log reversing splitting decision
Expose log for testing purposes.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D144674
|
#
2563fd63 |
| 06-Dec-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Use std::optional in MCPlusBuilder
Reviewed By: maksfb, #bolt
Differential Revision: https://reviews.llvm.org/D139260
|
#
3ac46f37 |
| 09-Sep-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Emit LSDA call sites for all fragments
For exception handling, LSDA call sites have to be emitted for each fragment individually. With this patch, call sites and respective LSDA symbols are generated and associated with each fragment of their function, such that they can be used by the emitter.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132052
|
#
ae2b4da1 |
| 09-Sep-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Fragment all blocks (not just outlineable blocks)
To enable split strategies that require view of the entire CFG (e.g. to estimate cost of path from entry block), with this patch, all blocks of a function are passed to `SplitStrategy::fragment`. Because this might move non-outlineable blocks into a split fragment, these blocks are moved back into the main fragment after fragmenting. This also gives strategies the option to specify whether empty fragments should be kept or removed.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132423
|
#
4fdbe985 |
| 08-Sep-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Introduce SplitStrategy ABC
This introduces an abstract base class for splitting strategies to document the interface a strategy needs to implement, and also to avoid code bloat of the `splitFunction` method.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132054
|
#
a0c7ca8a |
| 03-Sep-2022 |
Kazu Hirata <kazu@google.com> |
[BOLT] Use range-based for loops (NFC)
LLVM Coding Standards discourage for_each unless callable objects already exist.
|
#
e001a4e4 |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Insert EH trampolines for multiple fragments
This patch adds exception handling trampolines when a function is split into more than two fragments. Trampolines are tracked per-fragment, such that they can be removed if splitting is reversed.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132048
|
#
48ff38ce |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Add randomN split strategy
This adds a strategy to split functions into a random number of fragments at randomly chosen split points.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D130647
|
#
f428db7a |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Add split all blocks strategy
This adds a function splitting strategy that splits each outlineable basic block into its own fragment. This is exposed through a new command line option `--split-strategy`.
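
A minimal sketch of the idea, assuming a hypothetical representation in which a strategy returns one fragment index per block and non-outlineable blocks stay in the main fragment (index 0); `splitAllBlocks` is an illustrative name, not the class added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assigns every outlineable block to a fresh fragment of its own; all other
// blocks remain in the main fragment 0.
std::vector<unsigned> splitAllBlocks(const std::vector<bool> &CanOutline) {
  std::vector<unsigned> FragmentOf(CanOutline.size(), 0);
  unsigned NextFragment = 1;
  for (std::size_t I = 0; I < CanOutline.size(); ++I)
    if (CanOutline[I])
      FragmentOf[I] = NextFragment++;
  return FragmentOf;
}
```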
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D129827
|
#
0f74d191 |
| 19-Aug-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Generate sections for multiple fragments
This patch adds support to generate any number of sections that are assigned to fragments of functions that are split more than two-way. With this, a function's *nth* split fragment goes into section `.text.cold.n`.
This also changes `FunctionLayout::erase` to make sure that there are no empty fragments at the end of the function. This sometimes happens when blocks are erased from the function. To avoid creating symbols pointing to these fragments, they need to be removed.
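
The trailing-fragment cleanup can be pictured with the sketch below. It is not the real `FunctionLayout::erase`: a layout is modeled, hypothetically, as a vector of fragments holding block ids, and the sketch assumes the main fragment (index 0) is always kept.

```cpp
#include <cassert>
#include <vector>

using Fragment = std::vector<int>; // block ids in one fragment (hypothetical)

// Drops empty fragments from the end of the layout so that no symbol is
// created for them. Empty fragments in the middle are left untouched, and the
// main fragment at index 0 is kept even if it is empty (an assumption).
void dropTrailingEmptyFragments(std::vector<Fragment> &Fragments) {
  while (Fragments.size() > 1 && Fragments.back().empty())
    Fragments.pop_back();
}
```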
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D130521
|
#
8477bc67 |
| 17-Jul-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to a strict hot/cold split).
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129518
|
#
d55dfeaf |
| 14-Jul-2022 |
Fabian Parzefall <parzefall@fb.com> |
[BOLT] Replace uses of layout with basic block list
As we are moving towards support for multiple fragments, loops that iterate over all basic blocks of a function, but do not depend on the order of basic blocks in the final layout, should iterate over binary functions directly, rather than the layout.
Eventually, all loops using the layout list should either iterate over the function, or be aware of multiple layouts. This patch replaces references to a binary function's block layout with the binary function itself wherever only small code changes are necessary.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129585
|
#
ed743045 |
| 24-Jun-2022 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Fix EH trampoline backout code
When the SplitFunctions pass adds trampoline code for exception landing pads (limited to shared objects), it may increase the size of the hot fragment, making it larger than the whole function pre-split. When this happens, the pass reverts the splitting action by restoring the original block order and marking all blocks hot.
However, if createEHTrampolines() added new blocks to the CFG and modified invoke instructions, simply restoring the original block layout will not suffice as the new CFG has more blocks.
For proper backout of the split, modify the original layout by merging in trampoline blocks immediately before their matching targets. As a result, the number of blocks increases, but the number of instructions and the function size remain the same as pre-split.
Add an assertion for the number of blocks when updating a function layout.
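
The backout step can be sketched as follows, assuming a hypothetical model in which blocks are integer ids and `TrampolineFor` maps each landing pad target to the trampoline block created for it; `backOutSplit` is an illustrative name, not a BOLT function.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Rebuilds the pre-split block order, placing each trampoline block
// immediately before the target it jumps to.
std::vector<int> backOutSplit(const std::vector<int> &OriginalLayout,
                              const std::map<int, int> &TrampolineFor) {
  std::vector<int> Restored;
  Restored.reserve(OriginalLayout.size() + TrampolineFor.size());
  for (int Block : OriginalLayout) {
    auto It = TrampolineFor.find(Block);
    if (It != TrampolineFor.end())
      Restored.push_back(It->second); // trampoline goes right before its target
    Restored.push_back(Block);
  }
  // Mirrors the block-count check mentioned above: every trampoline must have
  // been merged into the restored layout.
  assert(Restored.size() == OriginalLayout.size() + TrampolineFor.size() &&
         "restored layout has an unexpected number of blocks");
  return Restored;
}
```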
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128696
|