Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
3c357a49 |
| 17-Dec-2024 |
Alexander Yermolovich <43973793+ayermolo@users.noreply.github.com> |
[BOLT] Add support for safe-icf (#116275)
Identical Code Folding (ICF) folds functions that are identical into one
function, and updates symbol addresses to the new address. This reduces
the size
[BOLT] Add support for safe-icf (#116275)
Identical Code Folding (ICF) folds functions that are identical into one
function, and updates symbol addresses to the new address. This reduces
the size of a binary, but can lead to problems. For example when
function pointers are compared. This can be done either explicitly in
the code or generated IR by optimization passes like Indirect Call
Promotion (ICP). After ICF what used to be two different addresses
become the same address. This can lead to a different code path being
taken.
This is where safe ICF comes in. Linker (LLD) does it using address
significant section generated by clang. If symbol is in it, or an object
doesn't have this section symbols are not folded.
BOLT does not have the information regarding which objects do not have
this section, so can't re-use this mechanism.
This implementation scans code section and conservatively marks
functions symbols as unsafe. It treats symbols as unsafe if they are
used in non-control flow instruction. It also scans through the data
relocation sections and does the same for relocations that reference a
function symbol. The latter handles the case when function pointer is
stored in a local or global variable, etc. If a relocation address
points within a vtable these symbols are skipped.
show more ...
|
#
2df48fa7 |
| 16-Dec-2024 |
Paschalis Mpeis <paschalis.mpeis@arm.com> |
[BOLT][AArch64] Enable function print after ADRRelaxation (#119869)
Introduce `--print-adr-relaxation` to print after ADR Relaxation pass.
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
4cab01f0 |
| 08-Oct-2024 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] Profile quality stats -- CFG discontinuity (#109683)
In a perfect profile, each positive-execution-count block in the
function’s CFG should be reachable from a positive-execution-count
func
[BOLT] Profile quality stats -- CFG discontinuity (#109683)
In a perfect profile, each positive-execution-count block in the
function’s CFG should be reachable from a positive-execution-count
function entry block through a positive-execution-count path. This new
pass checks how well the BOLT input profile satisfies this “CFG
continuity” property.
More specifically, for each of the hottest 1000 functions, the pass
calculates the function’s fraction of basic block execution counts that
is “unreachable”. It then reports the 95th percentile of the
distribution of the 1000 unreachable fractions in a single BOLT-INFO
line. The smaller the reported value is, the better the BOLT profile
satisfies the CFG continuity property.
The default value of 1000 above can be changed via the hidden BOLT
option `-num-functions-for-continuity-check=[N]`. If more detailed stats
are needed, `-v=1` can be added to the BOLT invocation: the hottest N
functions will be grouped into 5 equally-sized buckets, from the hottest
to the coldest; for each bucket, various summary statistics of the
distribution of the fractions and the raw unreachable execution counts
will be reported.
show more ...
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3 |
|
#
445023f1 |
| 07-Aug-2024 |
Vladislav Khmelevsky <och95@yandex.ru> |
Revert "[BOLT] Move ADRRelaxationPass (#101371)" (#102333)
This reverts commit 750b12f06badc4cdf767139c70090db62358bb44.
The pass should run after splitting phase, but before nop removal
|
#
750b12f0 |
| 07-Aug-2024 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT] Move ADRRelaxationPass (#101371)
For non-simple functions we need nop instruction to be presented to
transform ADR to ADRP+ADD sequence, so run this pass before remove nops
pass.
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
b686600a |
| 19-Jul-2024 |
Daniel Hill <dhhillaz@gmail.com> |
[BOLT] Skip instruction shortening (#93032)
Add the ability to disable the instruction shortening pass through
--shorten-instructions=false
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
a38f0157 |
| 23-May-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Set InitialDynoStats after EstimateEdgeCounts (#93218)
InitialDynoStats used to be assigned inside `runAllPasses`, but the
assignment executed before any of the passes. As we've moved
`Esti
[BOLT] Set InitialDynoStats after EstimateEdgeCounts (#93218)
InitialDynoStats used to be assigned inside `runAllPasses`, but the
assignment executed before any of the passes. As we've moved
`EstimateEdgeCounts` into a pass out of ProfileReader, it needs to
execute before initial dyno stats are set.
Thus move `InitialDynoStats` into BinaryContext and assignment into
`DynoStatsSetPass`.
show more ...
|
#
f3dc732b |
| 22-May-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Make estimateEdgeCounts a BinaryFunctionPass (#93074)
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
5fb59e74 |
| 25-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Print program stats in perf2bolt/aggregate-only mode (#89763)
|
Revision tags: llvmorg-18.1.4 |
|
#
e2d48239 |
| 11-Apr-2024 |
Nathan Sidwell <nathan@acm.org> |
[BOLT][NFC] Make RepRet X86-specific (#88286)
Bolt's RepRet pass is x86-specific, no need to add it for non-x86
targets.
|
Revision tags: llvmorg-18.1.3 |
|
#
51268a57 |
| 22-Mar-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Enable --keep-nops option for Linux kernel by default (#86349)
Preserve nop instructions in the Linux kernel since they could be used
for runtime patching.
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
52cf0711 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augme
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.
In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.
Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
show more ...
|
#
13d60ce2 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch cont
[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch continue the migration
on libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.
Test Plan: NFC
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
show more ...
|
#
a5f3d1a8 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
in
[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
interface to `BinaryFunctionPass` to return an Error on
`runOnFunctions()`. This gives passes the ability to report a
serious problem to the caller (RewriteInstance class), so the
caller may decide how to best handle the exceptional situation.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
show more ...
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
076bd22f |
| 29-Nov-2023 |
ShatianWang <38512325+ShatianWang@users.noreply.github.com> |
[BOLT] Add structure of CDSplit to SplitFunctions (#73430)
This commit establishes the general structure of the CDSplit strategy in
SplitFunctions without incorporating the exact splitting logic. W
[BOLT] Add structure of CDSplit to SplitFunctions (#73430)
This commit establishes the general structure of the CDSplit strategy in
SplitFunctions without incorporating the exact splitting logic. With
-split-functions -split-strategy=cdsplit, the SplitFunctions pass will
run twice: the first time is before function reordering and functions
are hot-cold split; the second time is after function reordering and
functions are hot-warm-cold split based on the fixed function ordering.
Currently, all functions are hot-warm split after the entry block in the
second splitting pass. Subsequent commits will introduce the precise
splitting logic. NFC.
show more ...
|
Revision tags: llvmorg-17.0.6 |
|
#
e823136d |
| 14-Nov-2023 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Refactor --keep-nops option. NFC. (#72228)
Run RemoveNops pass only if --keep-nops is set to false (default).
|
Revision tags: llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2 |
|
#
061adc18 |
| 30-Sep-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][NFC] Hide pass print options (#67718)
Most of the print options are hidden, make hidden them all.
|
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
f8730293 |
| 16-Jun-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Add minimal RISC-V 64-bit support
Just enough features are implemented to process a simple "hello world" executable and produce something that still runs (including libc calls). This was main
[BOLT] Add minimal RISC-V 64-bit support
Just enough features are implemented to process a simple "hello world" executable and produce something that still runs (including libc calls). This was mainly a matter of implementing support for various relocations. Currently, the following are handled:
- R_RISCV_JAL - R_RISCV_CALL - R_RISCV_CALL_PLT - R_RISCV_BRANCH - R_RISCV_RVC_BRANCH - R_RISCV_RVC_JUMP - R_RISCV_GOT_HI20 - R_RISCV_PCREL_HI20 - R_RISCV_PCREL_LO12_I - R_RISCV_RELAX - R_RISCV_NONE
Executables linked with linker relaxation will probably fail to be processed. BOLT relocates .text to a high address while leaving .plt at its original (low) address. This causes PC-relative PLT calls that were relaxed to a JAL to not fit their offset in an I-immediate anymore. This is something that will be addressed in a later patch.
Changes to the BOLT core are relatively minor. Two things were tricky to implement and needed slightly larger changes. I'll explain those below.
The R_RISCV_CALL(_PLT) relocation is put on the first instruction of a AUIPC/JALR pair, the second does not get any relocation (unlike other PCREL pairs). This causes issues with the combinations of the way BOLT processes binaries and the RISC-V MC-layer handles relocations: - BOLT reassembles instructions one by one and since the JALR doesn't have a relocation, it simply gets copied without modification; - Even though the MC-layer handles R_RISCV_CALL properly (adjusts both the AUIPC and the JALR), it assumes the immediates of both instructions are 0 (to be able to or-in a new value). This will most likely not be the case for the JALR that got copied over.
To handle this difficulty without resorting to RISC-V-specific hacks in the BOLT core, a new binary pass was added that searches for AUIPC/JALR pairs and zeroes-out the immediate of the JALR.
A second difficulty was supporting ABS symbols. As far as I can tell, ABS symbols were not handled at all, causing __global_pointer$ to break. RewriteInstance::analyzeRelocation was updated to handle these generically.
Tests are provided for all supported relocations. Note that in order to test the correct handling of PLT entries, an ELF file produced by GCC had to be used. While I tried to strip the YAML representation, it's still quite large. Any suggestions on how to improve this would be appreciated.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D145687
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4 |
|
#
62a2feff |
| 16-May-2023 |
Rafael Auler <rafaelauler@fb.com> |
[BOLT] Fix state of MCSymbols in lowering pass
We have mostly harmless data races when running BinaryContext::calculateEmittedSize() in parallel, while performing split function pass. However, it i
[BOLT] Fix state of MCSymbols in lowering pass
We have mostly harmless data races when running BinaryContext::calculateEmittedSize() in parallel, while performing split function pass. However, it is possible to end up in a state where some MCSymbols are still registered and our clean up failed. This happens rarely but it does happen, and when it happens, it is a difficult to diagnose heisenbug. To avoid this, add a new clean pass to perform a last check on MCSymbols, before they undergo our final emission pass, to verify that they are in a sane state. If we fail to do this, we might resolve some symbols to zero and crash the output binary.
Reviewed By: #bolt, Amir
Differential Revision: https://reviews.llvm.org/D137984
show more ...
|
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6 |
|
#
17ed8f29 |
| 16-Nov-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][AArch64] Handle adrp+ld64 linker relaxations
Linker might relax adrp + ldr got address loading to adrp + add for local non-preemptible symbols (e.g. hidden/protected symbols in executable). A
[BOLT][AArch64] Handle adrp+ld64 linker relaxations
Linker might relax adrp + ldr got address loading to adrp + add for local non-preemptible symbols (e.g. hidden/protected symbols in executable). As usually linker doesn't change relocations properly after relaxation, so we have to handle such cases by ourselves. To do that during relocations reading we change LD64 reloc to ADD if instruction mismatch found and introduce FixRelaxationPass that searches for ADRP+ADD pairs and after performing some checks we're replacing ADRP target symbol to already fixed ADDs one.
Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D138097
show more ...
|
#
be9d3ede |
| 20-Dec-2022 |
Maksim Panchenko <maks@fb.com> |
[BOLT][NFC] Remove unused PrintInstructions argument
PrintInstructions was unused in BinaryFunction::print() and dump().
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D140440
|
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
1fb18619 |
| 27-Jun-2022 |
Alexey Moksyakov <alexey.moksyakov@huawei.com> |
adds huge pages support of PIE/no-PIE binaries
This patch adds the huge pages support (-hugify) for PIE/no-PIE binaries. Also returned functionality to support the kernels < 5.10 where there is a pr
adds huge pages support of PIE/no-PIE binaries
This patch adds the huge pages support (-hugify) for PIE/no-PIE binaries. Also returned functionality to support the kernels < 5.10 where there is a problem in a dynamic loader with the alignment of pages addresses.
Differential Revision: https://reviews.llvm.org/D129107
show more ...
|
#
4f158995 |
| 27-Aug-2022 |
Rafael Auler <rafaelauler@fb.com> |
[BOLT] Add pass to fix ambiguous memory references
This adds a round of checks to memory references, looking for incorrect references to jump table objects. Fix them by replacing the jump table refe
[BOLT] Add pass to fix ambiguous memory references
This adds a round of checks to memory references, looking for incorrect references to jump table objects. Fix them by replacing the jump table reference with another object reference + offset.
This solves bugs related to regular data references in code accidentally being bound to a jump table, and this reference being updated to a new (incorrect) location because we moved this jump table.
Fixes #55004
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D134098
show more ...
|
#
35efe1d8 |
| 06-Jul-2022 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT.
Vladislav Khmelevsky, Advanced Software Techn
[BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT.
Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D129260
show more ...
|
#
66b01a89 |
| 30-Jun-2022 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Fix getDynoStats to handle BCs with no functions
Address fuzzer crash
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D120696
|