#
1b4bd4e1 |
| 25-Jan-2025 |
Maksim Panchenko <maks@fb.com> |
[BOLT][AArch64] Remove assertions from jump table heuristic (#124372)
The code for jump table detection on AArch64 asserts liberally whenever
the input instruction sequence does not match the expected pattern. As a
result, BOLT fails to process binaries with such sequences instead of
ignoring functions with unknown control flow.
Remove asserts in analyzeIndirectBranchFragment() and mark indirect
jumps as instructions with unknown control flow instead.
|
#
34c6c5e7 |
| 24-Jan-2025 |
Maksim Panchenko <maks@fb.com> |
[BOLT][AArch64] Fix PLT optimization (#124192)
Preserve C++ exception metadata while running PLT optimization on
AArch64.
|
#
ad599c25 |
| 20-Jan-2025 |
Alexey Moksyakov <yavtuk@yandex.ru> |
[BOLT][AArch64] Add isPush & isPop (#120713)
This functionality is needed for inliner pass and also for correct dyno
stats.
Needed for [PR](https://github.com/llvm/llvm-project/pull/120187)
|
#
ee428225 |
| 17-Jan-2025 |
Nicholas <45984215+liusy58@users.noreply.github.com> |
[BOLT][AArch64]support `inline-small-functions` for AArch64 (#120187)
Add some functions in `AArch64MCPlusBuilder.cpp` to support inline for
AArch64.
|
#
1fa02b96 |
| 17-Jan-2025 |
Nicholas <45984215+liusy58@users.noreply.github.com> |
[BOLT][AArch64] Speedup `computeInstructionSize` (#121106)
AArch64 instructions have a fixed size of 4 bytes, so there is no need to compute it.
|
#
e11d49cb |
| 20-Dec-2024 |
Alexey Moksyakov <yavtuk@yandex.ru> |
[BOLT][AArch64] Adds tls relocations support (#117465)
Co-authored-by: yavtuk <yavtuk@ya.ru>
|
#
be89e794 |
| 12-Nov-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT][AArch64] Add support for long absolute LLD thunks/veneers (#113408)
Absolute thunks generated by LLD reference function addresses recorded
as data in code. Since they are generated by the linker, they don't have
relocations associated with them and thus the addresses are left
undetected. Use pattern matching to detect such thunks and handle them
in VeneerElimination pass.
|
#
3023b15f |
| 09-May-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Support POSSIBLE_PIC_FIXED_BRANCH
Detect and support fixed PIC indirect jumps of the following form:
```
movslq En(%rip), %r1
leaq PIC_JUMP_TABLE(%rip), %r2
addq %r2, %r1
jmpq *%r1
```
with PIC_JUMP_TABLE that looks like following:
```
JT:  ----------
 E1:| L1 - JT  |
    |----------|
 E2:| L2 - JT  |
    |----------|
    |          |
       ......
 En:| Ln - JT  |
     ----------
```
The code could be produced by compilers, see https://github.com/llvm/llvm-project/issues/91648.
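As a rough illustration of the table layout described above, each entry stores the signed 32-bit difference `Ln - JT`, so a target is recovered by adding the entry back to the table address. The sketch below is not BOLT code; the function name `resolvePICJumpTable` and its parameters are hypothetical and chosen only for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of BOLT): given the address of a PIC jump
// table and its signed 32-bit entries (each holding Ln - JT), recover the
// absolute target addresses Ln = JT + entry.
std::vector<uint64_t> resolvePICJumpTable(uint64_t JTAddress,
                                          const std::vector<int32_t> &Entries) {
  std::vector<uint64_t> Targets;
  Targets.reserve(Entries.size());
  for (int32_t Offset : Entries) {
    // Sign-extend the entry, mirroring what movslq does, before adding it
    // to the table base. Unsigned wraparound handles negative offsets.
    Targets.push_back(JTAddress + static_cast<uint64_t>(static_cast<int64_t>(Offset)));
  }
  return Targets;
}
```

For a table at `0x1000` with entries `0x20` and `-0x10`, this yields targets `0x1020` and `0xFF0`.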
Test Plan: updated jump-table-fixed-ref-pic.test
Reviewers: maksfb, ayermolo, dcci, rafaelauler
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/91667
|
#
e2cee2c1 |
| 05-Jul-2024 |
Ádám Kallai <kadam@inf.u-szeged.hu> |
[BOLT][AArch64] Fixes assertion errors occurred when perf2bolt was executed (#83394)
BOLT only checks for the most common indirect branch pattern during
branch analysis.
Extended the logic with two other indirect patterns which slightly
differ from the expected one.
Those patterns may be hit when statically linking libc (pattern 2
requires 'lld' linker).
As a workaround mark them as UNKNOWN branch for now.
Fixes: #83114
|
#
344228eb |
| 02-Jul-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Drop macro-fusion alignment (#97358)
9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for
optimal macro-fusion alignment in BOLT. Remove the support in BOLT as
performance measurements with large binaries didn't show a significant
improvement.
Test Plan:
macro-fusion alignment was never upstreamed, so no upstream tests are
affected.
|
#
a13bc971 |
| 11-Jun-2024 |
Paschalis Mpeis <paschalis.mpeis@arm.com> |
[BOLT][AArch64] Implement PLTCall optimization (#93584)
`convertCallToIndirectCall` applies the PLTCall optimization and returns
an (updated if needed) iterator to the converted call instruction. Since
implementing this pass on AArch64 requires injecting additional
instructions, the relevant BasicBlock and an iterator are passed to
`convertCallToIndirectCall`.
`NumCallsOptimized` is updated only on successful application of the
pass.
Tests:
- Inputs/plt-tailcall.c: an example of a tail call optimized PLT call.
- AArch64/plt-call.test: it is the actual A64 test, that runs the
PLTCall optimization on the above input file and verifies the
application of the pass to the calls: 'printf' and 'puts'.
|
#
3fefb3c5 |
| 07-Jun-2024 |
Nathan Sidwell <nathan@acm.org> |
[BOLT][NFC] Infailable fns return void (#92018)
Both `reverseBranchCondition` and `replaceBranchTarget` return a success boolean. But all-but-one caller ignores the return value, and the exception emits a fatal error on failure.
Thus, just return nothing.
|
#
bba790db |
| 14-Mar-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Refactor instruction creation interface. NFCI (#85292)
Refactor MCPlusBuilder's create{Instruction}() functions that used to
return bool. We almost never check the return value as we rely on
llvm_unreachable() to detect unimplemented functionality. There were a
couple of cases that checked the return value, but they would hit the
unreachable condition first (at least in debug builds) before the return
value gets checked.
|
#
59ab86bb |
| 14-Mar-2024 |
Maksim Panchenko <maks@fb.com> |
[BOLT] Clear operands when creating new instructions. NFCI (#85191)
Reset operand list whenever we create a new instruction via a parameter
passed by reference. Most functions were already doing this, but there
are several places missing the reset. Potentially, if we do not clear
the list, it could lead to invalid instruction operands. But the existing
code is unaffected.
|
#
71c2a132 |
| 04-Mar-2024 |
sinan <sinan.lin@linux.alibaba.com> |
[BOLT] support AArch64 JUMP26 createRelocation (#83531)
Add R_AARCH64_JUMP26 implementation for createRelocation, which
could significantly reduce the number of failed scan-refs cases if we
perform bolt on a selective range of functions.
|
#
b98e6a5c |
| 27-Feb-2024 |
Elvina Yakubova <elvinayakubova@gmail.com> |
[BOLT][AArch64] Skip BBs only instead of functions (#81989)
After [this commit](https://github.com/llvm/llvm-project/commit/846eb76761c858cbfc75700bf68445e0e3ade48e)
we noticed that the size of the fdata file decreased considerably. A
better and more precise approach is to skip only the basic blocks with
exclusive instructions instead of the whole function.
|
#
f20af737 |
| 05-Dec-2023 |
eleviant <56861949+eleviant@users.noreply.github.com> |
[bolt] Support arm64 FP register spills (#73021)
At the moment llvm-bolt fails when analyzing jump tables on aarch64 in
case FP register spill/reload is used.
|
#
888742a1 |
| 03-Nov-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][AArch64] Handle .plt.got section (#71216)
It seems that currently this section is only created by the mold linker
if 2 conditions are met: 1. The PLT function was called directly. 2. An
indirect access to the PLT function was found (e.g. through an ADRP
relocation). Although mold created a symbol for every PLT entry, I
removed them from the yaml file to check that .plt.got was truly
disassembled by BOLT.
|
#
b6b49288 |
| 23-Oct-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT][RISCV] Set minimum function alignment to 2 for RVC (#69837)
In #67707, the minimum function alignment on RISC-V was set to 4. When
RVC (compressed instructions) is enabled, the minimum alignment can be
reduced to 2.
This patch implements this by delegating the choice of minimum alignment
to a new `MCPlusBuilder::getMinFunctionAlignment` function. This way,
the target-dependent code in `BinaryFunction` is minimized.
|
#
8fb83bf5 |
| 06-Oct-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT][NFC] Add MCSubtargetInfo to MCPlusBuilder (#68223)
On RISC-V, it's helpful to have access to `MCSubtargetInfo` while
generating instructions in `MCPlusBuilder`. For example, a return
instruction might be generated differently based on if the target
supports compressed instructions (`c.jr ra`) or not (`jalr ra`).
|
#
2d902d0f |
| 25-Sep-2023 |
Kepontry <zjpzhoujiapeng@163.com> |
[BOLT] Implement '--assume-abi' option for AArch64
This patch implements the `getCalleeSavedRegs` function for AArch64, addressing the issue where the "not implemented" error occurs when both the `--assume-abi` option and options related to the RegAnalysis Pass (e.g., `--indirect-call-promotion=all`) are enabled.
|
#
846eb767 |
| 15-Sep-2023 |
Vladislav Khmelevsky <och95@yandex.ru> |
[BOLT][AArch64] Fix instrumentation deadloop
According to ARMv8-a architecture reference manual B2.10.5 software must avoid having any explicit memory accesses between exclusive load and associated store instruction. Otherwise exclusive monitor might clear the exclusivity without application-related cause which may result in the deadloop. Disable instrumentation for such functions, since between exclusive load and store there might be branches and we would insert instrumentation snippet which contains loads and stores.
A better solution would be to use BFS to find the exact BBs between the load and the store and not instrument them. Better still would be to recognize such sequences and replace them with a more complex one, e.g. loading the value non-exclusively and, on the branch where the exclusive store is made, performing the exclusive load and store sequentially. For now, just disable instrumentation for such functions completely.
Differential Revision: https://reviews.llvm.org/D159520
|
#
eafe4ee2 |
| 01-Sep-2023 |
Job Noorman <jnoorman@igalia.com> |
[BOLT] Rename isLoad/isStore to mayLoad/mayStore
As discussed in D159266, for some instructions it's impossible to know statically if they will load/store (e.g., predicated instructions). Therefore, mayLoad/mayStore are more appropriate names.
|
#
70405a0b |
| 08-Aug-2023 |
Elvina Yakubova <elvina.yakubova@huawei.com> |
[BOLT][Instrumentation] Add support for MacOS counters
This commit adds support for generation of getter counters for AArch64 MacOS. Continuation of work D151899
Reviewed By: rafauleir, yota9
Differential Revision: https://reviews.llvm.org/D151901
|
#
6e4c2305 |
| 08-Aug-2023 |
Elvina Yakubova <elvina.yakubova@huawei.com> |
[BOLT][Instrumentation] Initial instrumentation support for AArch64
This commit adds code generation for AArch64 instrumentation, including direct and indirect calls support.
Reviewed By: rafauler, yota9
Differential Revision: https://reviews.llvm.org/D151899
|