Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
22c8b1d8 |
| 27-Sep-2024 |
Daniel Hoekwater <hoekwater@google.com> |
[BranchRelaxation] Remove quadratic behavior in relaxation pass (#96250)
Currently, we recompute block offsets after each relaxation. This causes
the complexity to be O(n^2) in the number of instructions, inflating
compile time.
If we instead recompute block offsets after each iteration of the outer
loop, the complexity is O(n). Recomputing offsets in the outer loop will
cause some out-of-range branches to be missed in the inner loop, but
they will be relaxed in the next iteration of the outer loop.
This change may introduce unnecessary relaxations for an architecture
where the relaxed branch is smaller than the unrelaxed branch, but AFAIK
there is no such architecture.
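As a rough illustration of the loop structure described above, here is a minimal toy model in C++. It is not LLVM's code: `ToyFunction`, `Branch`, `ShortRange`, `RelaxGrowth`, and the helper names are all hypothetical, and distances are measured between block starts to keep the model small. The point it tries to show is that offsets are recomputed once per outer sweep, so an inner-loop range check may use stale offsets, and any branch missed that way is picked up by the next sweep.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: these types and constants are hypothetical, not LLVM's API.
struct Branch {
  std::size_t Block;  // index of the block that contains the branch
  std::size_t Target; // index of the destination block
  bool Relaxed;       // true once the branch was rewritten to its long form
};

struct ToyFunction {
  std::vector<std::int64_t> BlockSize;   // size of each block in bytes
  std::vector<std::int64_t> BlockOffset; // start offset of each block
  std::vector<Branch> Branches;
};

constexpr std::int64_t ShortRange = 64; // assumed reach of an unrelaxed branch
constexpr std::int64_t RelaxGrowth = 8; // assumed bytes added by a relaxation

// One linear sweep over all blocks.
void recomputeOffsets(ToyFunction &F) {
  F.BlockOffset.assign(F.BlockSize.size(), 0);
  std::int64_t Offset = 0;
  for (std::size_t I = 0; I < F.BlockSize.size(); ++I) {
    F.BlockOffset[I] = Offset;
    Offset += F.BlockSize[I];
  }
}

// Distances are measured between block starts to keep the model small.
bool inRange(const ToyFunction &F, const Branch &B) {
  std::int64_t Dist = F.BlockOffset[B.Target] - F.BlockOffset[B.Block];
  if (Dist < 0)
    Dist = -Dist;
  return Dist <= ShortRange;
}

// Relaxes until a whole sweep finds nothing to do; returns the sweep count.
int relaxAll(ToyFunction &F) {
  int Sweeps = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    ++Sweeps;
    recomputeOffsets(F); // once per outer iteration, not once per relaxation
    for (Branch &B : F.Branches) {
      if (B.Relaxed || inRange(F, B))
        continue; // may be a stale "in range"; the next sweep catches it
      B.Relaxed = true;
      F.BlockSize[B.Block] += RelaxGrowth;
      Changed = true;
    }
  }
  // Post-condition: every branch left in short form is really in range.
  for (const Branch &B : F.Branches)
    assert(B.Relaxed || inRange(F, B));
  return Sweeps;
}
```

With block sizes {10, 50, 100, 10} and branches 1 -> 3 and 0 -> 2, the first sweep relaxes only the 1 -> 3 branch; the 0 -> 2 branch still looks in range against the stale offsets and is relaxed in the second sweep, and a third sweep confirms that nothing is left to do.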
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
866ae69c |
| 29-Jul-2023 |
Daniel Hoekwater <hoekwater@google.com> |
[AArch64] [BranchRelaxation] Optimize for hot code size in AArch64 branch relaxation
On AArch64, it is safe to let the linker handle relaxation of unconditional branches; in most cases, the destination is within range, and the linker doesn't need to do anything. If the linker does insert fixup code, it clobbers the x16 inter-procedural register, so x16 must be available across the branch before linking. If x16 isn't available, but some other register is, we can relax the branch either by spilling x16 OR using the free register for a manually-inserted indirect branch.
This patch builds on D145211. While that patch is for correctness, this one is for performance of the common case. As noted in https://reviews.llvm.org/D145211#4537173, we can trust the linker to relax cross-section unconditional branches across which x16 is available.
Programs that use machine function splitting care most about the performance of hot code at the expense of the performance of cold code, so we prioritize minimizing hot code size.
Here's a breakdown of the cases:
Hot -> Cold [x16 is free across the branch] Do nothing; let the linker relax the branch.
Cold -> Hot [x16 is free across the branch] Do nothing; let the linker relax the branch.
Hot -> Cold [x16 used across the branch, but there is a free register] Spill x16; let the linker relax the branch.
Spilling requires fewer instructions than manually inserting an indirect branch.
Cold -> Hot [x16 used across the branch, but there is a free register] Manually insert an indirect branch.
Spilling would require adding a restore block in the hot section.
Hot -> Cold [No free regs] Spill x16; let the linker relax the branch.
Cold -> Hot [No free regs] Spill x16 and put the restore block at the end of the hot function; let the linker relax the branch.
Ex:
  [Hot section]
  func.hot:
    ... hot code...
  func.restore:
    ... restore x16 ...
    B func.hot

  [Cold section]
  func.cold:
    ... spill x16 ...
    B func.restore
Putting the restore block at the end of the function instead of just before the destination increases the cost of executing the restore, but it avoids putting cold code in the middle of hot code. Since the restore is very rarely taken, this is a worthwhile tradeoff.
Differential Revision: https://reviews.llvm.org/D156767
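The case breakdown in this message can be restated as a small decision function. The sketch below is only a paraphrase of that list; the enum and function names are hypothetical and do not correspond to anything in the AArch64 backend.

```cpp
#include <cassert>

// Hypothetical names; this only restates the case breakdown in the message.
enum class Section { Hot, Cold };

enum class Strategy {
  LetLinkerRelax,                    // do nothing in the compiler
  SpillX16ThenLetLinkerRelax,        // spill x16, linker inserts the fixup
  InsertIndirectBranch,              // use the free register for the jump
  SpillX16RestoreAtEndOfHotFunction, // restore block goes after the hot code
};

Strategy chooseStrategy(Section Src, Section Dest, bool X16Free,
                        bool HasFreeReg) {
  assert(Src != Dest && "only cross-section branches are covered here");
  if (X16Free)
    return Strategy::LetLinkerRelax;
  if (Src == Section::Hot)
    // Hot -> Cold: spilling is cheaper in the hot section than an indirect
    // branch, whether or not another register happens to be free.
    return Strategy::SpillX16ThenLetLinkerRelax;
  // Cold -> Hot: avoid a restore block in the hot section when possible.
  return HasFreeReg ? Strategy::InsertIndirectBranch
                    : Strategy::SpillX16RestoreAtEndOfHotFunction;
}
```

For example, a Cold -> Hot branch with x16 live and one other free register maps to `Strategy::InsertIndirectBranch`, matching the fourth case in the list.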
|
#
e223e456 |
| 21-Aug-2023 |
Daniel Hoekwater <hoekwater@google.com> |
Reland "[AArch64][CodeGen] Avoid inverting hot branches during relaxation"
This is a reland of 46d2d7599d9ed5e68fb53e910feb10d47ee2667b, which was reverted for breaking the build: https://lab.llvm.org/buildbot/#/builders/21/builds/78779. However, this buildbot is spuriously broken due to Flang::underscoring.f90 being nondeterministic.
|
#
0303137b |
| 21-Aug-2023 |
Daniel Hoekwater <hoekwater@google.com> |
Revert "[AArch64][CodeGen] Avoid inverting hot branches during relaxation"
This reverts commit 46d2d7599d9ed5e68fb53e910feb10d47ee2667b. Breaks build https://lab.llvm.org/buildbot/#/builders/21/builds/78779
|
#
46d2d759 |
| 01-Aug-2023 |
Daniel Hoekwater <hoekwater@google.com> |
[AArch64][CodeGen] Avoid inverting hot branches during relaxation
Current behavior for relaxing out-of-range conditional branches is to invert the conditional and insert a fallthrough unconditional branch to the original destination. This approach biases the branch predictor in the wrong direction, which can degrade performance.
Machine function splitting introduces many rarely-taken cross-section conditional branches, which are improperly relaxed. Avoid inverting these branches; instead, retarget them to trampolines at the end of the function. Doing so increases the runtime cost of jumping to cold code but eliminates the misprediction cost of jumping to hot code.
Differential Revision: https://reviews.llvm.org/D156837
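To make the difference between the two relaxation shapes concrete, the following toy sketch emits pseudo-assembly for each. The labels `.Lnext` and `.Ltramp`, the mnemonics, and the function names are illustrative assumptions, not output of the actual pass.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model; labels and mnemonics are illustrative only.
using Asm = std::vector<std::string>;

// Inversion: the short conditional branch skips over a long-range
// unconditional branch, so it is now taken on the common (hot) path.
Asm relaxByInversion(const std::string &InvertedCond, const std::string &Dest) {
  Asm Out;
  Out.push_back("b." + InvertedCond + " .Lnext");
  Out.push_back("b " + Dest); // long-range jump to the rarely-used target
  Out.push_back(".Lnext:");
  return Out;
}

struct TrampolineResult {
  Asm InPlace;       // replaces the original branch in the hot path
  Asm AtFunctionEnd; // stub appended after the function's hot code
};

// Trampoline: keep the original condition and retarget the branch to a stub
// at the end of the function, which then makes the long-range jump.
TrampolineResult relaxByTrampoline(const std::string &Cond,
                                   const std::string &Dest) {
  TrampolineResult R;
  R.InPlace.push_back("b." + Cond + " .Ltramp");
  R.AtFunctionEnd.push_back(".Ltramp:");
  R.AtFunctionEnd.push_back("b " + Dest);
  return R;
}
```

In the trampoline form the hot path still contains a single rarely-taken conditional branch; the extra jump is paid only when control actually goes to the cold code.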
|
Revision tags: llvmorg-18-init |
|
#
d7bca8e4 |
| 30-Jun-2023 |
Daniel Hoekwater <hoekwater@google.com> |
[AArch64] Relax cross-section branches
Because the code layout is not known during compilation, the distance of cross-section jumps is not knowable at compile-time. Because of this, we should assume that any cross-sectional jumps are out of range. This assumption is necessary for machine function splitting on AArch64, which introduces cross-section branches in the middle of functions. The linker relaxes out-of-range unconditional branches, but it clobbers X16 to do so; it doesn't relax conditional branches, which must be manually relaxed by the compiler.
Differential Revision: https://reviews.llvm.org/D145211
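The conservative assumption described in this message can be sketched as a range check that gives up as soon as source and destination live in different sections. `BlockPos` and `isBranchInRange` are hypothetical names for this illustration, not the pass's real interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy types; not the pass's real data structures.
struct BlockPos {
  int SectionID;       // which output section the block is placed in
  std::int64_t Offset; // offset of the block within its section
};

// Returns true only when the displacement is known and small enough.
bool isBranchInRange(const BlockPos &Src, const BlockPos &Dest,
                     std::int64_t MaxDisp) {
  // Offsets in different sections are not comparable before linking, so a
  // cross-section branch is always treated as out of range.
  if (Src.SectionID != Dest.SectionID)
    return false;
  std::int64_t Dist = Dest.Offset - Src.Offset;
  if (Dist < 0)
    Dist = -Dist;
  return Dist <= MaxDisp;
}
```

Two blocks with nearly identical offsets but different section IDs are still reported as out of range, which is what forces the compiler to relax cross-section conditional branches itself.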
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
#
8bf7f86d |
| 17-Apr-2023 |
Akshay Khadse <akshayskhadse@gmail.com> |
Fix uninitialized pointer members in CodeGen
This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148303
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
03bb277f |
| 31-Jan-2023 |
Philip Reames <preames@rivosinc.com> |
[BranchRelaxation] Strengthen post condition assertions
The whole point of this pass is to rewrite branches so that branches are in bounds. We should assert that we succeeded rather than just that we kept our internal data structure in sync.
Differential Revision: https://reviews.llvm.org/D142778
|
Revision tags: llvmorg-16.0.0-rc1 |
|
#
36244914 |
| 27-Jan-2023 |
Philip Reames <preames@rivosinc.com> |
Revert "[BranchRelaxation] Move faulting_op check into callee [nfc]"
This reverts commit c549da959b81902789a17918c5b95d4449e6fdfa. Per buildbots, this was not NFC.
|
#
c549da95 |
| 27-Jan-2023 |
Philip Reames <preames@rivosinc.com> |
[BranchRelaxation] Move faulting_op check into callee [nfc]
Mostly to remove a special case from an upcoming patch.
|
Revision tags: llvmorg-17-init |
|
#
5073a622 |
| 17-Jan-2023 |
Anshil Gandhi <gandhi21299@gmail.com> |
[MachineBasicBlock] Explicit FT branching param
Introduce a parameter in getFallThrough() to optionally allow returning the fall through basic block in spite of an explicit branch instruction to it. This parameter is set to false by default.
Introduce getLogicalFallThrough() which calls getFallThrough(false) to obtain the block while avoiding insertion of a jump instruction to its immediate successor.
This patch also reverts the changes made by D134557 and solves the case where a jump is inserted after another jump (branch-relax-no-terminators.mir).
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D140790
|
Revision tags: llvmorg-15.0.7 |
|
#
010a8f7a |
| 01-Dec-2022 |
ZHU Zijia <piggynl@outlook.com> |
[CodeGen] Fix restore blocks' BasicBlock information in branch relaxation
In the branch relaxation pass, restore blocks are created and placed before the jump destination if indirect branches are required. For example:
    foo
    sd s11, 0(sp)
    jump .restore, s11
    bar
    bar
    bar
    j .dest
  .restore:
    ld s11, 0(sp)
  .dest:
    baz
The BasicBlock information of the restore MachineBasicBlock should be identical to that of the dest MachineBasicBlock.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D131863
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3 |
|
#
d383adec |
| 14-Oct-2022 |
Anshil Gandhi <Anshil.Gandhi@amd.com> |
[BranchRelaxation] Fall through only if block has no unconditional branches
Prior to inserting an unconditional branch from X to its fall through basic block, check if X has any terminators to avoid inserting additional branches.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D134557
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
01be9be2 |
| 28-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup includes: final pass
Cleanup a few extra files, this closes the work on libLLVM dependencies on my side.
Impact on libLLVM preprocessed output: -35876 lines
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576
|
#
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
#
a278250b |
| 10-Mar-2022 |
Nico Weber <thakis@chromium.org> |
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
|
#
7f230fee |
| 07-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
after: 1061034926 before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
d5b73a70 |
| 23-Nov-2021 |
Kazu Hirata <kazu@google.com> |
[llvm] Use range-based for loops (NFC)
|
#
af2ae2cf |
| 05-Nov-2021 |
Michael Liao <michael.hliao@gmail.com> |
[BranchRelaxation] Fix warning on unused variable. NFC.
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
#
e6a4ba3a |
| 16-Jul-2021 |
Michael Liao <michael.hliao@gmail.com> |
[amdgpu] Handle the case where there is no scavenged register.
- When an unconditional branch is expanded into an indirect branch, if there is no scavenged register, an SGPR pair needs spilling to enable the destination PC calculation. In addition, before jumping into the destination, that clobbered SGPR pair needs restoring.
- As an SGPR cannot be spilled to or restored from memory directly, the spilling/restoring of that SGPR pair reuses the regular SGPR spilling support but without spilling it into memory. As the spilling and restoring points are fully controlled, we only need to spill that SGPR into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take an additional restore block, where the restoring code is filled. After that, the relaxation will place that restore block directly before the destination block and insert an unconditional branch in any fall-through block into the destination block.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D106449
show more ...
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
b04c181e |
| 17-Sep-2020 |
Philip Reames <listmail@philipreames.com> |
[AArch64] Enable implicit null check transformation
This change enables the generic implicit null transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support:
- An implicit null check is the use of a signal handler to catch a null-pointer access and redirect control to a handler. Specifically, it replaces an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control with appropriate metadata.
- FAULTING_OP is used to wrap the faulting instruction. It is modelled as being a conditional branch to reflect the fact it can transfer control in the CFG.
- FAULTING_OP does not need to be an analyzable branch to achieve its purpose. (Or at least, that's the x86 model. I find this slightly questionable.)
- When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction.
As can be seen in the test changes, currently the AArch64 backend does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code causes BranchFolding to crash. I haven't yet figured out if it's a latent bug in BranchFolding, or something I'm doing wrong.)
Differential Revision: https://reviews.llvm.org/D87851
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
#
1978309d |
| 19-Feb-2020 |
James Y Knight <jyknight@google.com> |
MachineBasicBlock::updateTerminator now requires an explicit layout successor.
Previously, it tried to infer the correct destination block from the successor list, but this is a rather tricky prospect, given the existence of successors that occur mid-block, such as invoke, and potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in particular, would be problematic, because its successor blocks are not distinct from "normal" successors, as EHPads are.)
Instead, require the caller to pass in the expected fallthrough successor explicitly. In most callers, the correct block is immediately clear. But, in MachineBlockPlacement, we do need to record the original ordering, before starting to reorder blocks.
Unfortunately, the goal of decoupling the behavior of end-of-block jumps from the successor list has not been fully accomplished in this patch, as there is currently no other way to determine whether a block is intended to fall-through, or end as unreachable. Further work is needed there.
Differential Revision: https://reviews.llvm.org/D79605
|
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1 |
|
#
886d2c2c |
| 19-Jan-2020 |
Fangrui Song <i@maskray.me> |
[BranchRelaxation] Simplify offset computation and fix a bug in adjustBlockOffsets()
If Start!=0, adjustBlockOffsets() may unnecessarily adjust the offset of Start. There is no correctness issue, but it can create more block splits.
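A minimal sketch of the idea, assuming a simplified block record: offsets are recomputed only for blocks after `Start`, and the offset of `Start` itself is left alone. `BlockInfo`, `alignTo`, and `adjustOffsetsAfter` are hypothetical stand-ins written for this illustration and do not mirror the real adjustBlockOffsets() signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simplified record; the real pass tracks more state.
struct BlockInfo {
  std::uint64_t Offset; // start offset of the block
  std::uint64_t Size;   // size of the block in bytes
  std::uint64_t Align;  // required alignment of the block start (>= 1)
};

// Rounds Value up to the next multiple of Align.
std::uint64_t alignTo(std::uint64_t Value, std::uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Recomputes offsets for every block after Start. Start keeps its offset:
// only its size may have changed, which shifts the blocks that follow it.
void adjustOffsetsAfter(std::vector<BlockInfo> &Blocks, std::size_t Start) {
  assert(Start < Blocks.size());
  std::uint64_t End = Blocks[Start].Offset + Blocks[Start].Size;
  for (std::size_t I = Start + 1; I < Blocks.size(); ++I) {
    Blocks[I].Offset = alignTo(End, Blocks[I].Align);
    End = Blocks[I].Offset + Blocks[I].Size;
  }
}
```

If a relaxation grows block 1, calling `adjustOffsetsAfter(Blocks, 1)` moves block 2 and later blocks while block 1 stays where it was, which is the behavior the commit message describes as the intended one.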
|
Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
05da2fe5 |
| 13-Nov-2019 |
Reid Kleckner <rnk@google.com> |
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation.
I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
  recompiles  touches  affected_files  header
  342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
  314730      234      1345            llvm/include/llvm/InitializePasses.h
  307036      118      2602            llvm/include/llvm/ADT/APInt.h
  213049      59       3611            llvm/include/llvm/Support/MathExtras.h
  170422      47       3626            llvm/include/llvm/Support/Compiler.h
  162225      45       3605            llvm/include/llvm/ADT/Optional.h
  158319      63       2513            llvm/include/llvm/ADT/Triple.h
  140322      39       3598            llvm/include/llvm/ADT/StringRef.h
  137647      59       2333            llvm/include/llvm/Support/Error.h
  131619      73       1803            llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
|
#
18f805a7 |
| 27-Sep-2019 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Remove unneeded llvm:: scoping on Align types
llvm-svn: 373081
|