BranchRelaxation.cpp - OpenGrok history log for /llvm-project/llvm/lib/CodeGen/BranchRelaxation.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# 22c8b1d8	27-Sep-2024	Daniel Hoekwater <hoekwater@google.com>	[BranchRelaxation] Remove quadratic behavior in relaxation pass (#96250) Currently, we recompute block offsets after each relaxation. This causes the complexity to be O(n^2) in the number of instructions, inflating compile time. If we instead recompute block offsets after each iteration of the outer loop, the complexity is O(n). Recomputing offsets in the outer loop will cause some out-of-range branches to be missed in the inner loop, but they will be relaxed in the next iteration of the outer loop. This change may introduce unnecessary relaxations for an architecture where the relaxed branch is smaller than the unrelaxed branch, but AFAIK there is no such architecture.
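The loop structure described in the commit above can be illustrated with a toy model. This is only a sketch and not the code in BranchRelaxation.cpp: the `Block` struct, `computeOffsets`, `relaxAll`, and the `Range` and `Growth` parameters are all hypothetical names invented for illustration. Offsets are computed once per outer iteration, so a branch pushed out of range by an earlier relaxation in the same pass is only caught on the next pass.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy block: a size in bytes and at most one branch,
// placed at the end of the block. Target is a block index, or -1.
struct Block {
  unsigned Size;
  int Target;
  bool Relaxed;
};

// One linear walk over the blocks to assign start offsets.
std::vector<unsigned> computeOffsets(const std::vector<Block> &Blocks) {
  std::vector<unsigned> Offsets(Blocks.size());
  unsigned Off = 0;
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    Offsets[I] = Off;
    Off += Blocks[I].Size;
  }
  return Offsets;
}

// Relax every branch whose distance exceeds Range by growing its block by
// Growth bytes. Offsets are recomputed once per outer iteration only, so the
// inner loop works with possibly stale offsets. Returns the number of outer
// iterations that were needed.
unsigned relaxAll(std::vector<Block> &Blocks, unsigned Range, unsigned Growth) {
  unsigned Passes = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    ++Passes;
    const std::vector<unsigned> Offsets = computeOffsets(Blocks);
    for (std::size_t I = 0; I < Blocks.size(); ++I) {
      Block &B = Blocks[I];
      if (B.Target < 0 || B.Relaxed)
        continue;
      unsigned Src = Offsets[I] + B.Size;
      unsigned Dst = Offsets[static_cast<std::size_t>(B.Target)];
      unsigned Dist = Src > Dst ? Src - Dst : Dst - Src;
      if (Dist > Range) {
        B.Size += Growth; // the relaxed form is assumed to be larger
        B.Relaxed = true;
        Changed = true;
      }
    }
  }
  return Passes;
}
```

In this model a backward branch that spans a block relaxed earlier in the same pass is missed once and picked up in the following pass, which mirrors the trade-off the commit message describes.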
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 866ae69c	29-Jul-2023	Daniel Hoekwater <hoekwater@google.com>	[AArch64] [BranchRelaxation] Optimize for hot code size in AArch64 branch relaxation On AArch64, it is safe to let the linker handle relaxation of unconditional branches; in most cases, the destination is within range, and the linker doesn't need to do anything. If the linker does insert fixup code, it clobbers the x16 inter-procedural register, so x16 must be available across the branch before linking. If x16 isn't available, but some other register is, we can relax the branch either by spilling x16 OR using the free register for a manually-inserted indirect branch. This patch builds on D145211. While that patch is for correctness, this one is for performance of the common case. As noted in https://reviews.llvm.org/D145211#4537173, we can trust the linker to relax cross-section unconditional branches across which x16 is available. Programs that use machine function splitting care most about the performance of hot code at the expense of the performance of cold code, so we prioritize minimizing hot code size. Here's a breakdown of the cases:
	Hot -> Cold [x16 is free across the branch] Do nothing; let the linker relax the branch.
	Cold -> Hot [x16 is free across the branch] Do nothing; let the linker relax the branch.
	Hot -> Cold [x16 used across the branch, but there is a free register] Spill x16; let the linker relax the branch. Spilling requires fewer instructions than manually inserting an indirect branch.
	Cold -> Hot [x16 used across the branch, but there is a free register] Manually insert an indirect branch. Spilling would require adding a restore block in the hot section.
	Hot -> Cold [No free regs] Spill x16; let the linker relax the branch.
	Cold -> Hot [No free regs] Spill x16 and put the restore block at the end of the hot function; let the linker relax the branch.
	Ex:
	[Hot section]
	func.hot:
	  ... hot code...
	func.restore:
	  ... restore x16 ...
	  B func.hot
	[Cold section]
	func.cold:
	  ... spill x16 ...
	  B func.restore
	Putting the restore block at the end of the function instead of just before the destination increases the cost of executing the store, but it avoids putting cold code in the middle of hot code. Since the restore is very rarely taken, this is a worthwhile tradeoff. Differential Revision: https://reviews.llvm.org/D156767
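The case breakdown in the commit above amounts to a small decision function. The sketch below restates it in C++ for readability; the enum and function names (`Direction`, `Strategy`, `chooseStrategy`) are hypothetical and do not come from the LLVM sources, and the real pass makes this decision through target hooks rather than through a single function like this.

```cpp
// Direction of a cross-section unconditional branch (hypothetical names).
enum class Direction { HotToCold, ColdToHot };

// Relaxation strategies listed in the commit message (hypothetical names).
enum class Strategy {
  LetLinkerRelax,              // x16 is free: do nothing in the compiler
  SpillX16,                    // spill x16, then let the linker relax
  IndirectBranchViaFreeReg,    // manually insert an indirect branch
  SpillX16RestoreAtEndOfHot    // spill x16; restore block at end of hot function
};

// Pick a strategy from the branch direction and register availability.
Strategy chooseStrategy(Direction Dir, bool X16Free, bool OtherRegFree) {
  if (X16Free)
    return Strategy::LetLinkerRelax;
  if (Dir == Direction::HotToCold)
    return Strategy::SpillX16; // same choice with or without a free register
  return OtherRegFree ? Strategy::IndirectBranchViaFreeReg
                      : Strategy::SpillX16RestoreAtEndOfHot;
}
```

Every branch of the function favors keeping extra instructions out of the hot section, which is the stated priority of the patch.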
# e223e456	21-Aug-2023	Daniel Hoekwater <hoekwater@google.com>	Reland "[AArch64][CodeGen] Avoid inverting hot branches during relaxation" This is a reland of 46d2d7599d9ed5e68fb53e910feb10d47ee2667b, which was reverted because it broke the build https://lab.llvm.org/buildbot/#/builders/21/builds/78779. However, this buildbot is spuriously broken due to Flang::underscoring.f90 being nondeterministic.
# 0303137b	21-Aug-2023	Daniel Hoekwater <hoekwater@google.com>	Revert "[AArch64][CodeGen] Avoid inverting hot branches during relaxation" This reverts commit 46d2d7599d9ed5e68fb53e910feb10d47ee2667b. Breaks build https://lab.llvm.org/buildbot/#/builders/21/builds/78779
# 46d2d759	01-Aug-2023	Daniel Hoekwater <hoekwater@google.com>	[AArch64][CodeGen] Avoid inverting hot branches during relaxation Current behavior for relaxing out-of-range conditional branches is to invert the conditional and insert a fallthrough unconditional branch to the original destination. This approach biases the branch predictor in the wrong direction, which can degrade performance. Machine function splitting introduces many rarely-taken cross-section conditional branches, which are improperly relaxed. Avoid inverting these branches; instead, retarget them to trampolines at the end of the function. Doing so increases the runtime cost of jumping to cold code but eliminates the misprediction cost of jumping to hot code. Differential Revision: https://reviews.llvm.org/D156837
Revision tags: llvmorg-18-init
# d7bca8e4	30-Jun-2023	Daniel Hoekwater <hoekwater@google.com>	[AArch64] Relax cross-section branches Because the code layout is not known during compilation, the distance of cross-section jumps is not knowable at compile-time. Because of this, we should assume that any cross-sectional jumps are out of range. This assumption is necessary for machine function splitting on AArch64, which introduces cross-section branches in the middle of functions. The linker relaxes out-of-range unconditional branches, but it clobbers X16 to do so; it doesn't relax conditional branches, which must be manually relaxed by the compiler. Differential Revision: https://reviews.llvm.org/D145211
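The assumption stated in the commit above (any cross-section branch is treated as out of range) can be sketched as a range check that bails out early when the source and destination live in different sections. The `BlockInfo` struct, its fields, and `isBranchInRange` are hypothetical names chosen for this illustration, and `MaxDisp` stands in for whatever displacement limit the target's branch encoding imposes.

```cpp
#include <cstdint>

// Hypothetical per-block layout info: which section the block is placed in
// and its byte offset within that section.
struct BlockInfo {
  int SectionID;
  std::int64_t Offset;
};

// Conservative range check: offsets in different sections are not comparable
// at compile time, so such a branch is always reported as out of range.
bool isBranchInRange(const BlockInfo &Src, const BlockInfo &Dst,
                     std::int64_t MaxDisp) {
  if (Src.SectionID != Dst.SectionID)
    return false;
  std::int64_t Delta = Dst.Offset - Src.Offset;
  if (Delta < 0)
    Delta = -Delta;
  return Delta <= MaxDisp;
}
```

Reporting a cross-section branch as out of range is always safe under this model; the cost is that some branches that would have been in range after linking get relaxed anyway.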
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# 8bf7f86d	17-Apr-2023	Akshay Khadse <akshayskhadse@gmail.com>	Fix uninitialized pointer members in CodeGen This change initializes the members TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# 03bb277f	31-Jan-2023	Philip Reames <preames@rivosinc.com>	[BranchRelaxation] Strengthen post condition assertions The whole point of this pass is to rewrite branches so that branches are in bounds. We should assert that we succeeded rather than just that we kept our internal data structure in sync. Differential Revision: https://reviews.llvm.org/D142778
Revision tags: llvmorg-16.0.0-rc1
# 36244914	27-Jan-2023	Philip Reames <preames@rivosinc.com>	Revert "[BranchRelaxation] Move faulting_op check into callee [nfc]" This reverts commit c549da959b81902789a17918c5b95d4449e6fdfa. Per buildbots, this was not NFC.
# c549da95	27-Jan-2023	Philip Reames <preames@rivosinc.com>	[BranchRelaxation] Move faulting_op check into callee [nfc] Mostly to remove a special case from an upcoming patch.
Revision tags: llvmorg-17-init
# 5073a622	17-Jan-2023	Anshil Gandhi <gandhi21299@gmail.com>	[MachineBasicBlock] Explicit FT branching param Introduce a parameter in getFallThrough() to optionally allow returning the fall through basic block in spite of an explicit branch instruction to it. This parameter is set to false by default. Introduce getLogicalFallThrough() which calls getFallThrough(false) to obtain the block while avoiding insertion of a jump instruction to its immediate successor. This patch also reverts the changes made by D134557 and solves the case where a jump is inserted after another jump (branch-relax-no-terminators.mir). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D140790
Revision tags: llvmorg-15.0.7
# 010a8f7a	01-Dec-2022	ZHU Zijia <piggynl@outlook.com>	[CodeGen] Fix restore blocks' BasicBlock information in branch relaxation In branch relaxation pass, restore blocks are created and placed before the jump destination if indirect branches are required. For example:
	  foo
	  sd s11, 0(sp)
	  jump .restore, s11
	  bar
	  bar
	  bar
	  j .dest
	.restore:
	  ld s11, 0(sp)
	.dest:
	  baz
	The BasicBlock information of the restore MachineBasicBlock should be identical to the dest MachineBasicBlock. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D131863
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# d383adec	14-Oct-2022	Anshil Gandhi <Anshil.Gandhi@amd.com>	[BranchRelaxation] Fall through only if block has no unconditional branches Prior to inserting an unconditional branch from X to its fall through basic block, check if X has any terminators to avoid inserting additional branches. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134557
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 01be9be2	28-Mar-2022	serge-sans-paille <sguelton@redhat.com>	Cleanup includes: final pass Cleanup a few extra files, this closes the work on libLLVM dependencies on my side. Impact on libLLVM preprocessed output: -35876 lines Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D122576
# 989f1c72	15-Mar-2022	serge-sans-paille <sguelton@redhat.com>	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b	10-Mar-2022	Nico Weber <thakis@chromium.org>	Revert "Cleanup codegen includes" This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
# 7f230fee	07-Mar-2022	serge-sans-paille <sguelton@redhat.com>	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# d5b73a70	23-Nov-2021	Kazu Hirata <kazu@google.com>	[llvm] Use range-based for loops (NFC)
# af2ae2cf	05-Nov-2021	Michael Liao <michael.hliao@gmail.com>	[BranchRelaxation] Fix warning on unused variable. NFC.
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# e6a4ba3a	16-Jul-2021	Michael Liao <michael.hliao@gmail.com>	[amdgpu] Handle the case where there is no scavenged register.
	- When an unconditional branch is expanded into an indirect branch, if there is no scavenged register, an SGPR pair needs spilling to enable the destination PC calculation. In addition, before jumping into the destination, that clobbered SGPR pair needs restoring.
	- As SGPR cannot be spilled to or restored from memory directly, the spilling/restoring of that SGPR pair reuses the regular SGPR spilling support but without spilling it into memory. As those spilling and restoring points are fully controlled, we only need to spill that SGPR into the temporary VGPR, which needs spilling into its emergency slot.
	- The target-specific hook is revised to take an additional restore block, where the restoring code is filled. After that, the relaxation will place that restore block directly before the destination block and insert an unconditional branch in any fall-through block into the destination block.
	Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D106449
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# b04c181e	17-Sep-2020	Philip Reames <listmail@philipreames.com>	[AArch64] Enable implicit null check transformation This change enables the generic implicit null transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support: An implicit null check is the use of a signal handler to catch a null pointer access and redirect control to a handler. Specifically, it's replacing an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control w/appropriate metadata. FAULTING_OP is used to wrap the faulting instruction. It is modelled as being a conditional branch to reflect the fact it can transfer control in the CFG. FAULTING_OP does not need to be an analyzable branch to achieve its purpose. (Or at least, that's the x86 model. I find this slightly questionable.) When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction. As can be seen in the test changes, currently the AArch64 backend does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code causes BranchFolding to crash. I haven't yet figured out if it's a latent bug in BranchFolding, or something I'm doing wrong.) Differential Revision: https://reviews.llvm.org/D87851
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3
# 1978309d	19-Feb-2020	James Y Knight <jyknight@google.com>	MachineBasicBlock::updateTerminator now requires an explicit layout successor. Previously, it tried to infer the correct destination block from the successor list, but this is a rather tricky prospect, given the existence of successors that occur mid-block, such as invoke, and potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in particular, would be problematic, because its successor blocks are not distinct from "normal" successors, as EHPads are.) Instead, require the caller to pass in the expected fallthrough successor explicitly. In most callers, the correct block is immediately clear. But, in MachineBlockPlacement, we do need to record the original ordering, before starting to reorder blocks. Unfortunately, the goal of decoupling the behavior of end-of-block jumps from the successor list has not been fully accomplished in this patch, as there is currently no other way to determine whether a block is intended to fall-through, or end as unreachable. Further work is needed there. Differential Revision: https://reviews.llvm.org/D79605
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1
# 886d2c2c	19-Jan-2020	Fangrui Song <i@maskray.me>	[BranchRelaxation] Simplify offset computation and fix a bug in adjustBlockOffsets() If Start!=0, adjustBlockOffsets() may unnecessarily adjust the offset of Start. There is no correctness issue, but it can create more block splits.
Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 05da2fe5	13-Nov-2019	Reid Kleckner <rnk@google.com>	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
	recompiles  touches  affected_files  header
	342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
	314730      234      1345            llvm/include/llvm/InitializePasses.h
	307036      118      2602            llvm/include/llvm/ADT/APInt.h
	213049      59       3611            llvm/include/llvm/Support/MathExtras.h
	170422      47       3626            llvm/include/llvm/Support/Compiler.h
	162225      45       3605            llvm/include/llvm/ADT/Optional.h
	158319      63       2513            llvm/include/llvm/ADT/Triple.h
	140322      39       3598            llvm/include/llvm/ADT/StringRef.h
	137647      59       2333            llvm/include/llvm/Support/Error.h
	131619      73       1803            llvm/include/llvm/Support/FileSystem.h
	Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
# 18f805a7	27-Sep-2019	Guillaume Chatelet <gchatelet@google.com>	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types llvm-svn: 373081