History log of /llvm-project/llvm/lib/CodeGen/BranchRelaxation.cpp (Results 1 – 25 of 57)
Revision    Date    Author    Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# 22c8b1d8 27-Sep-2024 Daniel Hoekwater <hoekwater@google.com>

[BranchRelaxation] Remove quadratic behavior in relaxation pass (#96250)

Currently, we recompute block offsets after each relaxation. This causes
the complexity to be O(n^2) in the number of instructions, inflating
compile time.

If we instead recompute block offsets after each iteration of the outer
loop, the complexity is O(n). Recomputing offsets in the outer loop will
cause some out-of-range branches to be missed in the inner loop, but
they will be relaxed in the next iteration of the outer loop.

This change may introduce unnecessary relaxations for an architecture
where the relaxed branch is smaller than the unrelaxed branch, but AFAIK
there is no such architecture.
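
A minimal, self-contained C++ sketch of the loop structure this describes (the data structures and function names are invented for illustration, not the pass's actual code): offsets are swept once per outer iteration, and any branch pushed out of range by an earlier relaxation in the same sweep is simply caught on the next iteration.

#include <cstdint>
#include <vector>

struct Block {
  uint64_t Offset = 0;   // start offset of the block
  uint64_t Size = 0;     // current size, including any relaxations so far
  int BranchTarget = -1; // index of the branch's target block, -1 if none
  bool Relaxed = false;
};

// O(n) sweep: recompute every block offset from scratch.
static void recomputeOffsets(std::vector<Block> &Blocks) {
  uint64_t Offset = 0;
  for (Block &B : Blocks) {
    B.Offset = Offset;
    Offset += B.Size;
  }
}

// Relax every currently out-of-range branch once; offsets are refreshed only
// by the caller, so branches missed because of stale offsets are picked up in
// the next outer iteration.
static bool relaxOnce(std::vector<Block> &Blocks, uint64_t MaxDisplacement,
                      uint64_t RelaxedBranchGrowth) {
  bool Changed = false;
  for (Block &B : Blocks) {
    if (B.BranchTarget < 0 || B.Relaxed)
      continue;
    const Block &Target = Blocks[B.BranchTarget];
    uint64_t Dist = Target.Offset > B.Offset ? Target.Offset - B.Offset
                                             : B.Offset - Target.Offset;
    if (Dist > MaxDisplacement) {
      B.Size += RelaxedBranchGrowth; // the block grows; offsets go stale
      B.Relaxed = true;
      Changed = true;
    }
  }
  return Changed;
}

int main() {
  std::vector<Block> Blocks(4); // ... populate with real sizes and branches ...
  do {
    recomputeOffsets(Blocks); // once per outer iteration keeps the pass ~O(n)
  } while (relaxOnce(Blocks, /*MaxDisplacement=*/1 << 20,
                     /*RelaxedBranchGrowth=*/4));
}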



Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 866ae69c 29-Jul-2023 Daniel Hoekwater <hoekwater@google.com>

[AArch64] [BranchRelaxation] Optimize for hot code size in AArch64 branch relaxation

On AArch64, it is safe to let the linker handle relaxation of
unconditional branches; in most cases, the destination is within range,
and the linker doesn't need to do anything. If the linker does insert
fixup code, it clobbers the x16 inter-procedural register, so x16 must
be available across the branch before linking. If x16 isn't available,
but some other register is, we can relax the branch either by spilling
x16 OR using the free register for a manually-inserted indirect branch.

This patch builds on D145211. While that patch is for correctness, this
one is for performance of the common case. As noted in
https://reviews.llvm.org/D145211#4537173, we can trust the linker to
relax cross-section unconditional branches across which x16 is
available.

Programs that use machine function splitting care most about the
performance of hot code at the expense of the performance of cold code,
so we prioritize minimizing hot code size.

Here's a breakdown of the cases:

Hot -> Cold [x16 is free across the branch]
    Do nothing; let the linker relax the branch.

Cold -> Hot [x16 is free across the branch]
    Do nothing; let the linker relax the branch.

Hot -> Cold [x16 used across the branch, but there is a free register]
    Spill x16; let the linker relax the branch.

    Spilling requires fewer instructions than manually inserting an
    indirect branch.

Cold -> Hot [x16 used across the branch, but there is a free register]
    Manually insert an indirect branch.

    Spilling would require adding a restore block in the hot section.

Hot -> Cold [No free regs]
    Spill x16; let the linker relax the branch.

Cold -> Hot [No free regs]
    Spill x16 and put the restore block at the end of the hot function; let
    the linker relax the branch.
    Ex:
    [Hot section]
    func.hot:
        ... hot code ...
    func.restore:
        ... restore x16 ...
        B func.hot

    [Cold section]
    func.cold:
        ... spill x16 ...
        B func.restore

Putting the restore block at the end of the function instead of
just before the destination increases the cost of executing the
store, but it avoids putting cold code in the middle of hot code.
Since the restore is very rarely taken, this is a worthwhile
tradeoff.

Differential Revision: https://reviews.llvm.org/D156767
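
The case breakdown above can be read as a small decision function; the following sketch uses invented names (Relaxation, pickRelaxation) purely to restate the table in code and is not the pass's actual API.

// Which relaxation to apply to an out-of-range unconditional branch when
// machine function splitting is in use (illustrative only).
enum class Relaxation {
  LeaveForLinker,       // linker relaxes; x16 is free, so fixup code is safe
  SpillX16,             // spill x16 so the linker's fixup may clobber it
  IndirectBranch,       // manually insert an indirect branch via a free reg
  SpillX16RestoreAtEnd, // spill x16; restore block at the end of the hot code
};

Relaxation pickRelaxation(bool BranchIsHotToCold, bool X16Free,
                          bool HaveOtherFreeReg) {
  if (X16Free)
    return Relaxation::LeaveForLinker;
  if (HaveOtherFreeReg)
    return BranchIsHotToCold ? Relaxation::SpillX16
                             : Relaxation::IndirectBranch;
  return BranchIsHotToCold ? Relaxation::SpillX16
                           : Relaxation::SpillX16RestoreAtEnd;
}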



# e223e456 21-Aug-2023 Daniel Hoekwater <hoekwater@google.com>

Reland "[AArch64][CodeGen] Avoid inverting hot branches during relaxation""

This is a reland of 46d2d7599d9ed5e68fb53e910feb10d47ee2667b, which was
reverted because of breaking build
https://lab.llv

Reland "[AArch64][CodeGen] Avoid inverting hot branches during relaxation""

This is a reland of 46d2d7599d9ed5e68fb53e910feb10d47ee2667b, which was
reverted because it broke build
https://lab.llvm.org/buildbot/#/builders/21/builds/78779. However, that
buildbot failure is spurious: Flang::underscoring.f90 is
nondeterministic.



# 0303137b 21-Aug-2023 Daniel Hoekwater <hoekwater@google.com>

Revert "[AArch64][CodeGen] Avoid inverting hot branches during relaxation"

This reverts commit 46d2d7599d9ed5e68fb53e910feb10d47ee2667b.
Breaks build https://lab.llvm.org/buildbot/#/builders/21/buil

Revert "[AArch64][CodeGen] Avoid inverting hot branches during relaxation"

This reverts commit 46d2d7599d9ed5e68fb53e910feb10d47ee2667b.
Breaks build https://lab.llvm.org/buildbot/#/builders/21/builds/78779



# 46d2d759 01-Aug-2023 Daniel Hoekwater <hoekwater@google.com>

[AArch64][CodeGen] Avoid inverting hot branches during relaxation

Current behavior for relaxing out-of-range conditional branches
is to invert the conditional and insert a fallthrough unconditional
branch to the original destination. This approach biases the branch
predictor in the wrong direction, which can degrade performance.

Machine function splitting introduces many rarely-taken cross-section
conditional branches, which are improperly relaxed. Avoid inverting
these branches; instead, retarget them to trampolines at the end of the
function. Doing so increases the runtime cost of jumping to cold code
but eliminates the misprediction cost of jumping to hot code.

Differential Revision: https://reviews.llvm.org/D156837
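
As a sketch of the two strategies (AArch64-flavored pseudo-assembly in the comment, invented C++ names below; neither is the pass's actual code):

// Relaxing an out-of-range "b.cond far_target":
//
//   Invert (previous behavior):        Trampoline (this change, for
//                                      rarely-taken branches into cold code):
//     b.!cond fallthrough                b.cond  trampoline
//     b       far_target                 ...
//   fallthrough:                       trampoline:        ; end of function
//     ...                                b       far_target
//
// Inversion turns a rarely-taken branch into a usually-taken one, biasing the
// predictor in the wrong direction on the hot path; the trampoline keeps the
// hot-path branch direction unchanged and pays an extra jump only on the rare
// trip into cold code.
enum class CondRelaxation { InvertAndBranch, RetargetToTrampoline };

CondRelaxation pickCondRelaxation(bool TargetIsInColdSection) {
  return TargetIsInColdSection ? CondRelaxation::RetargetToTrampoline
                               : CondRelaxation::InvertAndBranch;
}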



Revision tags: llvmorg-18-init
# d7bca8e4 30-Jun-2023 Daniel Hoekwater <hoekwater@google.com>

[AArch64] Relax cross-section branches

Because the code layout is not known during compilation, the distance of
cross-section jumps is not knowable at compile time. We should therefore
assume that any cross-section jump is out of range. This
assumption is necessary for machine function splitting on AArch64, which
introduces cross-section branches in the middle of functions. The linker
relaxes out-of-range unconditional branches, but it clobbers X16 to do
so; it doesn't relax conditional branches, which must be manually
relaxed by the compiler.

Differential Revision: https://reviews.llvm.org/D145211
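
In other words, the range check has to be pessimistic across sections; a minimal sketch of that predicate (invented names, not the pass's API):

#include <cstdint>
#include <cstdlib>

// A branch needs relaxation if it is provably or potentially out of range.
bool needsRelaxation(unsigned SrcSectionID, unsigned DstSectionID,
                     int64_t Displacement, int64_t MaxDisplacement) {
  if (SrcSectionID != DstSectionID)
    return true; // layout unknown until link time: assume out of range
  return std::llabs(Displacement) > MaxDisplacement;
}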



Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# 8bf7f86d 17-Apr-2023 Akshay Khadse <akshayskhadse@gmail.com>

Fix uninitialized pointer members in CodeGen

This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148303



Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# 03bb277f 31-Jan-2023 Philip Reames <preames@rivosinc.com>

[BranchRelaxation] Strengthen post condition assertions

The whole point of this pass is to rewrite branches so that they are in bounds. We should assert that we succeeded, rather than just that we kept our internal data structures in sync.

Differential Revision: https://reviews.llvm.org/D142778
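
A hedged sketch of what such a strengthened post-condition can look like (invented helper types, not the pass's actual verification code): every surviving branch must actually reach its target, not merely agree with the pass's cached bookkeeping.

#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct BranchInfo {
  int64_t BranchOffset;    // offset of the branch instruction
  int64_t TargetOffset;    // offset of the destination block
  int64_t MaxDisplacement; // reach of this branch's encoding
};

void verifyBranchesInBounds(const std::vector<BranchInfo> &Branches) {
  for (const BranchInfo &BI : Branches) {
    [[maybe_unused]] int64_t Dist = BI.TargetOffset - BI.BranchOffset;
    assert(std::llabs(Dist) <= BI.MaxDisplacement &&
           "branch relaxation left an out-of-range branch behind");
  }
}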



Revision tags: llvmorg-16.0.0-rc1
# 36244914 27-Jan-2023 Philip Reames <preames@rivosinc.com>

Revert "[BranchRelaxation] Move faulting_op check into callee [nfc]"

This reverts commit c549da959b81902789a17918c5b95d4449e6fdfa. Per buildbots, this was not NFC.


# c549da95 27-Jan-2023 Philip Reames <preames@rivosinc.com>

[BranchRelaxation] Move faulting_op check into callee [nfc]

Mostly to remove a special case from an upcoming patch.


Revision tags: llvmorg-17-init
# 5073a622 17-Jan-2023 Anshil Gandhi <gandhi21299@gmail.com>

[MachineBasicBlock] Explicit FT branching param

Introduce a parameter in getFallThrough() to optionally
allow returning the fall-through basic block even when there is
an explicit branch instruction to it. This parameter is
set to false by default.

Introduce getLogicalFallThrough() which calls
getFallThrough(false) to obtain the block while avoiding
insertion of a jump instruction to its immediate successor.

This patch also reverts the changes made by D134557 and
solves the case where a jump is inserted after another jump
(branch-relax-no-terminators.mir).

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D140790
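
A simplified, standalone illustration of the distinction (invented types and field names, not MachineBasicBlock's real implementation), written so that the logical query matches the getFallThrough(false) call described above:

struct SimpleBlock {
  SimpleBlock *NextInLayout = nullptr;       // block laid out right after this one
  SimpleBlock *ExplicitJumpTarget = nullptr; // target of a trailing jump, if any
  bool EndsWithReturnOrIndirect = false;

  SimpleBlock *getFallThrough(bool TreatExplicitJumpAsBranch) {
    if (EndsWithReturnOrIndirect || !NextInLayout)
      return nullptr;
    if (!ExplicitJumpTarget)
      return NextInLayout; // genuinely falls off the end
    if (!TreatExplicitJumpAsBranch && ExplicitJumpTarget == NextInLayout)
      return NextInLayout; // reaches the next block only via an explicit jump
    return nullptr;
  }

  // "Logical" fall-through: an explicit jump to the immediate successor still
  // counts, so a caller such as branch relaxation does not append a second,
  // redundant jump after the existing one.
  SimpleBlock *getLogicalFallThrough() { return getFallThrough(false); }
};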



Revision tags: llvmorg-15.0.7
# 010a8f7a 01-Dec-2022 ZHU Zijia <piggynl@outlook.com>

[CodeGen] Fix restore blocks' BasicBlock information in branch relaxation

In branch relaxation pass, restore blocks are created and placed before
the jump destination if indirect branches are required. For example:

    foo
    sd   s11, 0(sp)
    jump .restore, s11
    bar
    bar
    bar
    j    .dest
.restore:
    ld   s11, 0(sp)
.dest:
    baz

The BasicBlock information of the restore MachineBasicBlock should be
identical to the dest MachineBasicBlock.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D131863



Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# d383adec 14-Oct-2022 Anshil Gandhi <Anshil.Gandhi@amd.com>

[BranchRelaxation] Fall through only if block has no unconditional branches

Prior to inserting an unconditional branch from X to its
fall through basic block, check if X has any terminators to
avoid inserting additional branches.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134557
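
A minimal sketch of that guard (invented types, not the pass's actual code): the fall-through branch is only materialized when the block does not already end in a terminator.

#include <vector>

struct Inst {
  bool IsTerminator = false;
};

struct Blk {
  std::vector<Inst> Insts;
  bool endsWithTerminator() const {
    return !Insts.empty() && Insts.back().IsTerminator;
  }
};

// Append an unconditional branch to the layout successor only if the block
// has no terminator yet; otherwise we would stack a second branch after an
// existing one.
void ensureFallThroughBranch(Blk &B) {
  if (B.endsWithTerminator())
    return;
  B.Insts.push_back(Inst{/*IsTerminator=*/true}); // the new unconditional branch
}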



Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 01be9be2 28-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup includes: final pass

Cleanup a few extra files, this closes the work on libLLVM dependencies on my
side.

Impact on libLLVM preprocessed output: -35876 lines

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576



# 989f1c72 15-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681



Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b 10-Mar-2022 Nico Weber <thakis@chromium.org>

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https:/

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169



# 7f230fee 07-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

after: 1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# d5b73a70 23-Nov-2021 Kazu Hirata <kazu@google.com>

[llvm] Use range-based for loops (NFC)


# af2ae2cf 05-Nov-2021 Michael Liao <michael.hliao@gmail.com>

[BranchRelaxation] Fix warning on unused variable. NFC.


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# e6a4ba3a 16-Jul-2021 Michael Liao <michael.hliao@gmail.com>

[amdgpu] Handle the case where there is no scavenged register.

- When an unconditional branch is expanded into an indirect branch, if
there is no scavenged register, an SGPR pair needs spilling to enable
the destination PC calculation. In addition, before jumping into the
destination, that clobbered SGPR pair needs restoring.
- As an SGPR cannot be spilled to or restored from memory directly, the
spilling/restoring of that SGPR pair reuses the regular SGPR spilling
support, but without spilling it into memory. As those spill and
restore points are fully controlled, we only need to spill that SGPR
into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take an additional restore block,
where the restoring code is filled in. After that, the relaxation will
place that restore block directly before the destination block and
insert, in any fall-through block, an unconditional branch into the
destination block.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106449



Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# b04c181e 17-Sep-2020 Philip Reames <listmail@philipreames.com>

[AArch64] Enable implicit null check transformation

This change enables the generic implicit null transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support:

An implicit null check uses a signal handler to catch a null-pointer dereference and redirect control to a handler. Specifically, it replaces an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control w/appropriate metadata.
FAULTING_OP is used to wrap the faulting instruction. It is modelled as a conditional branch to reflect the fact that it can transfer control in the CFG.
FAULTING_OP does not need to be an analyzable branch to achieve its purpose. (Or at least, that's the x86 model. I find this slightly questionable.)
When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction.

As can be seen in the test changes, the AArch64 backend currently does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code caused BranchFolding to crash. I haven't yet figured out whether it's a latent bug in BranchFolding or something I'm doing wrong.)

Differential Revision: https://reviews.llvm.org/D87851



Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3
# 1978309d 19-Feb-2020 James Y Knight <jyknight@google.com>

MachineBasicBlock::updateTerminator now requires an explicit layout successor.

Previously, it tried to infer the correct destination block from the
successor list, but this is a rather tricky prospect, given the
existence of successors that occur mid-block, such as invoke, and
potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in
particular would be problematic, because its successor blocks are not
distinct from "normal" successors, as EHPads are.)

Instead, require the caller to pass in the expected fallthrough
successor explicitly. In most callers, the correct block is
immediately clear. But, in MachineBlockPlacement, we do need to record
the original ordering, before starting to reorder blocks.

Unfortunately, the goal of decoupling the behavior of end-of-block
jumps from the successor list has not been fully accomplished in this
patch, as there is currently no other way to determine whether a block
is intended to fall through or end as unreachable. Further work is
needed there.

Differential Revision: https://reviews.llvm.org/D79605



Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1
# 886d2c2c 19-Jan-2020 Fangrui Song <i@maskray.me>

[BranchRelaxation] Simplify offset computation and fix a bug in adjustBlockOffsets()

If Start!=0, adjustBlockOffsets() may unnecessarily adjust the offset of
Start. There is no correctness issue, but it can create more block
splits.
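
A sketch of the idea with invented, simplified data structures (the real function works on MachineBasicBlocks and also has to account for block alignment, which is omitted here): re-laying-out from Start should begin at the end of Start rather than re-assigning Start's own offset.

#include <cstdint>
#include <vector>

struct BlockInfo {
  uint64_t Offset = 0;
  uint64_t Size = 0;
};

void adjustBlockOffsets(std::vector<BlockInfo> &Blocks, size_t Start) {
  if (Start >= Blocks.size())
    return;
  // Leave Blocks[Start] untouched; only the blocks after it are re-laid-out.
  uint64_t Offset = Blocks[Start].Offset + Blocks[Start].Size;
  for (size_t I = Start + 1; I < Blocks.size(); ++I) {
    Blocks[I].Offset = Offset;
    Offset += Blocks[I].Size;
  }
}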



Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 05da2fe5 13-Nov-2019 Reid Kleckner <rnk@google.com>

Sink all InitializePasses.h includes

This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211



# 18f805a7 27-Sep-2019 Guillaume Chatelet <gchatelet@google.com>

[Alignment][NFC] Remove unneeded llvm:: scoping on Align types

llvm-svn: 373081

