History log of /llvm-project/llvm/lib/CodeGen/TailDuplicator.cpp (Results 1 – 25 of 105)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init
# 0d019081 23-Jan-2025 Florian Hahn <flo@fhahn.com>

[TailDup] Allow large number of predecessors/successors without phis. (#116072)

This adjusts the threshold logic added in #78582 to only trigger for
cases where there are actually phis to duplicate

[TailDup] Allow large number of predecessors/successors without phis. (#116072)

This adjusts the threshold logic added in #78582 to only trigger for
cases where there are actually phis to duplicate in either TailBB or in
one of the successors.

In cases there are no phis, we only have to pay the cost of extra edges,
but have no explosion in PHI related instructions.

This improves performance of Python on some inputs by 2-3% on Apple
Silicon CPUs.

PR: https://github.com/llvm/llvm-project/pull/116072

show more ...


Revision tags: llvmorg-19.1.7
# 19032bfe 13-Jan-2025 Daniel Paoliello <danpao@microsoft.com>

[aarch64][win] Update Called Globals info when updating Call Site info (#122762)

Fixes the "use after poison" issue introduced by #121516 (see
<https://github.com/llvm/llvm-project/pull/121516#issue

[aarch64][win] Update Called Globals info when updating Call Site info (#122762)

Fixes the "use after poison" issue introduced by #121516 (see
<https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>).

The root cause of this issue is that #121516 introduced "Called Global"
information for call instructions modeling how "Call Site" info is
stored in the machine function, HOWEVER it didn't copy the
copy/move/erase operations for call site information.

The fix is to rename and update the existing copy/move/erase functions
so they also take care of Called Global info.

show more ...


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 735ab61a 13-Nov-2024 Kazu Hirata <kazu@google.com>

[CodeGen] Remove unused includes (NFC) (#115996)

Identified with misc-include-cleaner.


Revision tags: llvmorg-19.1.3
# 6ab26eab 28-Oct-2024 Ellis Hoag <ellis.sparky.hoag@gmail.com>

Check hasOptSize() in shouldOptimizeForSize() (#112626)


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 0f0cfcff 19-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Avoid some references to MachineFunction's getMMI (#99652)

MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used

CodeGen: Avoid some references to MachineFunction's getMMI (#99652)

MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.

show more ...


# 559ea40d 27-Jun-2024 Kazu Hirata <kazu@google.com>

[CodeGen] Use range-based for loops (NFC) (#96855)


# dae061f1 26-Jun-2024 Kazu Hirata <kazu@google.com>

[CodeGen] Use range-based for loops (NFC) (#96777)


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5
# 86a78284 17-Apr-2024 Quentin Dian <dianqk@dianqk.net>

[TailDuplicator] Add maximum predecessors and successors to consider tail duplicating blocks (#78582)

Fixes #78578.

Duplicating a BB which has both multiple predecessors and successors
will resu

[TailDuplicator] Add maximum predecessors and successors to consider tail duplicating blocks (#78582)

Fixes #78578.

Duplicating a BB which has both multiple predecessors and successors
will result in a complex CFG and also may cause huge amount of PHI
nodes. See
https://github.com/llvm/llvm-project/issues/78578#issuecomment-1962363580
for a detailed description of the limit.

show more ...


Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3
# 4b2381a5 02-May-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Make use of MachineBasicBlock::phis. NFC.


# f7bee657 21-Apr-2023 Mikael Holmen <mikael.holmen@ericsson.com>

[TailDuplicator] Don't constrain register classes due to debug instructions

If cloning a DBG_VALUE instruction, register uses in that instruction could
lead to constraining of a virtual register tha

[TailDuplicator] Don't constrain register classes due to debug instructions

If cloning a DBG_VALUE instruction, register uses in that instruction could
lead to constraining of a virtual register that would not happen if the
DBG_VALUE was not present at all. This lead to different code with/without
debug info.

Now we only do that register class constraining if we dealing with a non
debug instruction.

Differential Revision: https://reviews.llvm.org/D149146

show more ...


Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# df947feb 21-Dec-2022 Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>

[TailDuplicator] Fix old bugs in TailDuplicator::duplicateInstruction

This patch is updating TailDuplicator::duplicateInstruction to fix
some old bugs that has been found with an out-of-tree target.

[TailDuplicator] Fix old bugs in TailDuplicator::duplicateInstruction

This patch is updating TailDuplicator::duplicateInstruction to fix
some old bugs that has been found with an out-of-tree target. There
are three different things being addressed:

1) In one situation two subregister indices are combined using the
composeSubRegIndices helper. But the order in which those indices
are combined has been incorrect. For this problem I managed to
create some kind of reproducer using AArch64 (see the test case
touched in this patch).

2) Another fault was found in the else branch for the above situation.
Here we do not compose the two subregisters, instead we insert a
COPY to replace the PHI, and then the subreg index in the using
MO remains. Thus, the virtual register created for the COPY should
always match with the size of the original register. Therefore the
optimization that "constrain" (or rather relax) the register
class by looking at the instruction desc must be limited to the
situation when there is no subregister access. Otherwise we create
a vreg with the wrong class.

3) Last problem addressed in this patch is that when a new register
class is picked by looking at the instruction desc, then it
isn't guaranteed that the isAllocatable property is set for that
class. So one need to use the getAllocatableClass helper to find
a subclass that is allocatable before using createVirualRegister,
or alternatively (as in this patch) just use the OrigRC instead
of relaxing the register class for the COPY destination.

Haven't been able to find any in-tree reproducers for problem 2 and 3.
The tricky part is to find a target that has register hierarchies that
match with the problem to trigger those code paths (and with subreg
accesses involved).

Differential Revision: https://reviews.llvm.org/D140496

show more ...


# e72ca520 13-Jan-2023 Craig Topper <craig.topper@sifive.com>

[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC

Use isPhysical/isVirtual methods.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D141715


# 5b24d421 09-Jan-2023 Tim Northover <tnorthover@apple.com>

TailDuplication: do not remove trivial PHIs from addr-taken blocks.

Unlike an anonymous block, it will not be removed even though we've resolved
all valid paths to get here. So removing a PHI can le

TailDuplication: do not remove trivial PHIs from addr-taken blocks.

Unlike an anonymous block, it will not be removed even though we've resolved
all valid paths to get here. So removing a PHI can leave vregs with no
definition, violating SSA. Instead, this converts it to an IMPLICIT_DEF.

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0
# d7474bef 31-Aug-2022 Nick Desaulniers <ndesaulniers@google.com>

[llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets

This fixes a crash observed after
https://reviews.llvm.org/D129997.

Similar to D88823.

Reviewed By: efriedma

Differential Revisio

[llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets

This fixes a crash observed after
https://reviews.llvm.org/D129997.

Similar to D88823.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D130127

show more ...


Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2
# 4146c175 02-Aug-2022 Mircea Trofin <mtrofin@google.com>

[nfc] Remove unused parameter in TailDuplicator::duplicateSimpleBB

Differential Revision: https://reviews.llvm.org/D131008


Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init
# 9e6d1f4b 17-Jul-2022 Kazu Hirata <kazu@google.com>

[CodeGen] Qualify auto variables in for loops (NFC)


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# b4ad28da 11-Apr-2022 Momchil Velikov <momchil.velikov@arm.com>

[CodeGen] Async unwind - add a pass to fix CFI information

This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CG

[CodeGen] Async unwind - add a pass to fix CFI information

This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CGA
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints taht LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

* there is a single basic block, containing the function prologue

* possibly multiple epilogue blocks, where each epilogue block is
complete and self-contained, i.e. CSR restore instructions (and the
corresponding CFI instructions are not split across two or more
blocks.

* prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

- "has a call frame", if the function has executed the prologue, or
has not executed any epilogue

- "does not have a call frame", if the function has not executed the
prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
initial one. This is done by a target specific hook and is
expected to be trivial to implement, for example it could be:
```
.cfi_def_cfa <sp>, 0
.cfi_same_value <rN>
.cfi_same_value <rN-1>
...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
created by the function prologue. These are the sequence:
```
.cfi_restore_state
.cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545

show more ...


# 0320115c 05-Apr-2022 Muhammad Omair Javaid <omair.javaid@linaro.org>

Revert "[CodeGen] Async unwind - add a pass to fix CFI information"

This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.

This commit had failing tests with clang crashing across various
AA

Revert "[CodeGen] Async unwind - add a pass to fix CFI information"

This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.

This commit had failing tests with clang crashing across various
AArch64/Linux buildots.

https://lab.llvm.org/buildbot/#/builders/179/builds/3346

Differential Revision: https://reviews.llvm.org/D114545

show more ...


# 980c3e6d 04-Apr-2022 Momchil Velikov <momchil.velikov@arm.com>

[CodeGen] Async unwind - add a pass to fix CFI information

This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CF

[CodeGen] Async unwind - add a pass to fix CFI information

This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CFG
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints that LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

* there is a single basic block, containing the function prologue

* possibly multiple epilogue blocks, where each epilogue block is
complete and self-contained, i.e. CSR restore instructions (and the
corresponding CFI instructions are not split across two or more
blocks.

* prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

- "has a call frame", if the function has executed the prologue, or
has not executed any epilogue

- "does not have a call frame", if the function has not executed the
prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

In order to accommodate backends which do not generate unwind info in
epilogues we compute an additional property "strong no call frame on
entry" which is set for the entry point of the function and for every
block reachable from the entry along a path that does not execute the
prologue. If this property holds, it takes precedence over the "has a
call frame" property.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
initial one. This is done by a target specific hook and is
expected to be trivial to implement, for example it could be:
```
.cfi_def_cfa <sp>, 0
.cfi_same_value <rN>
.cfi_same_value <rN-1>
...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
created by the function prologue. These are the sequence:
```
.cfi_restore_state
.cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545

show more ...


# 37b37838 16-Mar-2022 Shengchen Kan <shengchen.kan@intel.com>

[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments


# 989f1c72 15-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b 10-Mar-2022 Nico Weber <thakis@chromium.org>

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https:/

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169

show more ...


# 7f230fee 07-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

after: 1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 630c847b 07-Dec-2021 Kazu Hirata <kazu@google.com>

[llvm] Use range-based for loops (NFC)


# 98a021fc 03-Dec-2021 Stephen Tozer <stephen.tozer@sony.com>

[DebugInfo] Attempt to preserve more information during tail duplication

Prior to this patch, tail duplication handled debug info poorly -
specifically, debug instructions would be dropped instead o

[DebugInfo] Attempt to preserve more information during tail duplication

Prior to this patch, tail duplication handled debug info poorly -
specifically, debug instructions would be dropped instead of being set
undef, potentially extending the lifetimes of prior debug values that
should be killed. The pass was also very aggressive with dropping debug
info, dropping debug info even when the SSA value it referred to was
still present. This patch attempts to handle debug info more carefully,
checking to see whether each affected debug value can still be live,
setting it undef if not.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D106875

show more ...


12345