Revision tags: llvmorg-21-init |
|
#
0d019081 |
| 23-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[TailDup] Allow large number of predecessors/successors without phis. (#116072)
This adjusts the threshold logic added in #78582 to only trigger for
cases where there are actually phis to duplicate
[TailDup] Allow large number of predecessors/successors without phis. (#116072)
This adjusts the threshold logic added in #78582 to only trigger for
cases where there are actually phis to duplicate in either TailBB or in
one of the successors.
In cases there are no phis, we only have to pay the cost of extra edges,
but have no explosion in PHI related instructions.
This improves performance of Python on some inputs by 2-3% on Apple
Silicon CPUs.
PR: https://github.com/llvm/llvm-project/pull/116072
show more ...
|
Revision tags: llvmorg-19.1.7 |
|
#
19032bfe |
| 13-Jan-2025 |
Daniel Paoliello <danpao@microsoft.com> |
[aarch64][win] Update Called Globals info when updating Call Site info (#122762)
Fixes the "use after poison" issue introduced by #121516 (see <https://github.com/llvm/llvm-project/pull/121516#issue
[aarch64][win] Update Called Globals info when updating Call Site info (#122762)
Fixes the "use after poison" issue introduced by #121516 (see <https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>).
The root cause of this issue is that #121516 introduced "Called Global" information for call instructions modeling how "Call Site" info is stored in the machine function, HOWEVER it didn't copy the copy/move/erase operations for call site information.
The fix is to rename and update the existing copy/move/erase functions so they also take care of Called Global info.
show more ...
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
735ab61a |
| 13-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3 |
|
#
6ab26eab |
| 28-Oct-2024 |
Ellis Hoag <ellis.sparky.hoag@gmail.com> |
Check hasOptSize() in shouldOptimizeForSize() (#112626)
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
0f0cfcff |
| 19-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Avoid some references to MachineFunction's getMMI (#99652)
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
CodeGen: Avoid some references to MachineFunction's getMMI (#99652)
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
show more ...
|
#
559ea40d |
| 27-Jun-2024 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Use range-based for loops (NFC) (#96855)
|
#
dae061f1 |
| 26-Jun-2024 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Use range-based for loops (NFC) (#96777)
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
86a78284 |
| 17-Apr-2024 |
Quentin Dian <dianqk@dianqk.net> |
[TailDuplicator] Add maximum predecessors and successors to consider tail duplicating blocks (#78582)
Fixes #78578.
Duplicating a BB which has both multiple predecessors and successors
will resu
[TailDuplicator] Add maximum predecessors and successors to consider tail duplicating blocks (#78582)
Fixes #78578.
Duplicating a BB which has both multiple predecessors and successors
will result in a complex CFG and also may cause huge amount of PHI
nodes. See
https://github.com/llvm/llvm-project/issues/78578#issuecomment-1962363580
for a detailed description of the limit.
show more ...
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
4b2381a5 |
| 02-May-2023 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Make use of MachineBasicBlock::phis. NFC.
|
#
f7bee657 |
| 21-Apr-2023 |
Mikael Holmen <mikael.holmen@ericsson.com> |
[TailDuplicator] Don't constrain register classes due to debug instructions
If cloning a DBG_VALUE instruction, register uses in that instruction could lead to constraining of a virtual register tha
[TailDuplicator] Don't constrain register classes due to debug instructions
If cloning a DBG_VALUE instruction, register uses in that instruction could lead to constraining of a virtual register that would not happen if the DBG_VALUE was not present at all. This lead to different code with/without debug info.
Now we only do that register class constraining if we dealing with a non debug instruction.
Differential Revision: https://reviews.llvm.org/D149146
show more ...
|
Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
df947feb |
| 21-Dec-2022 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
[TailDuplicator] Fix old bugs in TailDuplicator::duplicateInstruction
This patch is updating TailDuplicator::duplicateInstruction to fix some old bugs that has been found with an out-of-tree target.
[TailDuplicator] Fix old bugs in TailDuplicator::duplicateInstruction
This patch is updating TailDuplicator::duplicateInstruction to fix some old bugs that has been found with an out-of-tree target. There are three different things being addressed:
1) In one situation two subregister indices are combined using the composeSubRegIndices helper. But the order in which those indices are combined has been incorrect. For this problem I managed to create some kind of reproducer using AArch64 (see the test case touched in this patch).
2) Another fault was found in the else branch for the above situation. Here we do not compose the two subregisters, instead we insert a COPY to replace the PHI, and then the subreg index in the using MO remains. Thus, the virtual register created for the COPY should always match with the size of the original register. Therefore the optimization that "constrain" (or rather relax) the register class by looking at the instruction desc must be limited to the situation when there is no subregister access. Otherwise we create a vreg with the wrong class.
3) Last problem addressed in this patch is that when a new register class is picked by looking at the instruction desc, then it isn't guaranteed that the isAllocatable property is set for that class. So one need to use the getAllocatableClass helper to find a subclass that is allocatable before using createVirualRegister, or alternatively (as in this patch) just use the OrigRC instead of relaxing the register class for the COPY destination.
Haven't been able to find any in-tree reproducers for problem 2 and 3. The tricky part is to find a target that has register hierarchies that match with the problem to trigger those code paths (and with subreg accesses involved).
Differential Revision: https://reviews.llvm.org/D140496
show more ...
|
#
e72ca520 |
| 13-Jan-2023 |
Craig Topper <craig.topper@sifive.com> |
[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
|
#
5b24d421 |
| 09-Jan-2023 |
Tim Northover <tnorthover@apple.com> |
TailDuplication: do not remove trivial PHIs from addr-taken blocks.
Unlike an anonymous block, it will not be removed even though we've resolved all valid paths to get here. So removing a PHI can le
TailDuplication: do not remove trivial PHIs from addr-taken blocks.
Unlike an anonymous block, it will not be removed even though we've resolved all valid paths to get here. So removing a PHI can leave vregs with no definition, violating SSA. Instead, this converts it to an IMPLICIT_DEF.
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0 |
|
#
d7474bef |
| 31-Aug-2022 |
Nick Desaulniers <ndesaulniers@google.com> |
[llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets
This fixes a crash observed after https://reviews.llvm.org/D129997.
Similar to D88823.
Reviewed By: efriedma
Differential Revisio
[llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets
This fixes a crash observed after https://reviews.llvm.org/D129997.
Similar to D88823.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D130127
show more ...
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2 |
|
#
4146c175 |
| 02-Aug-2022 |
Mircea Trofin <mtrofin@google.com> |
[nfc] Remove unused parameter in TailDuplicator::duplicateSimpleBB
Differential Revision: https://reviews.llvm.org/D131008
|
Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
9e6d1f4b |
| 17-Jul-2022 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Qualify auto variables in for loops (NFC)
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
b4ad28da |
| 11-Apr-2022 |
Momchil Velikov <momchil.velikov@arm.com> |
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CG
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CGA aware) nature of the unwind tables.
Unlike the `CFIInstrInserer` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustement and save locations of SVE registers.
This pass takes advantage of the constraints taht LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`):
* there is a single basic block, containing the function prologue
* possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions are not split across two or more blocks.
* prologue and epilogue blocks are outside of any loops
Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states:
- "has a call frame", if the function has executed the prologue, or has not executed any epilogue
- "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue
These properties can be computed for each basic block by a single RPO traversal.
From the point of view of the unwind tables, the "has/does not have call frame" state at beginning of each block is determined by the state at the end of the previous block, in layout order.
Where these states differ, we insert compensating CFI instructions, which come in two flavours:
- CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be: ``` .cfi_def_cfa <sp>, 0 .cfi_same_value <rN> .cfi_same_value <rN-1> ... ``` where `<rN>` are the callee-saved registers.
- CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence: ``` .cfi_restore_state .cfi_remember_state ``` In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue.
Reviewed By: MaskRay, danielkiss, chill
Differential Revision: https://reviews.llvm.org/D114545
show more ...
|
#
0320115c |
| 05-Apr-2022 |
Muhammad Omair Javaid <omair.javaid@linaro.org> |
Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.
This commit had failing tests with clang crashing across various AA
Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.
This commit had failing tests with clang crashing across various AArch64/Linux buildots.
https://lab.llvm.org/buildbot/#/builders/179/builds/3346
Differential Revision: https://reviews.llvm.org/D114545
show more ...
|
#
980c3e6d |
| 04-Apr-2022 |
Momchil Velikov <momchil.velikov@arm.com> |
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CF
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CFG aware) nature of the unwind tables.
Unlike the `CFIInstrInserer` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustement and save locations of SVE registers.
This pass takes advantage of the constraints that LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`):
* there is a single basic block, containing the function prologue
* possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions are not split across two or more blocks.
* prologue and epilogue blocks are outside of any loops
Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states:
- "has a call frame", if the function has executed the prologue, or has not executed any epilogue
- "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue
These properties can be computed for each basic block by a single RPO traversal.
In order to accommodate backends which do not generate unwind info in epilogues we compute an additional property "strong no call frame on entry" which is set for the entry point of the function and for every block reachable from the entry along a path that does not execute the prologue. If this property holds, it takes precedence over the "has a call frame" property.
From the point of view of the unwind tables, the "has/does not have call frame" state at beginning of each block is determined by the state at the end of the previous block, in layout order.
Where these states differ, we insert compensating CFI instructions, which come in two flavours:
- CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be: ``` .cfi_def_cfa <sp>, 0 .cfi_same_value <rN> .cfi_same_value <rN-1> ... ``` where `<rN>` are the callee-saved registers.
- CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence: ``` .cfi_restore_state .cfi_remember_state ``` In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue.
Reviewed By: MaskRay, danielkiss, chill
Differential Revision: https://reviews.llvm.org/D114545
show more ...
|
#
37b37838 |
| 16-Mar-2022 |
Shengchen Kan <shengchen.kan@intel.com> |
[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments
|
#
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
show more ...
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
#
a278250b |
| 10-Mar-2022 |
Nico Weber <thakis@chromium.org> |
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https:/
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
show more ...
|
#
7f230fee |
| 07-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
after: 1061034926 before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
630c847b |
| 07-Dec-2021 |
Kazu Hirata <kazu@google.com> |
[llvm] Use range-based for loops (NFC)
|
#
98a021fc |
| 03-Dec-2021 |
Stephen Tozer <stephen.tozer@sony.com> |
[DebugInfo] Attempt to preserve more information during tail duplication
Prior to this patch, tail duplication handled debug info poorly - specifically, debug instructions would be dropped instead o
[DebugInfo] Attempt to preserve more information during tail duplication
Prior to this patch, tail duplication handled debug info poorly - specifically, debug instructions would be dropped instead of being set undef, potentially extending the lifetimes of prior debug values that should be killed. The pass was also very aggressive with dropping debug info, dropping debug info even when the SSA value it referred to was still present. This patch attempts to handle debug info more carefully, checking to see whether each affected debug value can still be live, setting it undef if not.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D106875
show more ...
|