#
79d0de2a |
| 09-Jul-2024 |
paperchalice <liujunchang97@outlook.com> |
[CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793)
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
|
#
ddaa93b0 |
| 29-Jun-2024 |
Kazu Hirata <kazu@google.com> |
[llvm] Use std::make_unique (NFC) (#97165)
This patch is based on clang-tidy's modernize-make-unique but limited
to those cases where type names are mentioned twice like
std::unique_ptr<Type>(new Type()), which is a bit mouthful.
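A minimal sketch of the rewrite this commit describes. `Widget`, `makeOld`, and `makeNew` are hypothetical names used only for illustration; they are not taken from the LLVM tree.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type for illustration only; not part of LLVM.
struct Widget {
  std::string Name;
  explicit Widget(std::string N = "default") : Name(std::move(N)) {
    assert(!Name.empty() && "a Widget needs a name");
  }
};

// Before: the type name is spelled twice.
inline std::unique_ptr<Widget> makeOld() {
  return std::unique_ptr<Widget>(new Widget());
}

// After: std::make_unique names the type once and forwards any
// constructor arguments.
inline std::unique_ptr<Widget> makeNew() {
  return std::make_unique<Widget>();
}
```

Both functions return an owning pointer to an equivalent object, which is why the change can be marked NFC (no functional change).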
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
#
ad73ce31 |
| 26-May-2022 |
Zongwei Lan <lanzongwei541@gmail.com> |
[Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
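The pattern behind this change can be sketched with simplified stand-in classes. `SubtargetInfo`, `ARMLikeSubtarget`, `ModelFunction`, and the `HasLOB` flag below are assumptions made for illustration and are not the real LLVM types; only the shape of the two call styles follows the commit title.

```cpp
#include <cassert>

// Simplified stand-ins; these are NOT the real LLVM classes.
struct SubtargetInfo {
  virtual ~SubtargetInfo() = default;
};

struct ARMLikeSubtarget : SubtargetInfo {
  bool HasLOB = true; // hypothetical feature flag
};

class ModelFunction {
  const SubtargetInfo *STI;

public:
  explicit ModelFunction(const SubtargetInfo &S) : STI(&S) {
    assert(STI && "subtarget must be set");
  }

  // Untyped accessor: callers have to cast the result themselves.
  const SubtargetInfo &getSubtarget() const { return *STI; }

  // Typed accessor: the cast lives in one place.
  template <typename STC> const STC &getSubtarget() const {
    return static_cast<const STC &>(*STI);
  }
};

// Old style: static_cast<>(getSubtarget()).
inline bool hasLOBOld(const ModelFunction &F) {
  return static_cast<const ARMLikeSubtarget &>(F.getSubtarget()).HasLOB;
}

// New style: getSubtarget<>().
inline bool hasLOBNew(const ModelFunction &F) {
  return F.getSubtarget<ARMLikeSubtarget>().HasLOB;
}
```

The two helpers read the same flag; the templated form is just shorter at each call site.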
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
#
a278250b |
| 10-Mar-2022 |
Nico Weber <thakis@chromium.org> |
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
|
#
7f230fee |
| 07-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
after: 1061034926 before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
fdd8c109 |
| 28-Sep-2021 |
David Green <david.green@arm.com> |
[ARM] Delay reverting WLS in arm-block-placement
As we have to split blocks, we may be left in an invalid loop state after a WLS is reverted to a DLS. Instead remember the WLS that could not be fixed and revert them after finishing processing all other loops.
Differential Revision: https://reviews.llvm.org/D110567
|
#
883758ed |
| 25-Sep-2021 |
David Green <david.green@arm.com> |
[ARM] Fix Arm block placement creating branches after jump tables.
Given:
- A jump table
- Which jumps to the next block
- The next block ends in a WLS
- Where the WLS conditionally jumps to a block earlier in the program.
The Arm block placement pass would attempt to move the block containing the WLS earlier, as the WLS instruction can only branch forward. In doing so it would add a branch from the jumptable block to the WLS block, thinking it previously fell-through.
This in itself would be fine, if a little inefficient, but the constant island pass expects all instructions after a jump-table branch to have been removed by analyzeBranch. So it gets confused and can assign the same labels to multiple jump table blocks.
I've changed the condition to the same as used in analyzeBranch.
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1 |
|
#
c423a586 |
| 02-Aug-2021 |
David Green <david.green@arm.com> |
[ARM] Remove setPreservesCFG from ARMBlockPlacement
As of 28293918409dd3a5a it no longer preserves the CFG, needing to split blocks in order to add DLS instructions.
|
#
28293918 |
| 02-Aug-2021 |
David Green <david.green@arm.com> |
[ARM] Revert WLSTP to DLSTP if the target block is out of range
If the block target for a WLSTP instruction is known to be out of range, and cannot be fixed by the ARMBlockPlacementPass, we can relax it to a DLSTP (and cmp/branch) to still allow the creation of tail predicated loops. That is what this patch does, adding extra revert code to the fallback path of ARMBlockPlacementPass.
Due to the code produced when reverting, this creates a DLSTP between a Bcc and a Br. As a DLS isn't necessarily a terminator we need to split the block to move the DLS/Br into.
Differential Revision: https://reviews.llvm.org/D104709
|
Revision tags: llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
#
bee2f618 |
| 13-Jun-2021 |
David Green <david.green@arm.com> |
[ARM] Introduce t2WhileLoopStartTP
This adds t2WhileLoopStartTP, similar to the t2DoLoopStartTP added in D90591. It keeps a reference to both the tripcount register and the element count register, so that the ARMLowOverheadLoops pass in the backend can pick the correct one without having to search for it from the operand of a VCTP.
Differential Revision: https://reviews.llvm.org/D103236
|
Revision tags: llvmorg-12.0.1-rc1 |
|
#
9ba5238c |
| 05-May-2021 |
Malhar Jajoo <malhar.jajoo@arm.com> |
[ARM] Simplification to ARMBlockPlacement Pass.
It simplifies the logic by moving the predecessor (preHeader or its predecessor) above the target (or loopExit), instead of moving the target to after the predecessor.
Since the loopExit is no longer being moved, directions of any branches within/to it are unaffected.
While the predecessor is being moved, the backwards movement simplifies some considerations, and the only consideration now required is that a forward WLS to the predecessor should not become backwards.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D100094
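The movement described above can be sketched on a toy layout, assuming blocks are plain integer IDs stored in layout order. `indexOf`, `movePredecessorBefore`, and `wouldTurnForwardWLSBackwards` are hypothetical helpers, and the check is only one reading of the "forward WLS to the predecessor" condition, not the pass's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Position of Block in the layout; the block is assumed to be present.
inline std::ptrdiff_t indexOf(const std::vector<int> &Layout, int Block) {
  auto It = std::find(Layout.begin(), Layout.end(), Block);
  assert(It != Layout.end() && "block not in layout");
  return It - Layout.begin();
}

// Move Pred so that it sits immediately above Target. Target keeps its
// position relative to every other block, so branches within or to it
// keep their direction.
inline void movePredecessorBefore(std::vector<int> &Layout, int Pred,
                                  int Target) {
  assert(Pred != Target && "cannot move a block above itself");
  Layout.erase(Layout.begin() + indexOf(Layout, Pred));
  Layout.insert(Layout.begin() + indexOf(Layout, Target), Pred);
}

// Assumed check: a forward WLS in OtherSrc that targets Pred would become
// backwards if OtherSrc currently lies at or after Target but before Pred,
// because Pred would then be moved above it.
inline bool wouldTurnForwardWLSBackwards(const std::vector<int> &Layout,
                                         int Pred, int Target, int OtherSrc) {
  std::ptrdiff_t Other = indexOf(Layout, OtherSrc);
  return indexOf(Layout, Target) <= Other && Other < indexOf(Layout, Pred);
}
```

With a layout of blocks 0, 1, 2, 3, moving predecessor 3 above target 1 gives 0, 3, 1, 2. A WLS in block 2 that targeted block 3 would end up branching backwards, which is the case the check flags.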
|
#
465df353 |
| 29-Apr-2021 |
David Green <david.green@arm.com> |
[ARM] Use just ARM::t2B in ARMBlockPlacementPass
The ARMConstantIsland pass will convert any t2B to tB if they are within range after it has added or moved any constant pools. They don't need to be deliberately converted beforehand, and it doesn't deal with needing to convert tB to t2B very well.
|
#
58f3201a |
| 12-Apr-2021 |
Malhar Jajoo <malhar.jajoo@arm.com> |
[ARM] Updates to arm-block-placement pass
The patch makes two updates to the arm-block-placement pass:
- Handle arbitrarily nested loops
- Extends the search (for t2WhileLoopStartLR) to the predecessor of the preHeader.
Differential Revision: https://reviews.llvm.org/D99649
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
fad70c30 |
| 11-Mar-2021 |
David Green <david.green@arm.com> |
[ARM] Improve WLS lowering
Recently we improved the lowering of low overhead loops and tail predicated loops, but concentrated first on the DLS do style loops. This extends those improvements over to the WLS while loops, improving the chance of lowering them successfully. To do this the lowering has to change a little as the instructions are terminators that produce a value - something that needs to be treated carefully.
Lowering starts at the Hardware Loop pass, inserting a new llvm.test.start.loop.iterations that produces both an i1 to control the loop entry and an i32 similar to the llvm.start.loop.iterations intrinsic added for do loops. This feeds into the loop phi, properly gluing the values together:
    %wls = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %div)
    %wls0 = extractvalue { i32, i1 } %wls, 0
    %wls1 = extractvalue { i32, i1 } %wls, 1
    br i1 %wls1, label %loop.ph, label %loop.exit
    ...
  loop:
    %lsr.iv = phi i32 [ %wls0, %loop.ph ], [ %iv.next, %loop ]
    ..
    %iv.next = call i32 @llvm.loop.decrement.reg.i32(i32 %lsr.iv, i32 1)
    %cmp = icmp ne i32 %iv.next, 0
    br i1 %cmp, label %loop, label %loop.exit
The llvm.test.start.loop.iterations intrinsic needs to be lowered through ISel lowering as a pair of WLS and WLSSETUP nodes, which each get converted to t2WhileLoopSetup and t2WhileLoopStart Pseudos. This helps prevent t2WhileLoopStart from being a terminator that produces a value, something difficult to control at that stage in the pipeline. Instead the t2WhileLoopSetup produces the value of LR (essentially acting as a lr = subs rn, 0), and t2WhileLoopStart consumes that lr value (the Bcc).
These are then converted into a single t2WhileLoopStartLR at the same point as t2DoLoopStartTP and t2LoopEndDec. Otherwise we revert the loop to prevent them from progressing further in the pipeline. The t2WhileLoopStartLR is a single instruction that takes a GPR and produces LR, similar to the WLS instruction.
    %1:gprlr = t2WhileLoopStartLR %0:rgpr, %bb.3
    t2B %bb.1
    ...
  bb.2.loop:
    %2:gprlr = PHI %1:gprlr, %bb.1, %3:gprlr, %bb.2
    ...
    %3:gprlr = t2LoopEndDec %2:gprlr, %bb.2
    t2B %bb.3
The t2WhileLoopStartLR can then be treated similar to the other low overhead loop pseudos, eventually being lowered to a WLS providing the branches are within range.
Differential Revision: https://reviews.llvm.org/D97729
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
1a497ae9 |
| 15-Jan-2021 |
Sam Tebbs <samuel.tebbs@arm.com> |
[ARM][Block placement] Check the predecessor exists before processing it
Not all machine loops will have a predecessor, so the pass needs to check it before continuing.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D94780
|
#
5e4480b6 |
| 15-Jan-2021 |
Sam Tebbs <samuel.tebbs@arm.com> |
[ARM] Don't run the block placement pass at O0
The block placement pass shouldn't run unless optimisations are enabled.
Differential Revision: https://reviews.llvm.org/D94691
|
Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
60fda8eb |
| 26-Nov-2020 |
Sam Tebbs <samuel.tebbs@arm.com> |
[ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch
Blocks can be laid out such that a t2WhileLoopStart branches backwards. This is forbidden by the architecture and so it fails to be converted into a low-overhead loop. This new pass checks for these cases and moves the target block, fixing any fall-through that would then be broken.
Differential Revision: https://reviews.llvm.org/D92385
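The check and the block move can be sketched on a toy model, assuming the function layout is a vector of integer block IDs in emission order. `layoutIndex`, `isBackwardsBranch`, and `moveTargetAfter` are hypothetical helpers, not functions from the pass, and the fall-through repair the commit mentions is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Position of Block in the layout; the block is assumed to be present.
inline std::ptrdiff_t layoutIndex(const std::vector<int> &Layout, int Block) {
  auto It = std::find(Layout.begin(), Layout.end(), Block);
  assert(It != Layout.end() && "block not in layout");
  return It - Layout.begin();
}

// A branch is backwards when its target is laid out before its source.
inline bool isBackwardsBranch(const std::vector<int> &Layout, int Src,
                              int Target) {
  return layoutIndex(Layout, Target) < layoutIndex(Layout, Src);
}

// Move Target to sit immediately after Src, so that a branch from Src to
// Target goes forwards. The real pass would also have to repair any
// fall-through broken by the move, which this sketch leaves out.
inline void moveTargetAfter(std::vector<int> &Layout, int Src, int Target) {
  assert(Src != Target && "source and target must differ");
  Layout.erase(Layout.begin() + layoutIndex(Layout, Target));
  Layout.insert(Layout.begin() + layoutIndex(Layout, Src) + 1, Target);
}
```

For a layout of blocks 0, 1, 2, 3 with the loop start in block 2 branching to block 1, the branch is backwards; after the move the layout is 0, 2, 1, 3 and the branch goes forwards.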
|