Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
9571cc2b |
| 13-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[ARM] Remove unused includes (NFC) (#115995)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3 |
|
#
85c17e40 |
| 17-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
Fn = Intrinsic::getOrInsertDeclaration(...);
CreateCall(Fn, ...)
to the equivalent CreateIntrinsi
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
Fn = Intrinsic::getOrInsertDeclaration(...);
CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
show more ...
|
Revision tags: llvmorg-19.1.2 |
|
#
fa789dff |
| 11-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is a
[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
show more ...
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
edf46f36 |
| 03-Aug-2024 |
Florian Hahn <flo@fhahn.com> |
[SCEV] Use const SCEV * explicitly in more places.
Use const SCEV * explicitly in more places to prepare for https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
2d209d96 |
| 27-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it does
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
show more ...
|
#
d75f9dd1 |
| 24-Jun-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and did not update all callsites:
https://lab.llvm.org/buildbot
Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and did not update all callsites:
https://lab.llvm.org/buildbot/#/builders/29/builds/382
This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
show more ...
|
#
6481dc57 |
| 24-Jun-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0 |
|
#
e54277fa |
| 11-Sep-2023 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that takes a block/iterator instead of an instruction, and up
[NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that takes a block/iterator instead of an instruction, and updates many call sites to use it. The motivating reason for doing this is given here [0], we'd like to pass around more information about the position of debug-info in the iterator object. That necessitates passing iterators around most of the time.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152468
show more ...
|
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4 |
|
#
9eef6d9a |
| 07-May-2023 |
Kazu Hirata <kazu@google.com> |
[ARM] Remove unused declaration RematerializeIterCount
The corresponding function definition was removed by:
commit af45907653fd312264632b616eff0fad1ae1eb2e Author: Sjoerd Meijer <sjoerd.meijer
[ARM] Remove unused declaration RematerializeIterCount
The corresponding function definition was removed by:
commit af45907653fd312264632b616eff0fad1ae1eb2e Author: Sjoerd Meijer <sjoerd.meijer@arm.com> Date: Mon Jun 29 15:40:03 2020 +0100
show more ...
|
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1 |
|
#
eb64450a |
| 29-Mar-2023 |
David Green <david.green@arm.com> |
[ARM] Convert active.lane.masks to vctp with non-zero starts
This attempts to expand the logic in the MVETailPredication pass to convert active lane masks that the vectorizer produces to vctp instru
[ARM] Convert active.lane.masks to vctp with non-zero starts
This attempts to expand the logic in the MVETailPredication pass to convert active lane masks that the vectorizer produces to vctp instructions that the backend can later turn into tail predicated loops. Especially for addrecs with non-zero starts that can be created from epilog vectorization. There is some adjustment to the logic to handle this, moving some of the code to check the addrec earlier so that we can get the start value. This start value is then incorporated into the logic of checkin the new vctp is valid, and there is a newly added check that it is known to be a multiple of the VF as we expect.
Differential Revision: https://reviews.llvm.org/D146517
show more ...
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0 |
|
#
2833760c |
| 29-Aug-2022 |
Kazu Hirata <kazu@google.com> |
[Target] Qualify auto in range-based for loops (NFC)
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
ab0c5cea |
| 03-Dec-2021 |
David Green <david.green@arm.com> |
[ARM] Use v2i1 for MVE and CDE intrinsics
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal type, to use a <2 x i1> as opposed to emulating the predicate with a <4 x i1>. The v4i1
[ARM] Use v2i1 for MVE and CDE intrinsics
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal type, to use a <2 x i1> as opposed to emulating the predicate with a <4 x i1>. The v4i1 workarounds have been removed leaving the natural v2i1 types, notably in vctp64 which now generates a v2i1 type.
AutoUpgrade code has been added to upgrade old IR, which needs to convert the old v4i1 to a v2i1 be converting it back and forth to an integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be optimized away in the final assembly.
Differential Revision: https://reviews.llvm.org/D114455
show more ...
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
#
29fa37ec |
| 01-Sep-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] If max BTC is zero, then so is the exact BTC [2 of 2]
This extends D108921 into a generic rule applied to constructing ExitLimits along all paths. The remaining paths (primarily howFarToZero)
[SCEV] If max BTC is zero, then so is the exact BTC [2 of 2]
This extends D108921 into a generic rule applied to constructing ExitLimits along all paths. The remaining paths (primarily howFarToZero) don't have the same reasoning about UB sensitivity as the howManyLessThan ones did. Instead, the remain cause for max counts being more precise than exact counts is that we apply context sensitive loop guards on the max path, and not on the exact path. That choice is mildly suspect, but out of scope of this patch.
The MVETailPredication.cpp change deserves a bit of explanation. We were previously figuring out that two SCEVs happened to be equal because the happened to be identical. When we optimized one with context sensitive information, but not the other, we lost the ability to prove them equal. So, cover this case by subtracting and then applying loop guards again. Without this, we see changes in test/CodeGen/Thumb2/mve-blockplacement.ll
Differential Revision: https://reviews.llvm.org/D109015
show more ...
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
#
258e2e9a |
| 26-Apr-2021 |
David Green <david.green@arm.com> |
[ARM] Ensure loop invariant active.lane.mask operands
CGP can move instructions like a ptrtoint into a loop, but the MVETailPredication when converting them will currently assume invariant trip coun
[ARM] Ensure loop invariant active.lane.mask operands
CGP can move instructions like a ptrtoint into a loop, but the MVETailPredication when converting them will currently assume invariant trip counts. This tries to ensure the operands are loop invariant, and bails if not.
Differential Revision: https://reviews.llvm.org/D100550
show more ...
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
fad70c30 |
| 11-Mar-2021 |
David Green <david.green@arm.com> |
[ARM] Improve WLS lowering
Recently we improved the lowering of low overhead loops and tail predicated loops, but concentrated first on the DLS do style loops. This extends those improvements over t
[ARM] Improve WLS lowering
Recently we improved the lowering of low overhead loops and tail predicated loops, but concentrated first on the DLS do style loops. This extends those improvements over to the WLS while loops, improving the chance of lowering them successfully. To do this the lowering has to change a little as the instructions are terminators that produce a value - something that needs to be treated carefully.
Lowering starts at the Hardware Loop pass, inserting a new llvm.test.start.loop.iterations that produces both an i1 to control the loop entry and an i32 similar to the llvm.start.loop.iterations intrinsic added for do loops. This feeds into the loop phi, properly gluing the values together:
%wls = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %div) %wls0 = extractvalue { i32, i1 } %wls, 0 %wls1 = extractvalue { i32, i1 } %wls, 1 br i1 %wls1, label %loop.ph, label %loop.exit ... loop: %lsr.iv = phi i32 [ %wls0, %loop.ph ], [ %iv.next, %loop ] .. %iv.next = call i32 @llvm.loop.decrement.reg.i32(i32 %lsr.iv, i32 1) %cmp = icmp ne i32 %iv.next, 0 br i1 %cmp, label %loop, label %loop.exit
The llvm.test.start.loop.iterations need to be lowered through ISel lowering as a pair of WLS and WLSSETUP nodes, which each get converted to t2WhileLoopSetup and t2WhileLoopStart Pseudos. This helps prevent t2WhileLoopStart from being a terminator that produces a value, something difficult to control at that stage in the pipeline. Instead the t2WhileLoopSetup produces the value of LR (essentially acting as a lr = subs rn, 0), t2WhileLoopStart consumes that lr value (the Bcc).
These are then converted into a single t2WhileLoopStartLR at the same point as t2DoLoopStartTP and t2LoopEndDec. Otherwise we revert the loop to prevent them from progressing further in the pipeline. The t2WhileLoopStartLR is a single instruction that takes a GPR and produces LR, similar to the WLS instruction.
%1:gprlr = t2WhileLoopStartLR %0:rgpr, %bb.3 t2B %bb.1 ... bb.2.loop: %2:gprlr = PHI %1:gprlr, %bb.1, %3:gprlr, %bb.2 ... %3:gprlr = t2LoopEndDec %2:gprlr, %bb.2 t2B %bb.3
The t2WhileLoopStartLR can then be treated similar to the other low overhead loop pseudos, eventually being lowered to a WLS providing the branches are within range.
Differential Revision: https://reviews.llvm.org/D97729
show more ...
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
f5abf0bd |
| 15-Jan-2021 |
David Green <david.green@arm.com> |
[ARM] Tail predication with constant loop bounds
The TripCount for a predicated vector loop body will be ceil(ElementCount/Width). This alters the conversion of an active.lane.mask to a VCPT intrins
[ARM] Tail predication with constant loop bounds
The TripCount for a predicated vector loop body will be ceil(ElementCount/Width). This alters the conversion of an active.lane.mask to a VCPT intrinsics to match.
Differential Revision: https://reviews.llvm.org/D94608
show more ...
|
Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
0e49a40d |
| 26-Nov-2020 |
David Green <david.green@arm.com> |
[ARM] Cleanup for the MVETailPrediction pass
This strips out a lot of the code that should no longer be needed from the MVETailPredictionPass, leaving the important part - find active lane mask inst
[ARM] Cleanup for the MVETailPrediction pass
This strips out a lot of the code that should no longer be needed from the MVETailPredictionPass, leaving the important part - find active lane mask instructions and convert them to VCTP operations.
Differential Revision: https://reviews.llvm.org/D91866
show more ...
|
Revision tags: llvmorg-11.0.1-rc1 |
|
#
b2ac9681 |
| 10-Nov-2020 |
David Green <david.green@arm.com> |
[ARM] Alter t2DoLoopStart to define lr
This changes the definition of t2DoLoopStart from t2DoLoopStart rGPR to GPRlr = t2DoLoopStart rGPR
This will hopefully mean that low overhead loops are more t
[ARM] Alter t2DoLoopStart to define lr
This changes the definition of t2DoLoopStart from t2DoLoopStart rGPR to GPRlr = t2DoLoopStart rGPR
This will hopefully mean that low overhead loops are more tied together, and we can more reliably generate loops without reverting or being at the whims of the register allocator.
This is a fairly simple change in itself, but leads to a number of other required alterations.
- The hardware loop pass, if UsePhi is set, now generates loops of the form: %start = llvm.start.loop.iterations(%N) loop: %p = phi [%start], [%dec] %dec = llvm.loop.decrement.reg(%p, 1) %c = icmp ne %dec, 0 br %c, loop, exit - For this a new llvm.start.loop.iterations intrinsic was added, identical to llvm.set.loop.iterations but produces a value as seen above, gluing the loop together more through def-use chains. - This new instrinsic conceptually produces the same output as input, which is taught to SCEV so that the checks in MVETailPredication are not affected. - Some minor changes are needed to the ARMLowOverheadLoop pass, but it has been left mostly as before. We should now more reliably be able to tell that the t2DoLoopStart is correct without having to prove it, but t2WhileLoopStart and tail-predicated loops will remain the same. - And all the tests have been updated. There are a lot of them!
This patch on it's own might cause more trouble that it helps, with more tail-predicated loops being reverted, but some additional patches can hopefully improve upon that to get to something that is better overall.
Differential Revision: https://reviews.llvm.org/D89881
show more ...
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6 |
|
#
322d0afd |
| 03-Oct-2020 |
Amara Emerson <amara@apple.com> |
[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.
The autoupgrader will handle lega
[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.
The autoupgrader will handle legacy intrinsics.
Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html
Differential Revision: https://reviews.llvm.org/D88787
show more ...
|
Revision tags: llvmorg-11.0.0-rc5 |
|
#
509fba75 |
| 28-Sep-2020 |
Tres Popp <tpopp@google.com> |
[llvm] Fix unused variable in non-debug configurations
|
Revision tags: llvmorg-11.0.0-rc4 |
|
#
1696dd27 |
| 28-Sep-2020 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM][MVE] Enable tail-predication by default
We have been running tests/benchmarks downstream with tail-predication enabled for some time now and this behaves as expected: we are not aware of any c
[ARM][MVE] Enable tail-predication by default
We have been running tests/benchmarks downstream with tail-predication enabled for some time now and this behaves as expected: we are not aware of any correctness issues, and this performs better across the board than with tail-predication disabled. Time to flip the switch!
Differential Revision: https://reviews.llvm.org/D88093
show more ...
|
#
f39f92c1 |
| 24-Sep-2020 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM][MVE] tail-predication: overflow checks for elementcount, cont'd
This is a reimplementation of the overflow checks for the elementcount, i.e. the 2nd argument of intrinsic get.active.lane.mask.
[ARM][MVE] tail-predication: overflow checks for elementcount, cont'd
This is a reimplementation of the overflow checks for the elementcount, i.e. the 2nd argument of intrinsic get.active.lane.mask. The element count is lowered in each iteration of the tail-predicated loop, and we must prove that this expression doesn't overflow.
Many thanks to Eli Friedman and Sam Parker for all their help with this work.
Differential Revision: https://reviews.llvm.org/D88086
show more ...
|
#
2fc690ac |
| 24-Sep-2020 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] LowoverheadLoops: add an option to disable tail-predication
This might be useful for testing. We already have an option -tail-predication but that controls the MVETailPredication pass. This -
[ARM] LowoverheadLoops: add an option to disable tail-predication
This might be useful for testing. We already have an option -tail-predication but that controls the MVETailPredication pass. This -arm-loloops-disable-tail-pred is just for disabling it in the LowoverheadLoops pass.
Differential Revision: https://reviews.llvm.org/D88212
show more ...
|
Revision tags: llvmorg-11.0.0-rc3 |
|
#
b5c3efeb |
| 16-Sep-2020 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM][MVE] Tail-predication: predicate new elementcount checks on force-enabled
Additional sanity checks were added to get.active.lane.mask's second argument, the loop tripcount/elementcount, in rG6
[ARM][MVE] Tail-predication: predicate new elementcount checks on force-enabled
Additional sanity checks were added to get.active.lane.mask's second argument, the loop tripcount/elementcount, in rG635b87511ec3. Like the other (overflow) checks, skip this if tail-predication is forced.
Differential Revision: https://reviews.llvm.org/D87769
show more ...
|
#
635b8751 |
| 15-Sep-2020 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM][MVE] Tail-predication: use unsigned SCEV ranges for tripcount
Loop tripcount expressions have a positive range, so use unsigned SCEV ranges for them.
Differential Revision: https://reviews.ll
[ARM][MVE] Tail-predication: use unsigned SCEV ranges for tripcount
Loop tripcount expressions have a positive range, so use unsigned SCEV ranges for them.
Differential Revision: https://reviews.llvm.org/D87608
show more ...
|