#
2b55ef18 |
| 29-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Add helper to run VPlan passes, verify after run (NFC). (#123640)
Add new runPass helpers to run a VPlan transformation. This makes it
easier to add additional checks/functionality for each
[VPlan] Add helper to run VPlan passes, verify after run (NFC). (#123640)
Add new runPass helpers to run a VPlan transformation. This makes it
easier to add additional checks/functionality for each transform run. In
this patch, an option is added to run the verifier after each VPlan
transform.
Follow-ups will use the same helper to also support printing VPlans
after each transform.
Note that the verifier at the moment requires there to be a canonical IV
and vector loop region, so the final lowering transforms aren't run via
runPass yet.
PR: https://github.com/llvm/llvm-project/pull/123640
show more ...
|
Revision tags: llvmorg-21-init |
|
#
304a9909 |
| 28-Jan-2025 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][DebugInfo] Use iterators for insertion at some final callsites
These are the callsites that have materialised in the last three weeks since I last built with deprecation warnings.
|
#
cdea38f9 |
| 28-Jan-2025 |
Nicholas Guy <nicholas.guy@arm.com> |
Reland "[LoopVectorizer] Add support for chaining partial reductions #120272" (#124282)
Change `getScaledReduction` to take an existing vector, rather than creating and returning a new one each call
Reland "[LoopVectorizer] Add support for chaining partial reductions #120272" (#124282)
Change `getScaledReduction` to take an existing vector, rather than creating and returning a new one each call. Rename `getScaledReduction` to `getScaledReductions` to more accurately reflect what it's now doing.
---------
Co-authored-by: Karlo Basioli <68535415+basioli-k@users.noreply.github.com>
show more ...
|
#
0f61558b |
| 28-Jan-2025 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize][NFC] Remove unused variable in addUsersInExitBlocks (#124553)
We were allocating a VPTypeAnalysis object on the stack,
but never using it for anything.
|
#
09a29fcc |
| 27-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Don't collect live-ins in collectUsersInExitBlocks. (NFC) (#123819)
Live-ins don't need to be handled, other than adding to the exit phi
recipe. Do that early and assert that otherwise the
[VPlan] Don't collect live-ins in collectUsersInExitBlocks. (NFC) (#123819)
Live-ins don't need to be handled, other than adding to the exit phi
recipe. Do that early and assert that otherwise the exit value is
defined in the vector loop region.
This should enable simply skipping other exit values that do not need
further fixing, e.g. if handling the exit value from the early exit
directly in handleUncountableEarlyExit.
PR: https://github.com/llvm/llvm-project/pull/123819
show more ...
|
#
6383a12e |
| 25-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Refactor HCFG builder to preserve original vector latch (NFC).
Update HCFG builder to preserve the original latch block of the initial VPlan, ensuring there is always a latch.
It also skips
[VPlan] Refactor HCFG builder to preserve original vector latch (NFC).
Update HCFG builder to preserve the original latch block of the initial VPlan, ensuring there is always a latch.
It also skips creating the BranchOnCond for the latch of the top-level loop, instead of removing it later. Exiting via the latch is controlled by later recipes.
This further unifies HCFG construction and prepares for use to also build an initial VPlan (VPlan0) for inner loops.
show more ...
|
#
6292a808 |
| 24-Jan-2025 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNo
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
show more ...
|
#
8e702735 |
| 24-Jan-2025 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and sim
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
show more ...
|
#
0e213834 |
| 23-Jan-2025 |
Vitaly Buka <vitalybuka@google.com> |
Revert "[LoopVectorizer] Add support for chaining partial reductions (#120272)" (#124198)
Introduced stack buffer overflow, see #120272.
`getScaledReduction` can return empty vector, and there is n
Revert "[LoopVectorizer] Add support for chaining partial reductions (#120272)" (#124198)
Introduced stack buffer overflow, see #120272.
`getScaledReduction` can return empty vector, and there is not check for that.
This reverts commit c9b7303b9b18129c4ee6b56aaa2a0a9f59be2d09. This reverts commit caf0540b91b0fee31353dc7049ae836e0f814cff.
show more ...
|
#
caf0540b |
| 23-Jan-2025 |
Nicholas Guy <nicholas.guy@arm.com> |
[LoopVectorizer] Add support for chaining partial reductions (#120272)
Chaining partial reductions, where multiple partial reductions share an accumulator, allow for more values to be combined toget
[LoopVectorizer] Add support for chaining partial reductions (#120272)
Chaining partial reductions, where multiple partial reductions share an accumulator, allow for more values to be combined together as part of the reduction without discarding the semantics of the partial reduction itself.
show more ...
|
#
6c787ff6 |
| 21-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
Revert "[LV]: Teach LV to recursively (de)interleave. (#122989)"
This reverts commit 9491f75e1d912b277247450d1c7b6d56f7faf885.
This triggers an assert when building with SVE enabled. https://lab.ll
Revert "[LV]: Teach LV to recursively (de)interleave. (#122989)"
This reverts commit 9491f75e1d912b277247450d1c7b6d56f7faf885.
This triggers an assert when building with SVE enabled. https://lab.llvm.org/buildbot/#/builders/143/builds/4795
show more ...
|
#
87701263 |
| 21-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove stale comment for collectUsersInExitBlocks.
Remove stale section about wide inductions. Since 2c87133c6212d4bd0 all live-outs are modeled in VPlan.
|
#
fcec8756 |
| 20-Jan-2025 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize][NFC] Simplify ScaledReductionExitInstrs map (#123368)
For the following variable
DenseMap<const Instruction *, std::pair<PartialReductionChain,
unsigned>>
ScaledReductionExitI
[LoopVectorize][NFC] Simplify ScaledReductionExitInstrs map (#123368)
For the following variable
DenseMap<const Instruction *, std::pair<PartialReductionChain,
unsigned>>
ScaledReductionExitInstrs;
we never actually need the PartialReductionChain when using the map.
I've cleaned this up so that this now becomes
DenseMap<const Instruction *, unsigned> ScaledReductionMap;
show more ...
|
#
84c89d0a |
| 20-Jan-2025 |
Mel Chen <mel.chen@sifive.com> |
[LV][EVL] Address post-commit comments for 9720be9. (NFC) (#123311)
|
#
2c87133c |
| 19-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
Reapply "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts the revert commit 58326f1d5b5b379590af92dd129b2f3b3e96af46.
The build failure in sanitizer stage2 builds has been fixe
Reapply "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts the revert commit 58326f1d5b5b379590af92dd129b2f3b3e96af46.
The build failure in sanitizer stage2 builds has been fixed with 0d39fe6f5bb3edf0bddec09a8c6417377390aeac.
Original commit message: Model updating IV users directly in VPlan, replace fixupIVUsers.
Now simple extracts are created for all phis in the exit block during initial VPlan construction. A later VPlan transform (optimizeInductionExitUsers) replaces extracts of inductions with their pre-computed values if possible.
This completes the transition towards modeling all live-outs directly in VPlan.
There are a few follow-ups: * emit extracts initially also for resume phis, and optimize them tougher with IV exit users * support for VPlans with multiple exits in optimizeInductionExitUsers.
Depends on https://github.com/llvm/llvm-project/pull/110004, https://github.com/llvm/llvm-project/pull/109975 and https://github.com/llvm/llvm-project/pull/112145.
show more ...
|
#
58326f1d |
| 18-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
Revert "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts commit c2d15ac4d4432788557e77c15ce572ac655a8fec.
Causes build failures on PPC stage2 & fuchsia bots https://lab.llv
Revert "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts commit c2d15ac4d4432788557e77c15ce572ac655a8fec.
Causes build failures on PPC stage2 & fuchsia bots https://lab.llvm.org/buildbot/#/builders/168/builds/7650 https://lab.llvm.org/buildbot/#/builders/11/builds/11248
show more ...
|
#
c2d15ac4 |
| 18-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Update final IV exit value via VPlan. (#112147)
Model updating IV users directly in VPlan, replace fixupIVUsers.
Now simple extracts are created for all phis in the exit block during
ini
[VPlan] Update final IV exit value via VPlan. (#112147)
Model updating IV users directly in VPlan, replace fixupIVUsers.
Now simple extracts are created for all phis in the exit block during
initial VPlan construction. A later VPlan transform
(optimizeInductionExitUsers) replaces extracts of inductions with
their pre-computed values if possible.
This completes the transition towards modeling all live-outs directly in
VPlan.
There are a few follow-ups:
* emit extracts initially also for resume phis, and optimize them
tougher with IV exit users
* support for VPlans with multiple exits in optimizeInductionExitUsers.
Depends on https://github.com/llvm/llvm-project/pull/110004,
https://github.com/llvm/llvm-project/pull/109975 and
https://github.com/llvm/llvm-project/pull/112145.
show more ...
|
#
edf3a55b |
| 17-Jan-2025 |
John Brawn <john.brawn@arm.com> |
[LoopVectorize][NFC] Centralize the setting of CostKind (#121937)
In each class which calculates instruction costs (VPCostContext,
LoopVectorizationCostModel, GeneratedRTChecks) set the CostKind on
[LoopVectorize][NFC] Centralize the setting of CostKind (#121937)
In each class which calculates instruction costs (VPCostContext,
LoopVectorizationCostModel, GeneratedRTChecks) set the CostKind once in
the constructor instead of in each function that calculates a cost. This
is in preparation for potentially changing the CostKind when compiling
for optsize.
show more ...
|
#
9491f75e |
| 17-Jan-2025 |
Hassnaa Hamdi <hassnaa.hamdi@arm.com> |
Reland: [LV]: Teach LV to recursively (de)interleave. (#122989)
This commit relands the changes from "[LV]: Teach LV to recursively
(de)interleave. #89018"
Reason for revert:
- The patch expose
Reland: [LV]: Teach LV to recursively (de)interleave. (#122989)
This commit relands the changes from "[LV]: Teach LV to recursively
(de)interleave. #89018"
Reason for revert:
- The patch exposed a bug in the IA pass, the bug is now fixed and landed by commit: #122643
show more ...
|
#
9720be95 |
| 17-Jan-2025 |
Mel Chen <mel.chen@sifive.com> |
[LV][EVL] Disable fixed-order recurrence idiom with EVL tail folding. (#122458)
The currently llvm.splice may occurs unexpected behavior if the evl of
the second-to-last iteration is not VF*UF.
[LV][EVL] Disable fixed-order recurrence idiom with EVL tail folding. (#122458)
The currently llvm.splice may occurs unexpected behavior if the evl of
the second-to-last iteration is not VF*UF.
Issue #122461
show more ...
|
#
edc02351 |
| 15-Jan-2025 |
David Sherwood <david.sherwood@arm.com> |
[NFC][LoopVectorize] Add more loop early exit asserts (#122732)
This patch is split off #120567, adding asserts in
addScalarResumePhis and addExitUsersForFirstOrderRecurrences
that the loop does n
[NFC][LoopVectorize] Add more loop early exit asserts (#122732)
This patch is split off #120567, adding asserts in
addScalarResumePhis and addExitUsersForFirstOrderRecurrences
that the loop does not contain an uncountable early exit,
since the code cannot yet handle them correctly.
show more ...
|
#
1de3dc7d |
| 14-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[LV] Bail out early if BTC+1 wraps.
Currently we fail to detect the case where BTC + 1 wraps, i.e. the vector trip count is 0, In those cases, the minimum iteration count check will fail, and the ve
[LV] Bail out early if BTC+1 wraps.
Currently we fail to detect the case where BTC + 1 wraps, i.e. the vector trip count is 0, In those cases, the minimum iteration count check will fail, and the vector code will never be executed.
Explicitly check for this condition in computeMaxVF and avoid trying to vectorize alltogether.
Note that a number of tests needed to be updated, because the vector loop would never be executed given the input IR.
Fixes https://github.com/llvm/llvm-project/issues/122558.
show more ...
|
Revision tags: llvmorg-19.1.7 |
|
#
cb2560d3 |
| 14-Jan-2025 |
Luke Lau <luke@igalia.com> |
[VPlan] Verify plan before optimizations. NFC (#122678)
I've been exploring verifying the VPlan before and after the EVL transformation steps, and noticed that the VPlan comes out in an invalid stat
[VPlan] Verify plan before optimizations. NFC (#122678)
I've been exploring verifying the VPlan before and after the EVL transformation steps, and noticed that the VPlan comes out in an invalid state between construction and optimisation.
In adjustRecipesForReductions, we leave behind some dead recipes which are invalid:
1) When we replace a link with a reduction recipe, the old link ends up becoming a use-before-def:
WIDEN ir<%l7> = add ir<%sum.02>, ir<%indvars.iv>.1 WIDEN ir<%l8> = add ir<%l7>.1, ir<%l3> WIDEN ir<%l9> = add ir<%l8>.1, ir<%l5> ... REDUCE ir<%l7>.1 = ir<%sum.02> + reduce.add (ir<%indvars.iv>.1) REDUCE ir<%l8>.1 = ir<%l7>.1 + reduce.add (ir<%l3>) REDUCE ir<%l9>.1 = ir<%l8>.1 + reduce.add (ir<%l5>)
2) When transforming an AnyOf reduction phi to a boolean, we leave behind a select with mismatching operand types, which will trigger the assertions in VTypeAnalysis after #122679
This adds an extra verification step and deletes the dead recipes eagerly to keep the plan valid.
show more ...
|
#
3397950f |
| 13-Jan-2025 |
Mel Chen <mel.chen@sifive.com> |
[LV] Fix FindLastIV reduction for epilogue vectorization. (#120395)
Following 0e528ac404e13ed2d952a2d83aaf8383293c851e, this patch adjusts
the resume value of VPReductionPHIRecipe for FindLastIV re
[LV] Fix FindLastIV reduction for epilogue vectorization. (#120395)
Following 0e528ac404e13ed2d952a2d83aaf8383293c851e, this patch adjusts
the resume value of VPReductionPHIRecipe for FindLastIV reductions.
Replacing the resume value with:
ResumeValue = ResumeValue == StartValue ? SentinelValue : ResumeValue;
This addressed the correctness issue when the start value might not be
less than the minimum value of a monotonically increasing induction
variable.
Thanks Florian Hahn for the help.
---------
Co-authored-by: Florian Hahn <flo@fhahn.com>
show more ...
|
#
795e35a6 |
| 13-Jan-2025 |
Sam Tebbs <samuel.tebbs@arm.com> |
Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744)
This relands the reverted #120721 with a fix for cases where neither reduction operand are the reduct
Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744)
This relands the reverted #120721 with a fix for cases where neither reduction operand are the reduction phi. Only 63114239cc8d26225a0ef9920baacfc7cc00fc58 and 63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted PR.
---------
Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>
show more ...
|