Revision tags: llvmorg-21-init |
|
#
713482fc |
| 27-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Use State.get to extract lane mask for BranchOnMask.
Simplifies the code slightly and avoids redundant extracts/broadcasts if the operand is live-in or already scalar.
|
#
1de3dc7d |
| 14-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[LV] Bail out early if BTC+1 wraps.
Currently we fail to detect the case where BTC + 1 wraps, i.e. the vector trip count is 0, In those cases, the minimum iteration count check will fail, and the ve
[LV] Bail out early if BTC+1 wraps.
Currently we fail to detect the case where BTC + 1 wraps, i.e. the vector trip count is 0, In those cases, the minimum iteration count check will fail, and the vector code will never be executed.
Explicitly check for this condition in computeMaxVF and avoid trying to vectorize alltogether.
Note that a number of tests needed to be updated, because the vector loop would never be executed given the input IR.
Fixes https://github.com/llvm/llvm-project/issues/122558.
show more ...
|
Revision tags: llvmorg-19.1.7 |
|
#
f0d5104c |
| 08-Jan-2025 |
Luke Lau <luke@igalia.com> |
[VPlan] Handle some VPInstructions in may{Read,Write}FromMemory (#120058)
This just copies the same conservative definition from mayWriteToMemory, and enables more VPInstructions to be hoisted out i
[VPlan] Handle some VPInstructions in may{Read,Write}FromMemory (#120058)
This just copies the same conservative definition from mayWriteToMemory, and enables more VPInstructions to be hoisted out in LICM.
I think this should give more accurate costs, and I was able to build llvm-test-suite without the legacy-vplan cost model assertion going off.
show more ...
|
#
7f3428d3 |
| 29-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Compute induction end values in VPlan. (#112145)
Use createDerivedIV to compute IV end values directly in VPlan, instead
of creating them up-front.
This allows updating IV users outside
[VPlan] Compute induction end values in VPlan. (#112145)
Use createDerivedIV to compute IV end values directly in VPlan, instead
of creating them up-front.
This allows updating IV users outside the loop as follow-up.
Depends on https://github.com/llvm/llvm-project/pull/110004 and
https://github.com/llvm/llvm-project/pull/109975.
PR: https://github.com/llvm/llvm-project/pull/112145
show more ...
|
#
4ad0fdd1 |
| 17-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove reverse() of predecessors from VPInstruction::generate.
This was originally done to reduce the diff for the change. Remove it and update the remaining tests. NFC modulo reordering of
[VPlan] Remove reverse() of predecessors from VPInstruction::generate.
This was originally done to reduce the diff for the change. Remove it and update the remaining tests. NFC modulo reordering of incoming values.
Clean up after https://github.com/llvm/llvm-project/pull/114292.
show more ...
|
Revision tags: llvmorg-19.1.6 |
|
#
7f7f540a |
| 06-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
Reapply "[VPlan] Update scalar induction resume values in VPlan. (#110577)"
This reverts commit f09b16e2671cbcdf7cb7dc7ed705db092a9deda1.
The crash when building llvm-test-suite with stage2 should
Reapply "[VPlan] Update scalar induction resume values in VPlan. (#110577)"
This reverts commit f09b16e2671cbcdf7cb7dc7ed705db092a9deda1.
The crash when building llvm-test-suite with stage2 should have been fixed by 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.
show more ...
|
#
f09b16e2 |
| 06-Dec-2024 |
Nikita Popov <npopov@redhat.com> |
Revert "[VPlan] Update scalar induction resume values in VPlan. (#110577)"
This reverts commit 0678e2058364ec10b94560d27ec7138dfa003287. This reverts commit 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.
Revert "[VPlan] Update scalar induction resume values in VPlan. (#110577)"
This reverts commit 0678e2058364ec10b94560d27ec7138dfa003287. This reverts commit 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.
Causes crashes in llvm-test-suite when using stage 2 clang.
show more ...
|
#
0678e205 |
| 06-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Update scalar induction resume values in VPlan. (#110577)
Updated ILV.createInductionResumeValues (now createInductionResumeVPValue)
to directly update the VPIRInstructions wrapping the ori
[VPlan] Update scalar induction resume values in VPlan. (#110577)
Updated ILV.createInductionResumeValues (now createInductionResumeVPValue)
to directly update the VPIRInstructions wrapping the original phis with the
created resume values.
This is the first step towards modeling them completely in VPlan.
Subsequent patches will move creation of the resume values completely
into VPlan.
Depends on https://github.com/llvm/llvm-project/pull/109975.
PR: https://github.com/llvm/llvm-project/pull/110577
show more ...
|
#
462cb3cd |
| 05-Dec-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.
Proof: https://alive2.llvm.org/ce/z
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.
Proof: https://alive2.llvm.org/ce/z/ihztLy
show more ...
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
38fffa63 |
| 06-Nov-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)
|
#
b021464d |
| 31-Oct-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975)
Update VPlan to include the scalar loop header. This allows retiring
VPLiveOut, as the remaining live-outs can now be handl
[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975)
Update VPlan to include the scalar loop header. This allows retiring
VPLiveOut, as the remaining live-outs can now be handled by adding
operands to the wrapped phis in the scalar loop header.
Note that the current version only includes the scalar loop header, no
other loop blocks and also does not wrap it in a region block.
PR: https://github.com/llvm/llvm-project/pull/109975
show more ...
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
9f3d1695 |
| 02-Oct-2024 |
Nikita Popov <npopov@redhat.com> |
[SCEVExpander] Preserve gep nuw during expansion (#102133)
When expanding SCEV adds to geps, transfer the nuw flag to the resulting
gep. (Note that this doesn't apply to IV increment GEPs, which go
[SCEVExpander] Preserve gep nuw during expansion (#102133)
When expanding SCEV adds to geps, transfer the nuw flag to the resulting
gep. (Note that this doesn't apply to IV increment GEPs, which go
through a different code path.)
show more ...
|
Revision tags: llvmorg-19.1.1 |
|
#
53266f73 |
| 22-Sep-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Run DCE after unrolling.
This cleans up a number of dead recipes after unrolling if only their first or last parts are used. This simplifies a number of tests.
Fixes https://github.com/llvm
[VPlan] Run DCE after unrolling.
This cleans up a number of dead recipes after unrolling if only their first or last parts are used. This simplifies a number of tests.
Fixes https://github.com/llvm/llvm-project/issues/109581.
show more ...
|
#
8ec40675 |
| 21-Sep-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842)
This patch implements explicit unrolling by UF as VPlan transform. In
follow up patches this will allow simplifying VPTransform st
[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842)
This patch implements explicit unrolling by UF as VPlan transform. In
follow up patches this will allow simplifying VPTransform state (no need
to store unrolled parts) as well as recipe execution (no need to
generate code for multiple parts in an each recipe). It also allows for
more general optimziations (e.g. avoid generating code for recipes that
are uniform-across parts).
It also unifies the logic dealing with unrolled parts in a single place,
rather than spreading it out across multiple places (e.g. VPlan post
processing for header-phi recipes previously.)
In the initial implementation, a number of recipes still take the
unrolled part as additional, optional argument, if their execution
depends on the unrolled part.
The computation for start/step values for scalable inductions changed
slightly. Previously the step would be computed as scalar and then
splatted, now vscale gets splatted and multiplied by the step in a
vector mul.
This has been split off https://github.com/llvm/llvm-project/pull/94339
which also includes changes to simplify VPTransfomState and recipes'
::execute.
The current version mostly leaves existing ::execute untouched and
instead sets VPTransfomState::UF to 1.
A follow-up patch will clean up all references to VPTransformState::UF.
Another follow-up patch will simplify VPTransformState to only store a
single vector value per VPValue.
PR: https://github.com/llvm/llvm-project/pull/95842
show more ...
|
Revision tags: llvmorg-19.1.0 |
|
#
2c7786e9 |
| 03-Sep-2024 |
Philip Reames <preames@rivosinc.com> |
Prefer use of 0.0 over -0.0 for fadd reductions w/nsz (in IR) (#106770)
This is a follow up to 924907bc6, and is mostly motivated by consistency
but does include one additional optimization. In gen
Prefer use of 0.0 over -0.0 for fadd reductions w/nsz (in IR) (#106770)
This is a follow up to 924907bc6, and is mostly motivated by consistency
but does include one additional optimization. In general, we prefer 0.0
over -0.0 as the identity value for an fadd. We use that value in
several places, but don't in others. So, let's be consistent and use the
same identity (when nsz allows) everywhere.
This creates a bunch of test churn, but due to 924907bc6, most of that
churn doesn't actually indicate a change in codegen. The exception is
that this change enables the use of 0.0 for nsz, but *not* reasoc, fadd
reductions. Or said differently, it allows the neutral value of an
ordered fadd reduction to be 0.0.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
f044564d |
| 02-Sep-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Make backedge check in op of phi transform more precise (#106075)
The op of phi transform wants to prevent moving an operation across a
backedge, as this may lead to an infinite combi
[InstCombine] Make backedge check in op of phi transform more precise (#106075)
The op of phi transform wants to prevent moving an operation across a
backedge, as this may lead to an infinite combine loop.
Currently, this is done using isPotentiallyReachable(). The problem with
that is that all blocks inside a loop are reachable from each other.
This means that the op of phi transform is effectively completely
disabled for code inside loops, even when it's not actually operating on
a loop phi (just a phi that happens to be in a loop).
Fix this by explicitly computing the backedges inside the function
instead. Do this via RPOT, which is a bit more efficient than using
FindFunctionBackedges() (which does it without any pre-computed
analyses).
For irreducible cycles, there may be multiple possible choices of
backedge, and this just picks one of them. This is still sufficient to
prevent combine loops.
This also removes the last use of LoopInfo in InstCombine -- I'll drop
the analysis in a followup.
show more ...
|
#
a1058776 |
| 21-Aug-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Remove some of the complexity-based canonicalization (#91185)
The idea behind this canonicalization is that it allows us to handle less
patterns, because we know that some will be can
[InstCombine] Remove some of the complexity-based canonicalization (#91185)
The idea behind this canonicalization is that it allows us to handle less
patterns, because we know that some will be canonicalized away. This is
indeed very useful to e.g. know that constants are always on the right.
However, this is only useful if the canonicalization is actually
reliable. This is the case for constants, but not for arguments: Moving
these to the right makes it look like the "more complex" expression is
guaranteed to be on the left, but this is not actually the case in
practice. It fails as soon as you replace the argument with another
instruction.
The end result is that it looks like things correctly work in tests,
while they actually don't. We use the "thwart complexity-based
canonicalization" trick to handle this in tests, but it's often a
challenge for new contributors to get this right, and based on the
regressions this PR originally exposed, we clearly don't get this right
in many cases.
For this reason, I think that it's better to remove this complexity
canonicalization. It will make it much easier to write tests for
commuted cases and make sure that they are handled.
show more ...
|
Revision tags: llvmorg-19.1.0-rc3 |
|
#
c3c2370c |
| 06-Aug-2024 |
Nikita Popov <npopov@redhat.com> |
[Tests] Regenerate test checks (NFC)
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
9a5a8731 |
| 11-Jul-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Introduce ResumePhi VPInstruction, use to create phi for FOR. (#94760)
This patch introduces a new ResumePhi VPInstruction which creates a phi
in a leaf block of a VPlan. The first use is t
[VPlan] Introduce ResumePhi VPInstruction, use to create phi for FOR. (#94760)
This patch introduces a new ResumePhi VPInstruction which creates a phi
in a leaf block of a VPlan. The first use is to create the phi node for
fixed-order recurrence resume values in the scalar preheader.
The VPInstruction takes 2 operands: 1) the incoming value from the
middle-block and a default value to be used for all other incoming
blocks.
In follow-up changes, it will also be used to create phis for reduction
and induction resume values.
Depends on https://github.com/llvm/llvm-project/pull/92651
PR: https://github.com/llvm/llvm-project/pull/94760
show more ...
|
#
99d6c6d9 |
| 05-Jul-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.
Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.
After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.
This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.
Depends on https://github.com/llvm/llvm-project/pull/92525
PR: https://github.com/llvm/llvm-project/pull/92651
show more ...
|
#
26326800 |
| 27-Jun-2024 |
Noah Goldstein <goldstein.w.n@gmail.com> |
[InstCombine] Canonicalize `(gep <not i8> p, (div exact X, C))`
If C % sizeof(gep_element_type) is zero, we can canonicalize to `i8` via: `(gep i8 p, (div exact X, C / (sizeof(gep_element_type))
[InstCombine] Canonicalize `(gep <not i8> p, (div exact X, C))`
If C % sizeof(gep_element_type) is zero, we can canonicalize to `i8` via: `(gep i8 p, (div exact X, C / (sizeof(gep_element_type))))`
Closes #96898
show more ...
|
#
352a8361 |
| 26-Jun-2024 |
David Green <david.green@arm.com> |
[InstCombine] Canonicalize non-i8 gep of mul to i8 (#96606)
This is a small canonicalization for `gep i32, p, (mul x, C)` -> `gep
i8, p, (mul x, C*4)`, so that the mul can combine both of the const
[InstCombine] Canonicalize non-i8 gep of mul to i8 (#96606)
This is a small canonicalization for `gep i32, p, (mul x, C)` -> `gep
i8, p, (mul x, C*4)`, so that the mul can combine both of the constant
multiplications, and we take a small step towards canonicalizing more
geps to i8.
It currently doesn't attempt to check for multiple uses on the mul, but
that should be possible if it sounds better. Let me know what you think
of the idea in general.
show more ...
|
#
3808ba78 |
| 20-Jun-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle bloc
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.
Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and in between the compare
to used by the terminator). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.
PR: https://github.com/llvm/llvm-project/pull/95816
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
05e1b534 |
| 05-Jun-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model FOR resume value extraction in VPlan. (#93396)
This patch uses the ExtractFromEnd VPInstruction opcode
to extract the value of a FOR to be used as resume value for the ph in
the scal
[VPlan] Model FOR resume value extraction in VPlan. (#93396)
This patch uses the ExtractFromEnd VPInstruction opcode
to extract the value of a FOR to be used as resume value for the ph in
the scalar loop.
It adds a new live-out that temporarily wraps the FOR phi in the scalar
loop. fixFixedOrderRecurrence will process live outs for fixed order
recurrence phis by creating a new phi node in the scalar preheader,
using the generated value for the live-out as incoming value from the
middle block and the original start value as incoming value for the
other edge. Creation of the phi in the preheader, as well as updating
the phi in the scalar loop will also be moved to VPlan in the future,
eventually retiring fixFixedOrderRecurrence
Depends on https://github.com/llvm/llvm-project/pull/93395
PR: https://github.com/llvm/llvm-project/pull/93396
show more ...
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
7c0d52ca |
| 08-Feb-2024 |
Nikita Popov <npopov@redhat.com> |
[ValueTracking] Support dominating known bits condition in and/or (#74728)
This extends computeKnownBits() support for dominating conditions to
also handle and/or conditions. We'll look through eit
[ValueTracking] Support dominating known bits condition in and/or (#74728)
This extends computeKnownBits() support for dominating conditions to
also handle and/or conditions. We'll look through either and or or
depending on which edge we're considering.
This change is mainly for the sake of completeness, so we don't start
missing optimizations if SimplifyCFG decides to merge some branches.
show more ...
|