History log of /llvm-project/llvm/test/Transforms/LoopVectorize/reduction-inloop-pred.ll (Results 1 – 25 of 29)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5
# 82821254 28-Nov-2024 Florian Hahn <flo@fhahn.com>

[LV] Use IVUpdateMayOverflow to set HasNUW. (#111758)

If IVUpdateMayOverflow is false, we proved that the induction increment
cannot overflow in the vector loop. This allows setting NUW in some
ca

[LV] Use IVUpdateMayOverflow to set HasNUW. (#111758)

If IVUpdateMayOverflow is false, we proved that the induction increment
cannot overflow in the vector loop. This allows setting NUW in some
cases when folding the tail.

PR: https://github.com/llvm/llvm-project/pull/111758

show more ...


Revision tags: llvmorg-19.1.4
# 38fffa63 06-Nov-2024 Paul Walker <paul.walker@arm.com>

[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0
# 2c7786e9 03-Sep-2024 Philip Reames <preames@rivosinc.com>

Prefer use of 0.0 over -0.0 for fadd reductions w/nsz (in IR) (#106770)

This is a follow up to 924907bc6, and is mostly motivated by consistency
but does include one additional optimization. In gen

Prefer use of 0.0 over -0.0 for fadd reductions w/nsz (in IR) (#106770)

This is a follow up to 924907bc6, and is mostly motivated by consistency
but does include one additional optimization. In general, we prefer 0.0
over -0.0 as the identity value for an fadd. We use that value in
several places, but don't in others. So, let's be consistent and use the
same identity (when nsz allows) everywhere.

This creates a bunch of test churn, but due to 924907bc6, most of that
churn doesn't actually indicate a change in codegen. The exception is
that this change enables the use of 0.0 for nsz, but *not* reasoc, fadd
reductions. Or said differently, it allows the neutral value of an
ordered fadd reduction to be 0.0.

show more ...


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2
# 09eb9f11 19-Mar-2024 Michele Scandale <michele.scandale@gmail.com>

[InstCombine] Fix for folding `select` into floating point binary operators. (#83200)

Folding a `select` into a floating point binary operators can only be
done if the result is preserved for both

[InstCombine] Fix for folding `select` into floating point binary operators. (#83200)

Folding a `select` into a floating point binary operators can only be
done if the result is preserved for both case. In particular, if the
other operand of the `select` can be a NaN, then the transformation
won't preserve the result value.

show more ...


Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 51afb101 09-Jan-2024 Florian Hahn <flo@fhahn.com>

[LV] Create block in mask up-front if needed. (#76635)

At the moment, block and edge masks are created on demand, which means
that they are inserted at the point where they are demanded and then
c

[LV] Create block in mask up-front if needed. (#76635)

At the moment, block and edge masks are created on demand, which means
that they are inserted at the point where they are demanded and then
cached. It is possible that the mask for a block is looked up later at a
point that's not dominated by the point where the mask has been
inserted.

To avoid this, create masks up front on entry to the corresponding basic
block and leave it to VPlan simplification to remove unneeded masks.

Note that we need to create masks for all blocks, if any of the blocks
in the loop needs predication, as computing the mask of a block depends
on the masks of its predecessor.

Needed for #76090.

https://github.com/llvm/llvm-project/pull/76635

show more ...


Revision tags: llvmorg-17.0.6
# 03d4a9d9 27-Nov-2023 Craig Topper <craig.topper@sifive.com>

[InstCombine] Set disjoint flag when turning Add into Or. (#72702)

The disjoint flag was recently added to IR in #72583


Revision tags: llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 7d757725 14-Dec-2022 Nikita Popov <npopov@redhat.com>

[LoopVectorize] Convert some tests to opaque pointers (NFC)


# 1e08a08a 07-Dec-2022 Roman Lebedev <lebedev.ri@gmail.com>

[NFC] Port all LoopVectorize tests to `-passes=` syntax


# be51fa45 05-Dec-2022 Roman Lebedev <lebedev.ri@gmail.com>

[NFC] Port all runlines for LoopVectorize pass tests to -passes syntax


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 9df0b254 23-Jul-2022 Nuno Lopes <nuno.lopes@tecnico.ulisboa.pt>

[NFC] Switch a few uses of undef to poison as placeholders for unreachable code


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# a266af72 14-Feb-2022 Nikita Popov <npopov@redhat.com>

[InstCombine] Canonicalize SPF to min/max intrinsics

Now that integer min/max intrinsics have good support in both
InstCombine and other passes, start canonicalizing SPF min/max
to intrinsic min/max

[InstCombine] Canonicalize SPF to min/max intrinsics

Now that integer min/max intrinsics have good support in both
InstCombine and other passes, start canonicalizing SPF min/max
to intrinsic min/max.

Once this sticks, we can stop matching SPF min/max in various
places, and can remove hacks we have for preventing infinite loops
and breaking of SPF canonicalization.

Differential Revision: https://reviews.llvm.org/D98152

show more ...


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# e6ad9ef4 14-Dec-2021 Philip Reames <listmail@philipreames.com>

[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement

The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transform

[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement

The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transforms are inconsistent about which types we end up with based on visit order.

I'm restricting this to constants as for non-constants, we'd have to decide whether the simplicity was worth extra instructions. For constants, there are no extra instructions.

We chose the canonical type as i64 arbitrarily. We might consider changing this to something else in the future if we have cause.

Differential Revision: https://reviews.llvm.org/D115387

show more ...


Revision tags: llvmorg-13.0.1-rc1
# 9cd7c534 22-Nov-2021 Huihui Zhang <huihuiz@quicinc.com>

[InstCombine] Enable fold select into operand for FAdd, FMul, FSub and FDiv.

For FAdd, FMul, FSub and FDiv, fold select into one of the operands to enable
further optimizations, i.e., floating-poin

[InstCombine] Enable fold select into operand for FAdd, FMul, FSub and FDiv.

For FAdd, FMul, FSub and FDiv, fold select into one of the operands to enable
further optimizations, i.e., floating-point reduction detection.

Turn code:
%C = fadd %A, %B
%D = select %cond, %C, %A

into:
%C = select %cond, %B, -0.000000e+00
%D = fadd %A, %C

Alive2 verification (with --disable-undef-input), timed out otherwise.
FAdd - https://alive2.llvm.org/ce/z/eUxN4Y
FMul - https://alive2.llvm.org/ce/z/5SWZz4
FSub - https://alive2.llvm.org/ce/z/Dhj8dU
FDiv - https://alive2.llvm.org/ce/z/Yj_NA2

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D113442

show more ...


# dcb8222d 26-Oct-2021 Rosie Sumpter <rosie.sumpter@arm.com>

[LoopVectorize] Propagate fast-math flags for inloop reductions

This patch updates VPReductionRecipe::execute so that the fast-math
flags associated with the underlying instruction of the VPRecipe a

[LoopVectorize] Propagate fast-math flags for inloop reductions

This patch updates VPReductionRecipe::execute so that the fast-math
flags associated with the underlying instruction of the VPRecipe are
propagated through to the reductions which are created.

Differential Revision: https://reviews.llvm.org/D112548

show more ...


# c45045bf 28-Oct-2021 Florian Hahn <flo@fhahn.com>

[VPlan] Keep induction recipes in header.

This patch updates recipe creation to ensure all
VPWidenIntOrFpInductionRecipes are in the header block. At the moment,
new induction recipes can be created

[VPlan] Keep induction recipes in header.

This patch updates recipe creation to ensure all
VPWidenIntOrFpInductionRecipes are in the header block. At the moment,
new induction recipes can be created in different blocks when trying to
optimize casts and induction variables.

Having all induction recipes in the header makes it easier to
analyze/transform them in VPlan.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D111300

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1
# 9d355949 30-Jul-2021 Kerry McLaughlin <kerry.mclaughlin@arm.com>

Reland "[LV] Use lookThroughAnd with logical reductions"

If a reduction Phi has a single user which `AND`s the Phi with a type mask,
`lookThroughAnd` will return the user of the Phi and the narrower

Reland "[LV] Use lookThroughAnd with logical reductions"

If a reduction Phi has a single user which `AND`s the Phi with a type mask,
`lookThroughAnd` will return the user of the Phi and the narrower type represented
by the mask. Currently this is only used for arithmetic reductions, whereas loops
containing logical reductions will create a reduction intrinsic using the widened
type, for example:

for.body:
%phi = phi i32 [ %and, %for.body ], [ 255, %entry ]
%mask = and i32 %phi, 255
%gep = getelementptr inbounds i8, i8* %ptr, i32 %iv
%load = load i8, i8* %gep
%ext = zext i8 %load to i32
%and = and i32 %mask, %ext
...

^ this will generate an and reduction intrinsic such as the following:
call i32 @llvm.vector.reduce.and.v8i32(<8 x i32>...)

The same example for an add instruction would create an intrinsic of type i8:
call i8 @llvm.vector.reduce.add.v8i8(<8 x i8>...)

This patch changes AddReductionVar to call lookThroughAnd for other integer
reductions, allowing loops similar to the example above with reductions such
as and, or & xor to vectorize.

Reviewed By: david-arm, dmgreen

Differential Revision: https://reviews.llvm.org/D105632

show more ...


Revision tags: llvmorg-14-init
# be753b20 21-Jul-2021 Kerry McLaughlin <kerry.mclaughlin@arm.com>

Revert "[LV] Use lookThroughAnd with logical reductions"

Reverting patch due to buildbot failures.

This reverts commit e22a59967251294ccdac6b43a06f48c1b7075240.


# e22a5996 19-Jul-2021 Kerry McLaughlin <kerry.mclaughlin@arm.com>

[LV] Use lookThroughAnd with logical reductions

If a reduction Phi has a single user which `AND`s the Phi with a type mask,
`lookThroughAnd` will return the user of the Phi and the narrower type rep

[LV] Use lookThroughAnd with logical reductions

If a reduction Phi has a single user which `AND`s the Phi with a type mask,
`lookThroughAnd` will return the user of the Phi and the narrower type represented
by the mask. Currently this is only used for arithmetic reductions, whereas loops
containing logical reductions will create a reduction intrinsic using the widened
type, for example:

for.body:
%phi = phi i32 [ %and, %for.body ], [ 255, %entry ]
%mask = and i32 %phi, 255
%gep = getelementptr inbounds i8, i8* %ptr, i32 %iv
%load = load i8, i8* %gep
%ext = zext i8 %load to i32
%and = and i32 %mask, %ext
...

^ this will generate an and reduction intrinsic such as the following:
call i32 @llvm.vector.reduce.and.v8i32(<8 x i32>...)

The same example for an add instruction would create an intrinsic of type i8:
call i8 @llvm.vector.reduce.add.v8i8(<8 x i8>...)

This patch changes AddReductionVar to call lookThroughAnd for other integer
reductions, allowing loops similar to the example above with reductions such
as and, or & xor to vectorize.

Reviewed By: david-arm, dmgreen

Differential Revision: https://reviews.llvm.org/D105632

show more ...


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4
# 80aa7e14 28-Jun-2021 Florian Hahn <flo@fhahn.com>

[VPlan] Merge predicated-triangle regions, after sinking.

Sinking scalar operands into predicated-triangle regions may allow
merging regions. This patch adds a VPlan-to-VPlan transform that tries
to

[VPlan] Merge predicated-triangle regions, after sinking.

Sinking scalar operands into predicated-triangle regions may allow
merging regions. This patch adds a VPlan-to-VPlan transform that tries
to merge predicate-triangle regions after sinking.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D100260

show more ...


Revision tags: llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 23c2f2e6 07-Jun-2021 Florian Hahn <flo@fhahn.com>

[LV] Mark increment of main vector loop induction variable as NUW.

This patch marks the induction increment of the main induction variable
of the vector loop as NUW when not folding the tail.

If th

[LV] Mark increment of main vector loop induction variable as NUW.

This patch marks the induction increment of the main induction variable
of the vector loop as NUW when not folding the tail.

If the tail is not folded, we know that End - Start >= Step (either
statically or through the minimum iteration checks). We also know that both
Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV +
%Step == %End. Hence we must exit the loop before %IV + %Step unsigned
overflows and we can mark the induction increment as NUW.

This should make SCEV return more precise bounds for the created vector
loops, used by later optimizations, like late unrolling.

At the moment quite a few tests still need to be updated, but before
doing so I'd like to get initial feedback to make sure I am not missing
anything.

Note that this could probably be further improved by using information
from the original IV.

Attempt of modeling of the assumption in Alive2:
https://alive2.llvm.org/ce/z/H_DL_g

Part of a set of fixes required for PR50412.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D103255

show more ...


Revision tags: llvmorg-12.0.1-rc1
# 8a156d1c 02-May-2021 Juneyoung Lee <aqjune@gmail.com>

[InstCombine] Fully disable select to and/or i1 folding

This is a patch that disables the poison-unsafe select -> and/or i1 folding.

It has been blocking D72396 and also has been the source of a fe

[InstCombine] Fully disable select to and/or i1 folding

This is a patch that disables the poison-unsafe select -> and/or i1 folding.

It has been blocking D72396 and also has been the source of a few miscompilations
described in llvm.org/pr49688 .
D99674 conditionally blocked this folding and successfully fixed the latter one.
The former one was still blocked, and this patch addresses it.

Note that a few test functions that has `_logical` suffix are now deoptimized.
These are created by @nikic to check the impact of disabling this optimization
by copying existing original functions and replacing and/or with select.

I can see that most of these are poison-unsafe; they can be revived by introducing
freeze instruction. I left comments at fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll)
because I think they are good targets for freeze fix.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101191

show more ...


# e639bcce 02-May-2021 Juneyoung Lee <aqjune@gmail.com>

run update_test_checks.py for the tests in D101191 (NFC)

This is an NFC that reruns update_test_checks.py on the tests that are
going to be updated in D101191.


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# ed253ef7 09-Feb-2021 Juneyoung Lee <aqjune@gmail.com>

[LoopVectorize] Fix VPRecipeBuilder::createEdgeMask to correctly generate the mask

This patch fixes pr48832 by correctly generating the mask when a poison value is involved.

Consider this CFG (whic

[LoopVectorize] Fix VPRecipeBuilder::createEdgeMask to correctly generate the mask

This patch fixes pr48832 by correctly generating the mask when a poison value is involved.

Consider this CFG (which is a part of the input):

```
for.body: ; preds = %for.cond
br i1 true, label %cond.false, label %land.rhs

land.rhs: ; preds = %for.body
br i1 poison, label %cond.end, label %cond.false

cond.false: ; preds = %for.body, %land.rhs
br label %cond.end

cond.end: ; preds = %land.rhs, %cond.false
%cond = phi i32 [ 0, %cond.false ], [ 1, %land.rhs ]

```

The path for.body -> land.rhs -> cond.end should be taken when 'select i1 false, i1 poison, i1 false' holds (which means it's never taken); but VPRecipeBuilder::createEdgeMask was emitting 'and i1 false, poison' instead.
The former one successfully blocks poison propagation whereas the latter one doesn't, making the condition poison and thus causing the miscompilation.

SimplifyCFG has a similar bug (which didn't expose a real-world bug yet), and a patch for this is also ongoing (see https://reviews.llvm.org/D95026).

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D95217

show more ...


# 79b1b4a5 12-Feb-2021 Sanjay Patel <spatel@rotateright.com>

[Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics

The vector reduction intrinsics started life as experimental ops, so backend support
was lacking. As part of promot

[Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics

The vector reduction intrinsics started life as experimental ops, so backend support
was lacking. As part of promoting them to 1st-class intrinsics, however, codegen
support was added/improved:
D58015
D90247

So I think it is safe to now remove this complication from IR.

Note that we still have an IR-level codegen expansion pass for these as discussed
in D95690. Removing that is another step in simplifying the logic. Also note that
x86 was already unconditionally forming reductions in IR, so there should be no
difference for x86.

I spot checked a couple of the tests here by running them through opt+llc and did
not see any asm diffs.

If we do find functional differences for other targets, it should be possible
to (at least temporarily) restore the shuffle IR with the ExpandReductions IR
pass.

Differential Revision: https://reviews.llvm.org/D96552

show more ...


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 4a8e6ed2 05-Jan-2021 Juneyoung Lee <aqjune@gmail.com>

[SLP,LV] Use poison constant vector for shufflevector/initial insertelement

This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef.
This is a part o

[SLP,LV] Use poison constant vector for shufflevector/initial insertelement

This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef.
This is a part of efforts for using poison vector instead of undef to represent "doesn't care" vector.
The goal is to make nice shufflevector optimizations valid that is currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185 ).

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94061

show more ...


12