Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 4ad0fdd1 17-Dec-2024 Florian Hahn <flo@fhahn.com>

[VPlan] Remove reverse() of predecessors from VPInstruction::generate.

This was originally done to reduce the diff for the change. Remove it
and update the remaining tests. NFC modulo reordering of incoming
values.

Clean up after https://github.com/llvm/llvm-project/pull/114292.


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2
# 9536a628 31-Jan-2024 Florian Hahn <flo@fhahn.com>

[VPlan] Preserve original induction order when creating scalar steps.

Update createScalarIVSteps to take an insert point as parameter. This
ensures that the inserted scalar steps are in the same order as the
recipes they replace (vs in reverse order as currently). This helps to
reduce the diff for follow-up changes.


Revision tags: llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1
# f108c6cd 18-Sep-2023 Florian Hahn <flo@fhahn.com>

[VPlan] Fold (MUL A, 1) -> A as VPlan2VPlan transform.

Add first VPlan-based recipe simplification to fold (MUL A, 1) -> A.
Among other things, this enables additional simplifications after
applying versioned strides, as follow up to D147783.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D159200
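
The fold described above can be illustrated with a small toy model. This is only a sketch in Python over a hypothetical expression tree (`Const`, `Var`, `Mul` and `fold_mul_by_one` are names invented for this illustration); it does not use VPlan recipes or any LLVM API, and it only handles the constant on the right-hand side, as in the commit title.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Mul:
    lhs: "Expr"
    rhs: "Expr"


# Hypothetical expression type for this sketch (not an LLVM type).
Expr = Union[Const, Var, Mul]


def fold_mul_by_one(expr: Expr) -> Expr:
    """Recursively rewrite (MUL A, 1) to A; leave everything else alone."""
    if not isinstance(expr, Mul):
        return expr
    lhs = fold_mul_by_one(expr.lhs)
    rhs = fold_mul_by_one(expr.rhs)
    if rhs == Const(1):
        return lhs
    return Mul(lhs, rhs)


# Once a versioned stride has been replaced by the constant 1, an
# expression such as index * stride becomes index * 1 and can be folded.
index_times_stride = Mul(Var("index"), Const(1))
folded = fold_mul_by_one(index_times_stride)
```

In this model, `folded` is simply `Var("index")`, which is the kind of follow-on simplification the commit message refers to after versioned strides become constants.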


Revision tags: llvmorg-17.0.0, llvmorg-17.0.0-rc4
# 96e83d37 29-Aug-2023 Florian Hahn <flo@fhahn.com>

[LV] Use IRBuilder to create and optimize middle-block compare.

Split off from D150398 to avoid builder-related diff changes there.
Using IRBuilder to create ICmps simplifies the result if both operands
are constants.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D158332


Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# d2090847 13-Jun-2023 Florian Hahn <flo@fhahn.com>

[VPlan] Replace versioned stride with constant during VPlan opts.

After constructing the initial VPlan, replace VPValues for versioned
strides with their constant counterparts.

Differential Revision: https://reviews.llvm.org/D147783


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 78ae870f 31-Mar-2023 Philip Reames <preames@rivosinc.com>

[tests] Rerun autogen to reduce a diff [nfc]


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 5b400150 14-Dec-2022 Nikita Popov <npopov@redhat.com>

[LoopVectorize] Convert some tests to opaque pointers (NFC)

For these tests update_test_checks.py had to be rerun.


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3
# 2e14900d 28-Apr-2022 Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>

[test][NewPM] Use -passes=loop-vectorize instead of -loop-vectorize

Update a bunch of loop-vectorize regression tests to use the new PM
syntax (opt -passes=loop-vectorize) instead of the deprecated legacy
PM syntax (opt -loop-vectorize).


Revision tags: llvmorg-14.0.2, llvmorg-14.0.1
# 872f7000 03-Apr-2022 Dávid Bolvanský <david.bolvansky@gmail.com>

Revert "[NFCI] Regenerate SROA/LoopVectorize test checks"

This reverts commit 14e3450fb57305aa9ff3e9e60687b458e43835c9.


# a113a582 03-Apr-2022 Dávid Bolvanský <david.bolvansky@gmail.com>

[NFCI] Regenerate LoopVectorize test checks


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# b3e8ace1 28-Feb-2022 Florian Hahn <flo@fhahn.com>

Recommit "[VPlan] Introduce recipe to build scalar steps."

This reverts the revert commit ff93260bf6bddfbad1fa65c4d5184988885b900f.

The underlying issue causing the PPC bot failures has been fixed in
cbaac1473403 and a corresponding test case has been added in
ad2cad1c521c.

Original message:

This patch adds a new VPScalarIVStepsRecipe to handle building scalar
steps.

In the first patch, it only handles the case where there is no vector
induction variable needed.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D115953


# ff93260b 27-Feb-2022 Florian Hahn <flo@fhahn.com>

Revert "[VPlan] Introduce recipe to build scalar steps."

This reverts commit 49b23f451cf713036c99573a35daed308d2ac894.

This appears to break some PPC build bots. Revert while I investigate.


# 49b23f45 27-Feb-2022 Florian Hahn <flo@fhahn.com>

[VPlan] Introduce recipe to build scalar steps.

This patch adds a new VPScalarIVStepsRecipe to handle building scalar
steps.

In the first patch, it only handles the case where there is no vector
induction variable needed.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D115953


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 86d113a8 06-Jan-2022 Florian Hahn <flo@fhahn.com>

[SCEVExpand] Do not create redundant 'or false' for pred expansion.

This patch updates SCEVExpander::expandUnionPredicate to not create
redundant 'or false, x' instructions. While those are trivially
foldable, they can be easily avoided and hinder code that checks the
size/cost of the generated checks before further folds.

I am planning to look into a few other similar improvements to code
generated by SCEVExpander.

I remember @lebedev.ri working a while ago on doing some trivial folds
like that in IRBuilder itself, but there were concerns that such
changes may subtly break existing code.

Reviewed By: reames, lebedev.ri

Differential Revision: https://reviews.llvm.org/D116696
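
To make the size/cost point above concrete, here is a minimal sketch in Python that models the expansion as plain strings. The function names (`expand_union_naive`, `expand_union`, `count_ors`) and the check names are hypothetical and chosen for illustration; this is not the code in `SCEVExpander::expandUnionPredicate`.

```python
from typing import List, Optional


def expand_union_naive(checks: List[str]) -> str:
    """Seed the chain with 'false', producing a redundant or(false, x)."""
    expr = "false"
    for check in checks:
        expr = f"or({expr}, {check})"
    return expr


def expand_union(checks: List[str]) -> str:
    """Start the chain from the first check, so no or(false, x) appears."""
    expr: Optional[str] = None
    for check in checks:
        expr = check if expr is None else f"or({expr}, {check})"
    return "false" if expr is None else expr


def count_ors(expr: str) -> int:
    """Count 'or' operations, standing in for a size/cost estimate."""
    return expr.count("or(")


checks = ["check_a", "check_b"]
naive_cost = count_ors(expand_union_naive(checks))
reduced_cost = count_ors(expand_union(checks))
```

With two checks, the naive form contains one more `or` than the reduced form. Any code that measures the generated checks before later folds run would see that extra instruction, which is the concern the commit message raises.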


Revision tags: llvmorg-13.0.1-rc1
# c45045bf 28-Oct-2021 Florian Hahn <flo@fhahn.com>

[VPlan] Keep induction recipes in header.

This patch updates recipe creation to ensure all
VPWidenIntOrFpInductionRecipes are in the header block. At the moment,
new induction recipes can be created in different blocks when trying to
optimize casts and induction variables.

Having all induction recipes in the header makes it easier to
analyze/transform them in VPlan.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D111300


# b2915971 27-Oct-2021 Roman Lebedev <lebedev.ri@gmail.com>

Revert rest of `IRBuilderBase`'s short-circuiting folds

Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.

Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.

This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88.
This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b.
This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2.
This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.


# 156f10c8 27-Oct-2021 Roman Lebedev <lebedev.ri@gmail.com>

[IR] `SCEVExpander::generateOverflowCheck()`: short-circuit `umul_with_overflow`-by-one

It's a no-op, no overflow happens ever: https://alive2.llvm.org/ce/z/Zw89rZ

While generally I don't like such hacks,
we have a very good reason to do this: here we are expanding
a run-time correctness check for the vectorization,
and said `umul_with_overflow` will not be optimized out
before we query the cost of the checks we've generated.

Which means, the cost of run-time checks would be artificially inflated,
and after https://reviews.llvm.org/D109368 that will affect
the minimal trip count for which these checks are even evaluated.
And if they aren't even evaluated, then the vectorized code
certainly won't be run.

We could consider doing this in IRBuilder, but then we'd need to
also teach `CreateExtractValue()` to look into chains of `insertvalue`s,
and I'm not sure there's precedent for that.

Refs. https://reviews.llvm.org/D109368#3089809
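
The no-op claim above can be checked exhaustively for a small bit width. The sketch below is a Python model of an unsigned multiply-with-overflow on a fixed-width integer; the helper name and the choice of 8 bits are assumptions made for illustration, not the LLVM intrinsic itself.

```python
from typing import Tuple


def umul_with_overflow(a: int, b: int, bits: int = 8) -> Tuple[int, bool]:
    """Model an unsigned multiply: return (wrapped result, overflow flag)."""
    mask = (1 << bits) - 1
    full = a * b
    return full & mask, full > mask


# Multiplying by one returns the operand unchanged and never overflows,
# for every 8-bit unsigned value.
by_one_is_noop = all(
    umul_with_overflow(x, 1) == (x, False) for x in range(1 << 8)
)
```

Since the result is always `(x, False)`, the multiply and its overflow bit can be skipped when the multiplier is one, so they do not inflate the cost estimate of the runtime checks.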


# f3df87d5 27-Oct-2021 Roman Lebedev <lebedev.ri@gmail.com>

[IR] `IRBuilderBase::CreateOr()`: fix short-circuiting for constant on LHS

There is no guarantee that the constant is on RHS here,
we have to handle both cases.

Refs. https://reviews.llvm.org/D109368#3089809


# ab1dbcec 27-Oct-2021 Roman Lebedev <lebedev.ri@gmail.com>

[IR] `IRBuilderBase::CreateSelect()`: if cond is a constant i1, short-circuit

While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.

There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.

Refs. https://reviews.llvm.org/D109368#3089809


# 5a8a7b3b 27-Oct-2021 Roman Lebedev <lebedev.ri@gmail.com>

[NFC] Re-autogenerate check lines in some tests to ease future updates


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 23c2f2e6 07-Jun-2021 Florian Hahn <flo@fhahn.com>

[LV] Mark increment of main vector loop induction variable as NUW.

This patch marks the induction increment of the main induction variable
of the vector loop as NUW when not folding the tail.

If the tail is not folded, we know that End - Start >= Step (either
statically or through the minimum iteration checks). We also know that both
Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV +
%Step == %End. Hence we must exit the loop before %IV + %Step unsigned
overflows and we can mark the induction increment as NUW.

This should make SCEV return more precise bounds for the created vector
loops, used by later optimizations, like late unrolling.

At the moment quite a few tests still need to be updated, but before
doing so I'd like to get initial feedback to make sure I am not missing
anything.

Note that this could probably be further improved by using information
from the original IV.

Attempt at modeling the assumption in Alive2:
https://alive2.llvm.org/ce/z/H_DL_g

Part of a set of fixes required for PR50412.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D103255
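
The NUW argument above can be sanity-checked by brute force on a small bit width. The following Python sketch enumerates every Start, End and Step that satisfies the stated preconditions (Start % Step == 0, End % Step == 0, End - Start >= Step) and simulates the loop with the exit test %IV + %Step == %End. The function name and the bit widths are assumptions for illustration; this is not the Alive2 model linked in the commit message.

```python
def increments_never_wrap(bits: int) -> bool:
    """Return True if IV + Step never exceeds the unsigned range."""
    limit = 1 << bits  # first value that no longer fits in `bits` bits
    for step in range(1, limit):
        # Start is a multiple of Step.
        for start in range(0, limit, step):
            # End is a larger multiple of Step, so End - Start >= Step.
            for end in range(start + step, limit, step):
                iv = start
                while True:
                    if iv + step >= limit:
                        return False  # the increment would wrap
                    if iv + step == end:
                        break  # the vector loop exits here
                    iv += step
    return True


holds_for_4_bits = increments_never_wrap(4)
```

The check succeeds because every increment is bounded by End, which itself fits in the unsigned range. That is the reason the increment can carry the NUW flag when the tail is not folded.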


Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 9529597c 12-May-2020 Sjoerd Meijer <sjoerd.meijer@arm.com>

Recommit #2: "[LV] Induction Variable does not remain scalar under tail-folding."

This was reverted because of a miscompilation. On closer inspection, the
problem was also visible in a changed llvm regression test. This one-line
follow-up fix/recommit splats the IV when tail-folding is requested, even if
all users are scalar instructions after vectorisation. Splatting is what we
generally try to avoid when unnecessary, but with tail-folding the splat IV is
used by the predicate of the masked load/store instructions. The previous
version omitted this, which caused the miscompilation. The original commit
message was:

If tail-folding of the scalar remainder loop is applied, the primary induction
variable is splat to a vector and used by the masked load/store vector
instructions, thus the IV does not remain scalar. Because we now mark
that the IV does not remain scalar for these cases, we don't emit the vector IV
if it is not used. Thus, the vectoriser produces less dead code.

Thanks to Ayal Zaks for the direction on how to fix this.


# 96c63f54 10-May-2020 Florian Hahn <flo@fhahn.com>

Recommit "[LAA] Remove one addRuntimeChecks function (NFC)."

The failing assertion has been fixed and the problematic test case has
been added.

This reverts the revert commit fc44617f28847417e55836193bbe8e9c3f09eca9.
