Revision tags: llvmorg-21-init |
|
#
4a2ebd66 |
| 22-Jan-2025 |
David Sherwood <david.sherwood@arm.com> |
[LV][NFC] Refactor structures used to maintain uncountable exit info (#123219)
I've removed the HasUncountableEarlyExit variable, since we can
already determine whether or not a loop has an early e
[LV][NFC] Refactor structures used to maintain uncountable exit info (#123219)
I've removed the HasUncountableEarlyExit variable, since we can
already determine whether or not a loop has an early exit by seeing
if we found an uncountable exit.
I have also deleted the old UncountableExitingBlocks and
UncountableExitBlocks lists and replaced them with a single
uncountable edge. This means we don't need to worry about keeping the
list entries in sync and makes it clear which exiting block
corresponds to which exit block.
show more ...
|
Revision tags: llvmorg-19.1.7 |
|
#
b0697dc1 |
| 09-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994)
Currently we emit early-exit related debug messages/remarks even when
there is a single exit. Update to only check isVecto
[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994)
Currently we emit early-exit related debug messages/remarks even when
there is a single exit. Update to only check isVectorizableEarlyExitLoop
if there isn't a single exit block.
PR: https://github.com/llvm/llvm-project/pull/121994
show more ...
|
#
f88ef1bd |
| 09-Jan-2025 |
Benjamin Maxwell <benjamin.maxwell@arm.com> |
[LV] Teach LoopVectorizationLegality about struct vector calls (#119221)
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.
Th
[LV] Teach LoopVectorizationLegality about struct vector calls (#119221)
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.
This initial patch only allows the case where all users of the struct
return are `extractvalue` operations that can be widened.
```
%call = tail call { float, float } @foo(float %in_val)
%extract_a = extractvalue { float, float } %call, 0
%extract_b = extractvalue { float, float } %call, 1
```
Note: The tests require the VFABI changes from #119000 to pass.
show more ...
|
#
45c01e8a |
| 19-Dec-2024 |
Finn Plummer <50529406+inbelic@users.noreply.github.com> |
[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635)
- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of
[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635)
- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of target specific intrinsics
- add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api
- update TTI api to provide `isTargetIntrinsicWith...` functions and
consistently name them
- move `isTriviallyScalarizable` to VectorUtils
- update all uses of the api and provide the TTI parameter
Resolves #117030
show more ...
|
#
c18fda02 |
| 19-Dec-2024 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Use new single string variant of reportVectorizationFailure (#120414)
|
Revision tags: llvmorg-19.1.6 |
|
#
5fae408d |
| 11-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138)
A more lightweight variant of
https://github.com/llvm/llvm-project/pull/109193,
which dispatches to multiple exit blocks via t
[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138)
A more lightweight variant of
https://github.com/llvm/llvm-project/pull/109193,
which dispatches to multiple exit blocks via the middle blocks.
The patch also introduces a bit of required scaffolding to enable
early-exit vectorization, including an option. At the moment, early-exit
vectorization doesn't come with legality checks, and is only used if the
option is provided and the loop has metadata forcing vectorization. This
is only intended to be used for testing during bring-up, with @david-arm
enabling auto early-exit vectorization plugging in the changes from
https://github.com/llvm/llvm-project/pull/88385.
PR: https://github.com/llvm/llvm-project/pull/112138
show more ...
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3 |
|
#
6ab26eab |
| 28-Oct-2024 |
Ellis Hoag <ellis.sparky.hoag@gmail.com> |
Check hasOptSize() in shouldOptimizeForSize() (#112626)
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
6f1a8c2d |
| 27-Sep-2024 |
Graham Hunter <graham.hunter@arm.com> |
[LV] Vectorize histogram operations (#99851)
This patch implements autovectorization support for the 'all-in-one'
histogram intrinsic, which seems to have more support than the
'standalone' intrin
[LV] Vectorize histogram operations (#99851)
This patch implements autovectorization support for the 'all-in-one'
histogram intrinsic, which seems to have more support than the
'standalone' intrinsic. See
https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788/
for an overview of the work and my notes on the tradeoffs between the
two approaches.
show more ...
|
#
f4eeae12 |
| 23-Sep-2024 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Address comments on PR #107004 left post-commit (#109300)
* Rename Speculative -> Uncountable and update tests.
* Add comments explaining why it's safe to ignore the predicates when
[LoopVectorize] Address comments on PR #107004 left post-commit (#109300)
* Rename Speculative -> Uncountable and update tests.
* Add comments explaining why it's safe to ignore the predicates when
building up a list of exiting blocks.
* Reshuffle some code to do (hopefully) cheaper checks first.
show more ...
|
#
02ee96ec |
| 23-Sep-2024 |
David Sherwood <david.sherwood@arm.com> |
[Analysis] Teach isDereferenceableAndAlignedInLoop about SCEV predicates (#106562)
Currently if a loop contains loads that we can prove at compile time
are dereferenceable when certain conditions a
[Analysis] Teach isDereferenceableAndAlignedInLoop about SCEV predicates (#106562)
Currently if a loop contains loads that we can prove at compile time
are dereferenceable when certain conditions are satisfied the function
isDereferenceableAndAlignedInLoop will still return false because
getSmallConstantMaxTripCount will return 0 when SCEV predicates
are required. This patch changes getSmallConstantMaxTripCount to take
an optional Predicates pointer argument so that we can permit
functions such as isDereferenceableAndAlignedInLoop to consider more
cases.
show more ...
|
#
57777a50 |
| 19-Sep-2024 |
Benjamin Kramer <benny.kra@googlemail.com> |
[LoopVectorize] Silence unused variable warning
|
#
e762d4da |
| 19-Sep-2024 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Teach LoopVectorizationLegality about more early exits (#107004)
This patch is split off from PR #88385 and concerns only the code
related to the legality of vectorising early exit
[LoopVectorize] Teach LoopVectorizationLegality about more early exits (#107004)
This patch is split off from PR #88385 and concerns only the code
related to the legality of vectorising early exit loops. It is the first
step in adding support for vectorisation of a simple class of loops that
typically involves searching for something, i.e.
for (int i = 0; i < n; i++) {
if (p[i] == val)
return i;
}
return n;
or
for (int i = 0; i < n; i++) {
if (p1[i] != p2[i])
return i;
}
return n;
In this initial commit LoopVectorizationLegality will only consider
early exit loops legal for vectorising if they follow these criteria:
1. There are no stores in the loop.
2. The loop must have only one early exit like those shown in the above
example. I have referred to such exits as speculative early exits, to
distinguish from existing support for early exits where the
exit-not-taken count is known exactly at compile time.
3. The early exit block dominates the latch block.
4. The latch block must have an exact exit count.
5. There are no loads after the early exit block.
6. The loop must not contain reductions or recurrences. I don't see
anything fundamental blocking vectorisation of such loops, but I just
haven't done the work to support them yet.
7. We must be able to prove at compile-time that loops will not contain
faulting loads.
Tests have been added here:
Transforms/LoopVectorize/AArch64/simple_early_exit.ll
show more ...
|
Revision tags: llvmorg-19.1.0 |
|
#
78e1e6ac |
| 06-Sep-2024 |
ErikHogeman <erik.hogeman@arm.com> |
[LV] Check for vector-to-scalar casts in legalizer (#106244)
The code makes assumptions later on the operations and their inputs
being scalar in the loops that are processed, so we should make sure
[LV] Check for vector-to-scalar casts in legalizer (#106244)
The code makes assumptions later on the operations and their inputs
being scalar in the loops that are processed, so we should make sure
this is the case in the legalizer.
show more ...
|
#
cd46829e |
| 04-Sep-2024 |
Madhur Amilkanthwar <madhura@nvidia.com> |
[LV] Fix emission of debug message in legality check (#101924)
Successful vectorization message is emitted even
after "Result" is false. "Result" = false indicates failure of one of
the legality c
[LV] Fix emission of debug message in legality check (#101924)
Successful vectorization message is emitted even
after "Result" is false. "Result" = false indicates failure of one of
the legality check and thus
successful message should not be printed.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3 |
|
#
0da2ba81 |
| 17-Aug-2024 |
Daniil Fukalov <dfukalov@gmail.com> |
[NFC] Cleanup in ADT and Analysis headers. (#104484)
Remove unused directly includes and forward declarations in ADT and
Analysis headers.
|
#
f0df4fbd |
| 11-Aug-2024 |
Florian Hahn <flo@fhahn.com> |
[LV] Support generating masks for switch terminators. (#99808)
Update createEdgeMask to created masks where the terminator in Src is a
switch. We need to handle 2 separate cases:
1. Dst is not t
[LV] Support generating masks for switch terminators. (#99808)
Update createEdgeMask to created masks where the terminator in Src is a
switch. We need to handle 2 separate cases:
1. Dst is not the default desintation. Dst is reached if any of the
cases with destination == Dst are taken. Join the conditions for each
case where destination == Dst using a logical OR.
2. Dst is the default destination. Dst is reached if none of the cases
with destination != Dst are taken. Join the conditions for each case
where the destination is != Dst using a logical OR and negate it.
Edge masks are created for every destination of cases and/or
default when requesting a mask where the source is a switch.
Fixes https://github.com/llvm/llvm-project/issues/48188.
PR: https://github.com/llvm/llvm-project/pull/99808
show more ...
|
Revision tags: llvmorg-19.1.0-rc2 |
|
#
edf46f36 |
| 03-Aug-2024 |
Florian Hahn <flo@fhahn.com> |
[SCEV] Use const SCEV * explicitly in more places.
Use const SCEV * explicitly in more places to prepare for https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.
|
Revision tags: llvmorg-19.1.0-rc1 |
|
#
e1a3aa8c |
| 24-Jul-2024 |
Ramkumar Ramachandra <ramkumar.ramachandra@codasip.com> |
LV/Legality: fix style after cursory reading (NFC) (#100363)
|
Revision tags: llvmorg-20-init |
|
#
5c834989 |
| 21-Jul-2024 |
Kazu Hirata <kazu@google.com> |
[Transforms] Use range-based for loops (NFC) (#99607)
|
#
22a7f6dc |
| 11-Jul-2024 |
Graham Hunter <graham.hunter@arm.com> |
Revert "[LV] Autovectorization for the all-in-one histogram intrinsic" (#98493)
Reverts llvm/llvm-project#91458 to deal with post-commit reviewer
requests.
|
#
1860fd04 |
| 11-Jul-2024 |
Graham Hunter <graham.hunter@arm.com> |
[LV] Autovectorization for the all-in-one histogram intrinsic (#91458)
This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by d
[LV] Autovectorization for the all-in-one histogram intrinsic (#91458)
This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by default, and when enabled will only vectorize if there are no other users of values in the gather-modify-scatter sequence.
show more ...
|
#
0577cdaa |
| 08-Jul-2024 |
Florian Hahn <flo@fhahn.com> |
[LV] Split checking if tail-folding is possible, collecting masked ops. (#77612)
Introduce new canFoldTail helper which only checks if tail-folding is
possible, but without modifying MaskedOps.
[LV] Split checking if tail-folding is possible, collecting masked ops. (#77612)
Introduce new canFoldTail helper which only checks if tail-folding is
possible, but without modifying MaskedOps.
Just because tail-folding is possible doesn't mean the tail will be
folded; that's up to the cost-model to decide. Separating the check if
tail-folding is possible and preparing for tail-folding makes sure that
MaskedOps is only populated when tail-folding is actually selected.
PR: https://github.com/llvm/llvm-project/pull/77612
show more ...
|
#
2d209d96 |
| 27-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it does
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
e949b54a |
| 04-Jun-2024 |
Florian Hahn <flo@fhahn.com> |
[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499)
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns
the minimum of the countable exits.
When analyzing dependences and
[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499)
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns
the minimum of the countable exits.
When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if any uncomputable exits leave the loop,
as long as we prove that there are no dependences given the minimum of
the countable exits. The same should apply also for generating runtime
checks.
Note that this shifts the responsiblity of checking whether all exit
counts are computable or handling early-exits to the users of LAA.
Depends on https://github.com/llvm/llvm-project/pull/93498
PR: https://github.com/llvm/llvm-project/pull/93499
show more ...
|
Revision tags: llvmorg-18.1.6 |
|
#
b54a78d6 |
| 04-May-2024 |
Florian Hahn <flo@fhahn.com> |
[LV,LAA] Don't vectorize loops with load and store to invar address.
Code checking stores to invariant addresses and reductions made an incorrect assumption that the case of both a load & store to t
[LV,LAA] Don't vectorize loops with load and store to invar address.
Code checking stores to invariant addresses and reductions made an incorrect assumption that the case of both a load & store to the same invariant address does not need to be handled.
In some cases when vectorizing with runtime checks, there may be dependences with a load and store to the same address, storing a reduction value.
Update LAA to separately track if there was a store-store and a load-store dependence with an invariant addresses.
Bail out early if there as a load-store dependence with invariant address. If there was a store-store one, still apply the logic checking if they all store a reduction.
show more ...
|