#
c836b895 |
| 29-Jan-2025 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize][NFC] Disable output for tests that don't need it (#124747)
There are a lot of tests that do not depend upon the IR output
for validation, relying instead on the debug output. For these
tests we can add the -disable-output command line argument.
|
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
f69ac9a2 |
| 21-Dec-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Support widened induction variables in epilogue vectorization.
Code generation now uses the start VPValue of induction recipes.
This makes it possible to adjust the start value of the epilogue vector loop to use the 'resume' value of the main vector loop.
Fixes #59459.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D92132
|
#
7d757725 |
| 14-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[LoopVectorize] Convert some tests to opaque pointers (NFC)
|
#
1e08a08a |
| 07-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC] Port all LoopVectorize tests to `-passes=` syntax
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
#
2f80ea7f |
| 07-Mar-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC][LV] Use different braces in debug output
The analysis passes output function name encapsulated in `'` braces, but LV uses `"`. Harmonizing this may help in creating an update script for the LV costmodel test checks.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D121105
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
#
8082ab2f |
| 24-Jan-2022 |
Kerry McLaughlin <kerry.mclaughlin@arm.com> |
[LoopVectorize] Support epilogue vectorisation of loops with reductions
isCandidateForEpilogueVectorization will currently return false for loops which contain reductions. This patch removes this restriction and makes the following changes to support epilogue vectorisation with reductions:
- `fixReduction`: If fixReduction is being called during vectorisation of the epilogue, the phi node it creates will need to additionally carry incoming values from the middle block of the main loop.
- `createEpilogueVectorizedLoopSkeleton`: The incoming values of the phi created by fixReduction are updated after the vec.epilog.iter.check block is added. The phi is also moved to the preheader of the epilogue.
- `processLoop`: The start values of any VPReductionPHIRecipes are updated before vectorising the epilogue loop. The getResumeInstr function added to the ILV returns the resume instruction associated with the recurrence descriptor.
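The reduction handling described above can be pictured with a scalar model. The following is only an illustrative sketch, not LLVM code: the function name `sumWithVectorEpilogue` and the VFs of 8 and 4 are assumptions chosen for the example. It shows a sum reduction whose epilogue accumulator starts from the value reduced at the end of the main vector loop (the 'resume' value), rather than from zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a sum reduction with a main vector loop (VF 8),
// an epilogue vector loop (VF 4) and a scalar remainder loop.
long sumWithVectorEpilogue(const std::vector<int> &A) {
  constexpr std::size_t MainVF = 8, EpilogueVF = 4; // assumed VFs
  const std::size_t N = A.size();
  std::size_t I = 0;

  // Main vector loop: one partial sum per lane.
  long MainAcc[MainVF] = {0};
  for (; I + MainVF <= N; I += MainVF)
    for (std::size_t L = 0; L < MainVF; ++L)
      MainAcc[L] += A[I + L];

  // Models the middle block of the main loop: reduce the lanes to a scalar.
  long Resume = 0;
  for (long V : MainAcc)
    Resume += V;

  // Epilogue vector loop: the accumulator starts from the resume value
  // (placed in lane 0; the other lanes start at the identity, 0).
  long EpiAcc[EpilogueVF] = {0};
  EpiAcc[0] = Resume;
  for (; I + EpilogueVF <= N; I += EpilogueVF)
    for (std::size_t L = 0; L < EpilogueVF; ++L)
      EpiAcc[L] += A[I + L];

  // Reduce the epilogue lanes, then finish the leftover iterations.
  long Sum = 0;
  for (long V : EpiAcc)
    Sum += V;
  for (; I < N; ++I)
    Sum += A[I];
  return Sum;
}
```

If the epilogue accumulator started from zero instead, the contribution of the main loop would be lost, which is why the reduction phi has to carry an incoming value from the main loop's middle block.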
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D116928
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
#
1e7efd39 |
| 16-Nov-2020 |
Cullen Rhodes <cullen.rhodes@arm.com> |
[LV] Legalize scalable VF hints
In the following loop:
  void foo(int *a, int *b, int N) {
    for (int i=0; i<N; ++i)
      a[i + 4] = a[i] + b[i];
  }
The loop dependence constrains the VF to a maximum of (4, fixed), which would mean using <4 x i32> as the vector type in vectorization. Extending this to scalable vectorization, a VF of (4, scalable) implies a vector type of <vscale x 4 x i32>. To determine if this is legal vscale must be taken into account. For this example, unless max(vscale)=1, it's unsafe to vectorize.
For SVE, the number of bits in an SVE register is architecturally defined to be a multiple of 128 bits with a maximum of 2048 bits, thus the maximum vscale is 16. In the loop above it is therefore unfeasible to vectorize with SVE. However, in this loop:
  void foo(int *a, int *b, int N) {
    #pragma clang loop vectorize_width(X, scalable)
    for (int i=0; i<N; ++i)
      a[i + 32] = a[i] + b[i];
  }
As long as max(vscale) multiplied by the number of lanes 'X' doesn't exceed the dependence distance, it is safe to vectorize. For SVE a VF of (2, scalable) is within this constraint, since a vector of <16 x 2 x 32> will have no dependencies between lanes. For any number of lanes larger than this it would be unsafe to vectorize.
This patch extends 'computeFeasibleMaxVF' to legalize scalable VFs specified as loop hints, implementing the following behaviour:
  * If the backend does not support scalable vectors, ignore the hint.
  * If scalable vectorization is unfeasible given the loop dependence, like in the first example above for SVE, then use a fixed VF.
  * Accept scalable VFs if it's safe to do so.
  * Otherwise, clamp scalable VFs that exceed the maximum safe VF.
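The four rules above can be sketched as a small standalone function. This is a simplified model, not the actual 'computeFeasibleMaxVF' implementation: the names `legalizeScalableHint`, `LegalizedVF` and `HintResult` are hypothetical, and the choice of fixed fallback width (the hint clamped to the maximum safe element count) is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type: what happens to a (HintLanes, scalable) hint.
enum class HintResult { Ignored, Fixed, Scalable };

struct LegalizedVF {
  HintResult Kind;
  unsigned Lanes; // known-minimum lane count; 0 when the hint is ignored
};

// MaxSafeElements: the largest number of elements the loop dependence allows
// in one vector (4 and 32 in the two loops above).
// MaxVScale: the largest vscale the target can have (16 for SVE).
LegalizedVF legalizeScalableHint(unsigned HintLanes, unsigned MaxSafeElements,
                                 unsigned MaxVScale,
                                 bool TargetSupportsScalable) {
  assert(HintLanes > 0 && MaxSafeElements > 0 && MaxVScale > 0);

  // Rule 1: no scalable vector support, so the hint is ignored.
  if (!TargetSupportsScalable)
    return {HintResult::Ignored, 0};

  // Largest X such that X * max(vscale) stays within the safe element count.
  const unsigned MaxSafeScalableLanes = MaxSafeElements / MaxVScale;

  // Rule 2: even <vscale x 1 x T> could be unsafe, so fall back to a fixed VF.
  // (Clamping the hint to MaxSafeElements is an assumption of this sketch.)
  if (MaxSafeScalableLanes == 0)
    return {HintResult::Fixed, std::min(HintLanes, MaxSafeElements)};

  // Rule 3: the hint is safe as given.
  if (HintLanes <= MaxSafeScalableLanes)
    return {HintResult::Scalable, HintLanes};

  // Rule 4: clamp a scalable hint that exceeds the maximum safe VF.
  return {HintResult::Scalable, MaxSafeScalableLanes};
}
```

For the second loop above on SVE, `legalizeScalableHint(2, 32, 16, true)` accepts the hint as (2, scalable), while a hint of 8 lanes would be clamped to 2; for the first loop, `legalizeScalableHint(4, 4, 16, true)` falls back to a fixed VF of 4.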
Reviewed By: sdesmalen, fhahn, david-arm
Differential Revision: https://reviews.llvm.org/D91718
|
#
1fd3a047 |
| 09-Dec-2020 |
Cullen Rhodes <cullen.rhodes@arm.com> |
[LV] Disable epilogue vectorization for scalable VFs
Epilogue vectorization doesn't support scalable vectorization factors yet, so disable it for now.
Reviewed By: sdesmalen, bmahjour
Differential Revision: https://reviews.llvm.org/D93063
|
#
a8034fc1 |
| 02-Dec-2020 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[LoopVectorize] Fix optimal-epilog-vectorization-limitations.ll test on non-debug build bots
Add "REQUIRES: asserts" as the test uses the "--debug-only" switch
Should fix the clang-with-thin-lto-ubuntu buildbot failure
|
#
a7e2c269 |
| 02-Dec-2020 |
Bardia Mahjour <bmahjour@ca.ibm.com> |
[LV] Epilogue Vectorization with Optimal Control Flow (Recommit)
This is yet another attempt at providing support for epilogue vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none and reviews D30247 and D88819.
Similar to D88819, this patch achieves epilogue vectorization by executing a single vplan twice: once on the main loop and a second time on the epilogue loop (using a different VF). However, it is able to handle more loops, and generates more optimal control flow for cases where the trip count is too small to execute any code in vector form.
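The loop structure this describes can be modelled in plain C++. This is only a sketch of the resulting shape, not what the vectorizer emits: the function `addArrays`, the `LoopCounts` type and the VFs of 8 and 4 are assumptions for illustration. The same loop body runs in a main loop with a wide VF, then in an epilogue loop with a narrower VF, then in a scalar remainder loop, with an up-front check that skips all vector code for very small trip counts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping: how many iterations each loop executed.
struct LoopCounts {
  std::size_t Main, Epilogue, Scalar;
};

LoopCounts addArrays(const std::vector<int> &A, const std::vector<int> &B,
                     std::vector<int> &Out) {
  constexpr std::size_t MainVF = 8, EpilogueVF = 4; // assumed VFs
  assert(A.size() == B.size() && Out.size() == A.size());
  const std::size_t N = A.size();
  LoopCounts C = {0, 0, 0};
  std::size_t I = 0;

  // Up-front trip count check: if not even the epilogue VF fits, branch
  // directly to the scalar loop. (With these loop bounds the guard is
  // redundant; it is kept to mirror the check described in the text.)
  if (N >= EpilogueVF) {
    // Main vector loop, MainVF elements per iteration.
    for (; I + MainVF <= N; I += MainVF, ++C.Main)
      for (std::size_t L = 0; L < MainVF; ++L)
        Out[I + L] = A[I + L] + B[I + L];

    // Epilogue vector loop: same body, narrower VF, resuming at I.
    for (; I + EpilogueVF <= N; I += EpilogueVF, ++C.Epilogue)
      for (std::size_t L = 0; L < EpilogueVF; ++L)
        Out[I + L] = A[I + L] + B[I + L];
  }

  // Scalar remainder loop.
  for (; I < N; ++I, ++C.Scalar)
    Out[I] = A[I] + B[I];
  return C;
}
```

With 15 elements, this model runs one main iteration (8 elements), one epilogue iteration (4 elements) and three scalar iterations; with 3 elements, only the scalar loop runs.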
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D89566
|
#
9c5504ad |
| 01-Dec-2020 |
Bardia Mahjour <bmahjour@ca.ibm.com> |
[LV] Epilogue Vectorization with Optimal Control Flow
This is yet another attempt at providing support for epilogue vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none and reviews D30247 and D88819.
Similar to D88819, this patch achieves epilogue vectorization by executing a single vplan twice: once on the main loop and a second time on the epilogue loop (using a different VF). However, it is able to handle more loops, and generates more optimal control flow for cases where the trip count is too small to execute any code in vector form.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D89566
|