Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6 |
|
#
38d18d93 |
| 02-Oct-2020 |
David Sherwood <david.sherwood@arm.com> |
[SVE] Add support to vectorize_width loop pragma for scalable vectors
This patch adds support for two new variants of the vectorize_width pragma:
1. vectorize_width(X[, fixed|scalable]) where an op
[SVE] Add support to vectorize_width loop pragma for scalable vectors
This patch adds support for two new variants of the vectorize_width pragma:
1. vectorize_width(X[, fixed|scalable]) where an optional second parameter is passed to the vectorize_width pragma, which indicates if the user wishes to use fixed width or scalable vectorization. For example the user can now write something like:
#pragma clang loop vectorize_width(4, fixed) or #pragma clang loop vectorize_width(4, scalable)
In the absence of a second parameter it is assumed the user wants fixed width vectorization, in order to maintain compatibility with existing code. 2. vectorize_width(fixed|scalable) where the width is left unspecified, but the user hints what type of vectorization they prefer, either fixed width or scalable.
I have implemented this by making use of the LLVM loop hint attribute:
llvm.loop.vectorize.scalable.enable
Tests were added to
clang/test/CodeGenCXX/pragma-loop.cpp
for both the 'fixed' and 'scalable' optional parameter.
See this thread for context: http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html
Differential Revision: https://reviews.llvm.org/D89031
show more ...
|
#
ac73b73c |
| 02-Nov-2020 |
Atmn Patel <a335pate@uwaterloo.ca> |
[clang] Add mustprogress and llvm.loop.mustprogress attribute deduction
Since C++11, the C++ standard has a forward progress guarantee [intro.progress], so all such functions must have the `mustprog
[clang] Add mustprogress and llvm.loop.mustprogress attribute deduction
Since C++11, the C++ standard has a forward progress guarantee [intro.progress], so all such functions must have the `mustprogress` requirement. In addition, from C11 and onwards, loops without a non-zero constant conditional or no conditional are also required to make progress (C11 6.8.5p6). This patch implements these attribute deductions so they can be used by the optimization passes.
Differential Revision: https://reviews.llvm.org/D86841
show more ...
|
Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3 |
|
#
02168549 |
| 11-Dec-2019 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[Clang] Pragma vectorize_width() implies vectorize(enable)
Let's try this again; this has been reverted/recommited a few times. Last time this got reverted because for this loop:
void a() { #
[Clang] Pragma vectorize_width() implies vectorize(enable)
Let's try this again; this has been reverted/recommited a few times. Last time this got reverted because for this loop:
void a() { #pragma clang loop vectorize(disable) for (;;) ; }
vectorisation was incorrectly enabled and the vectorize.enable metadata was set due to a logic error. But with this fixed, we now imply vectorisation when:
1) vectorisation is enabled, which means: VectorizeWidth > 1, 2) and don't want to add it when it is disabled or enabled, otherwise we would be incorrectly setting it or duplicating the metadata, respectively.
This should fix PR27643.
Differential Revision: https://reviews.llvm.org/D69628
show more ...
|
Revision tags: llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
6d424a16 |
| 24-Oct-2019 |
Jordan Rupprecht <rupprecht@google.com> |
Revert "Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)""
This reverts commit 80371c74ae63d2f260bcc75408be9c6f81e38465.
Given the following source: ``` void a() { for (;;)
Revert "Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)""
This reverts commit 80371c74ae63d2f260bcc75408be9c6f81e38465.
Given the following source: ``` void a() { for (;;) ; } ```
It incorrectly enables vectorization (with vector width 1), as well as generating a warning that vectorization could not be performed.
show more ...
|
#
80371c74 |
| 10-Oct-2019 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"
This was further discussed at the llvm dev list:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/135602.html
I think the
Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"
This was further discussed at the llvm dev list:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/135602.html
I think the brief summary of that is that this change is an improvement, this is the behaviour that we expect and promise in ours docs, and also as a result there are cases where we now emit diagnostics whereas before pragmas were silently ignored. Two areas where we can improve: 1) the diagnostic message itself, and 2) and in some cases (e.g. -Os and -Oz) the vectoriser is (quite understandably) not triggering.
Original commit message:
Specifying the vectorization width was supposed to implicitly enable vectorization, except that it wasn't really doing this. It was only setting the vectorize.width metadata, but not vectorize.enable.
This should fix PR27643.
llvm-svn: 374288
show more ...
|
#
858a1ae3 |
| 18-Sep-2019 |
Hans Wennborg <hans@hanshq.net> |
Revert r372082 "[Clang] Pragma vectorize_width() implies vectorize(enable)"
This broke the Chromium build. Consider the following code:
float ScaleSumSamples_C(const float* src, float* dst, float
Revert r372082 "[Clang] Pragma vectorize_width() implies vectorize(enable)"
This broke the Chromium build. Consider the following code:
float ScaleSumSamples_C(const float* src, float* dst, float scale, int width) { float fsum = 0.f; int i; #if defined(__clang__) #pragma clang loop vectorize_width(4) #endif for (i = 0; i < width; ++i) { float v = *src++; fsum += v * v; *dst++ = v * scale; } return fsum; }
Compiling at -Oz, Clang now warns:
$ clang++ -target x86_64 -Oz -c /tmp/a.cc /tmp/a.cc:1:7: warning: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
this suggests it's not actually enabling vectorization hard enough.
At -Os it asserts instead:
$ build.release/bin/clang++ -target x86_64 -Os -c /tmp/a.cc clang-10: /work/llvm.monorepo/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2734: void llvm::InnerLoopVectorizer::emitMemRuntimeChecks(llvm::Loop*, llvm::BasicBlock*): Assertion ` !BB->getParent()->hasOptSize() && "Cannot emit memory checks when optimizing for size"' failed.
Of course neither of these are what the developer expected from the pragma.
> Specifying the vectorization width was supposed to implicitly enable > vectorization, except that it wasn't really doing this. It was only > setting the vectorize.width metadata, but not vectorize.enable. > > This should fix PR27643. > > Differential Revision: https://reviews.llvm.org/D66290
llvm-svn: 372225
show more ...
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6 |
|
#
e573a9c0 |
| 17-Sep-2019 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[Clang] Pragma vectorize_width() implies vectorize(enable)
Specifying the vectorization width was supposed to implicitly enable vectorization, except that it wasn't really doing this. It was only se
[Clang] Pragma vectorize_width() implies vectorize(enable)
Specifying the vectorization width was supposed to implicitly enable vectorization, except that it wasn't really doing this. It was only setting the vectorize.width metadata, but not vectorize.enable.
This should fix PR27643.
Differential Revision: https://reviews.llvm.org/D66290
llvm-svn: 372082
show more ...
|
Revision tags: llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
#
58e76426 |
| 01-Apr-2019 |
Michael Kruse <llvm@meinersbur.de> |
[CodeGen] Generate follow-up metadata for loops with more than one transformation.
Before this patch, CGLoop would dump all transformations for a loop into a single LoopID without encoding any order
[CodeGen] Generate follow-up metadata for loops with more than one transformation.
Before this patch, CGLoop would dump all transformations for a loop into a single LoopID without encoding any order in which to apply them. rL348944 added the possibility to encode a transformation order using followup-attributes.
When a loop has more than one transformation, use the follow-up attribute define the order in which they are applied. The emitted order is the defacto order as defined by the current LLVM pass pipeline, which is:
LoopFullUnrollPass LoopDistributePass LoopVectorizePass LoopUnrollAndJamPass LoopUnrollPass MachinePipeliner
This patch should therefore not change the assembly output, assuming that all explicit transformations can be applied, and no implicit transformations in-between. In the former case, WarnMissedTransformationsPass should emit a warning (except for MachinePipeliner which is not implemented yet). The latter could be avoided by adding 'llvm.loop.disable_nonforced' attributes.
Because LoopUnrollAndJamPass processes a loop nest, generation of the MDNode is delayed to after the inner loop metadata have been processed. A temporary LoopID is therefore used to annotate instructions and RAUW'ed by the actual LoopID later.
Differential Revision: https://reviews.llvm.org/D57978
llvm-svn: 357415
show more ...
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
#
2de463ec |
| 14-Jun-2016 |
Adam Nemet <anemet@apple.com> |
Add loop pragma for Loop Distribution
Summary: This is similar to other loop pragmas like 'vectorize'. Currently it only has state values: distribute(enable) and distribute(disable). When one of t
Add loop pragma for Loop Distribution
Summary: This is similar to other loop pragmas like 'vectorize'. Currently it only has state values: distribute(enable) and distribute(disable). When one of these is specified the corresponding loop metadata is generated:
!{!"llvm.loop.distribute.enable", i1 true/false}
As a result, loop distribution will be attempted on the loop even if Loop Distribution in not enabled globally. Analogously, with 'disable' distribution can be turned off for an individual loop even when the pass is otherwise enabled.
There are some slight differences compared to the existing loop pragmas.
1. There is no 'assume_safety' variant which makes its handling slightly different from 'vectorize'/'interleave'.
2. Unlike the existing loop pragmas, it does not have a corresponding numeric pragma like 'vectorize' -> 'vectorize_width'. So for the consistency checks in CheckForIncompatibleAttributes we don't need to check it against other pragmas. We just need to check for duplicates of the same pragma.
Reviewers: rsmith, dexonsmith, aaron.ballman
Subscribers: bob.wilson, cfe-commits, hfinkel
Differential Revision: http://reviews.llvm.org/D19403
llvm-svn: 272656
show more ...
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1, llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2 |
|
#
54c020d3 |
| 27-Jul-2015 |
Tyler Nowicki <tyler.nowicki@gmail.com> |
Use CGLoopInfo to emit metadata for loop hint pragmas.
When ‘#pragma clang loop vectorize(assume_safety)’ was specified on a loop other loop hints were lost. The problem is that CGLoopInfo attaches
Use CGLoopInfo to emit metadata for loop hint pragmas.
When ‘#pragma clang loop vectorize(assume_safety)’ was specified on a loop other loop hints were lost. The problem is that CGLoopInfo attaches metadata differently than EmitCondBrHints in CGStmt. For do-loops CGLoopInfo attaches metadata to the br in the body block and for while and for loops, the inc block. EmitCondBrHints on the other hand always attaches data to the br in the cond block. When specifying assume_safety CGLoopInfo emits an empty llvm.loop metadata shadowing the metadata in the cond block. Loop transformations like rotate and unswitch would then eliminate the cond block and its non-empty metadata.
This patch unifies both approaches for adding metadata and modifies the existing safety tests to include non-assume_safety loop hints.
llvm-svn: 243315
show more ...
|
Revision tags: llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
#
90203dc8 |
| 08-Jun-2015 |
Tyler Nowicki <tyler.nowicki@gmail.com> |
Moved CPP CodeGen tests into CodeGenCXX.
llvm-svn: 239362
|