Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
#
10a1e8b1 |
| 27-May-2016 |
Tim Northover <tnorthover@apple.com> |
Vectorizer: track non-fast FP instructions through phis when finding reductions.
When we traced through a phi node looking for floating-point reductions, we forgot whether we'd ever seen an instruction without fast-math flags (that would block vectorization). This propagates it through to the end.
llvm-svn: 271015
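A minimal sketch, not taken from the patch, of why a floating-point reduction without fast-math flags has to block vectorization: float addition is not associative, so splitting the sum across lanes can change the result. The function names and the two-lane split are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reduction: strict left-to-right order, as the source loop specifies. */
float sum_in_order(const float *v, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += v[i];
    return s;
}

/* Hypothetical 2-lane order: even and odd elements are accumulated
   separately and the partial sums are combined at the end. */
float sum_two_lanes(const float *v, size_t n) {
    assert(n % 2 == 0);
    float lane0 = 0.0f, lane1 = 0.0f;
    for (size_t i = 0; i < n; i += 2) {
        lane0 += v[i];
        lane1 += v[i + 1];
    }
    return lane0 + lane1;
}
```

For the input {1e30f, 1.0f, -1e30f, 1.0f}, the in-order sum is 1.0f (the first 1.0f is absorbed by 1e30f) while the two-lane sum is 2.0f. Reassociation is only acceptable when every instruction in the chain permits it.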
|
#
c434d091 |
| 10-May-2016 |
Elena Demikhovsky <elena.demikhovsky@intel.com> |
[LoopVectorize] Handling induction variable with non-constant step.
Allow vectorization when the step is a loop-invariant variable. This is the loop example that is getting vectorized after the patch:
  int int_inc;
  int bar(int init, int *restrict A, int N) {
    int x = init;
    for (int i = 0; i < N; i++) {
      A[i] = x;
      x += int_inc;
    }
    return x;
  }
"x" is an induction variable with *loop-invariant* step. But it is not a primary induction. Primary induction variable with non-constant step is not handled yet.
Differential Revision: http://reviews.llvm.org/D19258
llvm-svn: 269023
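As a rough illustration of what vectorizing such a loop means, the sketch below pairs the scalar loop with a hand-written 4-wide equivalent. It is not the vectorizer's output: the names `fill_scalar` and `fill_by_four`, the fixed width of 4, and passing the step as a parameter instead of reading a global are all assumptions made for the example.

```c
#include <assert.h>

/* Scalar form, mirroring the loop in the commit message; the loop-invariant
   step is a parameter here rather than a global. */
int fill_scalar(int init, int *restrict A, int N, int step) {
    int x = init;
    for (int i = 0; i < N; i++) {
        A[i] = x;
        x += step;
    }
    return x;
}

/* Hypothetical 4-wide equivalent: each lane holds x + lane * step, and the
   whole "vector" advances by 4 * step per iteration. */
int fill_by_four(int init, int *restrict A, int N, int step) {
    assert(N >= 0);
    int x = init;
    int i = 0;
    for (; i + 4 <= N; i += 4) {
        for (int lane = 0; lane < 4; lane++)
            A[i + lane] = x + lane * step;
        x += 4 * step;
    }
    for (; i < N; i++) { /* scalar remainder */
        A[i] = x;
        x += step;
    }
    return x;
}
```

Because the step does not change inside the loop, the per-lane offsets (lane * step) and the per-iteration increment (4 * step) can be computed once before the loop, even though their values are unknown at compile time.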
|
#
c05bab8a |
| 05-May-2016 |
Silviu Baranga <silviu.baranga@arm.com> |
[LV] Identify more induction PHIs by coercing expressions to AddRecExprs
Summary: Some PHIs can have expressions that are not AddRecExprs due to the presence of sext/zext instructions. In order to prevent the Loop Vectorizer from bailing out when encountering these PHIs, we now coerce the SCEV expressions to AddRecExprs using SCEV predicates (when possible).
We only do this when the alternative would be to not vectorize.
Reviewers: mzolotukhin, anemet
Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17153
llvm-svn: 268633
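The sketch below illustrates, under simplifying assumptions, why a predicate is needed before a sign-extended narrow recurrence can be treated as a wide add recurrence: the two only agree while the narrow value does not wrap. The function names and the choice of 8-bit and 32-bit types are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* sext of the 8-bit recurrence {start,+,step} at iteration k:
   the addition wraps in 8 bits before the value is widened. */
int32_t narrow_then_extend(int8_t start, int8_t step, int k) {
    assert(k >= 0);
    uint8_t x = (uint8_t)start;
    for (int i = 0; i < k; i++)
        x = (uint8_t)(x + (uint8_t)step);
    /* reinterpret the 8-bit pattern as a signed value */
    return x >= 128 ? (int32_t)x - 256 : (int32_t)x;
}

/* The wide recurrence {sext(start),+,sext(step)} at iteration k. */
int32_t extend_then_add(int8_t start, int8_t step, int k) {
    return (int32_t)start + (int32_t)k * (int32_t)step;
}

/* Hypothetical run-time predicate: the recurrence stays inside the 8-bit
   range for the first k steps, so both forms above agree. */
int no_wrap_for(int8_t start, int8_t step, int k) {
    int32_t last = extend_then_add(start, step, k);
    return last >= INT8_MIN && last <= INT8_MAX;
}
```

With start 100 and step 10, both forms give 120 after two steps, but after three steps the narrow form wraps to -126 while the wide form gives 130. A check like `no_wrap_for` is the kind of condition a versioned loop would test before taking the vectorized path.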
|
#
fe3def7c |
| 22-Apr-2016 |
Adam Nemet <anemet@apple.com> |
[LoopUtils] Extend findStringMetadataForLoop to return the value for metadata
E.g. for:
!1 = {"llvm.distribute", i32 1}
it now returns the MDOperand for 1.
I will use this in LoopDistribution to check the value of the metadata.
Note that the change is backward-compatible with its current use in LoopVersioningLICM. An Optional implicitly converts to a bool depending on whether it contains a value or not.
llvm-svn: 267190
|
#
6dcf0788 |
| 21-Apr-2016 |
Adam Nemet <anemet@apple.com> |
[LoopUtils] Fix typo in comment
llvm-svn: 267016
|
#
293be666 |
| 21-Apr-2016 |
Adam Nemet <anemet@apple.com> |
[LoopUtils] Add asserts to findStringMetadataForLoop. NFC
These ensure that the operand array has at least one element and that its first element is the self-reference.
llvm-svn: 267015
|
#
963341c8 |
| 21-Apr-2016 |
Adam Nemet <anemet@apple.com> |
[LoopUtils] Move def of findStringMetadataForLoop to LoopUtils.cpp. NFC
The decl is in LoopUtils.h. I think that this was added to LoopVersioningLICM.cpp by mistake.
llvm-svn: 267014
|
#
53207a99 |
| 11-Apr-2016 |
Matthew Simpson <mssimpso@codeaurora.org> |
[LoopUtils, LV] Fix PR27246 (first-order recurrences)
This patch ensures that when we detect first-order recurrences, we reject a phi node if its previous value is also a phi node. During vectorization the initial and previous values of the recurrence are shuffled together to create the value for the current iteration. However, phi nodes are not widened like other instructions. This fixes PR27246.
Differential Revision: http://reviews.llvm.org/D18971
llvm-svn: 265983
|
#
8dd66e57 |
| 30-Mar-2016 |
Nirav Dave <niravd@google.com> |
Remove HasFnAttribute guards to getFnAttribute calls
These checks are redundant and can be removed
Reviewers: hans
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D18564
llvm-svn: 264872
|
#
b840a6d6 |
| 03-Mar-2016 |
Matthew Simpson <mssimpso@codeaurora.org> |
[LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case.
llvm-svn: 262624
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3 |
|
#
29c997c1 |
| 19-Feb-2016 |
Matthew Simpson <mssimpso@codeaurora.org> |
[LV] Vectorize first-order recurrences
This patch enables the vectorization of first-order recurrences. A first-order recurrence is a non-reduction recurrence relation in which the value of the recurrence in the current loop iteration equals a value defined in the previous iteration. The load PRE of the GVN pass often creates these recurrences by hoisting loads from within loops.
In this patch, we add a new recurrence kind for first-order phi nodes and attempt to vectorize them if possible. Vectorization is performed by shuffling the values for the current and previous iterations. The vectorization cost estimate is updated to account for the added shuffle instruction.
Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D16197
llvm-svn: 261346
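The shuffle described above can be sketched in plain C. This is an illustration under assumptions, not the code the vectorizer emits: the names `diff_scalar` and `diff_by_four`, the width `VF = 4`, and the requirement that the trip count be a multiple of the width are all chosen for the example.

```c
#include <assert.h>

enum { VF = 4 }; /* assumed vector width */

/* Scalar loop with a first-order recurrence: prev carries the value
   loaded in the previous iteration. */
void diff_scalar(const int *a, int *b, int n, int init) {
    int prev = init;
    for (int i = 0; i < n; i++) {
        int cur = a[i];
        b[i] = cur - prev;
        prev = cur;
    }
}

/* Hypothetical vector form: the "previous" vector is built by shuffling the
   last lane of the prior iteration's vector with the first VF-1 lanes of
   the current one. */
void diff_by_four(const int *a, int *b, int n, int init) {
    assert(n >= 0 && n % VF == 0);
    int prev_vec[VF] = {0, 0, 0, init}; /* only the last lane is live */
    for (int i = 0; i < n; i += VF) {
        int cur_vec[VF], shuffled[VF];
        for (int l = 0; l < VF; l++)
            cur_vec[l] = a[i + l];
        shuffled[0] = prev_vec[VF - 1];
        for (int l = 1; l < VF; l++)
            shuffled[l] = cur_vec[l - 1];
        for (int l = 0; l < VF; l++)
            b[i + l] = cur_vec[l] - shuffled[l];
        for (int l = 0; l < VF; l++)
            prev_vec[l] = cur_vec[l];
    }
}
```

The extra shuffle per vector iteration is the cost the commit message says is added to the vectorization cost estimate.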
|
#
31088a9d |
| 19-Feb-2016 |
Chandler Carruth <chandlerc@gmail.com> |
[LPM] Factor all of the loop analysis usage updates into a common helper routine.
We were getting this wrong in small ways and generally being very inconsistent about it across loop passes. Instead, let's have a common place where we do this. One minor downside is that this will require some analyses like SCEV in more places than they are strictly needed. However, this seems benign as these analyses are complete no-ops, and without this consistency we can in many cases end up with the legacy pass manager scheduling deciding to split up a loop pass pipeline in order to run the function analysis half-way through. It is very, very annoying to fix these without just being very pedantic across the board.
The only loop passes I've not updated here are ones that use AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer. They seemed less relevant.
With this patch, almost all of the problems in PR24804 around loop pass pipelines are fixed. The one remaining issue is that we run simplify-cfg and instcombine in the middle of the loop pass pipeline. We've recently added some loop variants of these passes that would seem substantially cleaner to use, but this at least gets us much closer to the previous state. Notably, the number of loop pass managers is down from seven to three.
I've not updated the loop passes using LoopAccessAnalysis because that analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't clear that those transforms want to support those forms anyways. They all run late anyways, so this is harmless. Similarly, LSR is left alone because it already carefully manages its forms and doesn't need to get fused into a single loop pass manager with a bunch of other loop passes.
LoopReroll didn't use loop simplified form previously, and I've updated the test case to match the trivially different output.
Finally, I've also factored all the pass initialization for the passes that use this technique as well, so that should be done regularly and reliably.
Thanks to James for the help reviewing and thinking about this stuff, and Ben for help thinking about it as well!
Differential Revision: http://reviews.llvm.org/D17435
llvm-svn: 261316
|
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1 |
|
#
a252815b |
| 12-Jan-2016 |
Sanjay Patel <spatel@rotateright.com> |
function names start with a lower case letter ; NFC
llvm-svn: 257496
|
#
ad1ccb35 |
| 09-Dec-2015 |
Silviu Baranga <silviu.baranga@arm.com> |
Revert r255115 until we figure out how to fix the bot failures.
llvm-svn: 255117
|
#
41eb6825 |
| 09-Dec-2015 |
Silviu Baranga <silviu.baranga@arm.com> |
[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewriting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates, P1: {a,+,b} has nsw, and P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, thereby avoiding this issue.
The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewrites. This also has the benefit of reducing the overall number of expression rewrites.
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
llvm-svn: 255115
|
Revision tags: llvmorg-3.7.1 |
|
#
45d4cb9a |
| 24-Nov-2015 |
Weiming Zhao <weimingz@codeaurora.org> |
[Utils] Put includes in correct order. NFC.
Summary: Followed the guidelines in: http://llvm.org/docs/CodingStandards.html#include-style However, I noticed that uppercase named headers come before lowercase ones throughout the codebase. So kept them as is. Patch by Mandeep Singh Grang <mgrang@codeaurora.org>
Reviewers: majnemer, davide, jmolloy, atrick
Subscribers: sanjoy
Differential Revision: http://reviews.llvm.org/D14939
llvm-svn: 254005
|
Revision tags: llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
#
50a4c27f |
| 21-Sep-2015 |
James Molloy <james.molloy@arm.com> |
[LoopUtils,LV] Propagate fast-math flags on generated FCmp instructions
We're currently losing any fast-math flags when synthesizing fcmps for min/max reductions. In LV, make sure we copy over the scalar inst's flags. In LoopUtils, we know we only ever match patterns with hasUnsafeAlgebra, so apply that to any synthesized ops.
llvm-svn: 248201
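One reason the flags on a synthesized compare matter, sketched here under assumptions: a compare-and-select max reduction gives different answers in different evaluation orders when a NaN is present, so reordering is only sound when the flags rule that case out. The function names are hypothetical, and reversing the traversal merely stands in for the reordering a vector reduction performs.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Scalar max reduction in compare-and-select form, left to right. */
float max_in_order(const float *v, size_t n) {
    assert(n > 0);
    float m = v[0];
    for (size_t i = 1; i < n; ++i)
        m = (v[i] > m) ? v[i] : m;
    return m;
}

/* The same reduction with the elements visited right to left. */
float max_reversed(const float *v, size_t n) {
    assert(n > 0);
    float m = v[n - 1];
    for (size_t i = n - 1; i-- > 0;)
        m = (v[i] > m) ? v[i] : m;
    return m;
}

/* Returns 1 when the two orders disagree, treating NaN results explicitly. */
int order_matters(const float *v, size_t n) {
    float a = max_in_order(v, n), b = max_reversed(v, n);
    if (isnan(a) || isnan(b))
        return !isnan(a) != !isnan(b);
    return a != b;
}
```

For {NAN, 1.0f, 2.0f} the left-to-right result is NaN (every comparison against NaN is false) while the right-to-left result is 2.0f. For NaN-free input the two orders agree.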
|
#
29dc0f70 |
| 10-Sep-2015 |
Matthew Simpson <mssimpso@codeaurora.org> |
[LV] Relax Small Size Reduction Type Requirement
This patch enables small size reductions in which the source types are smaller than the reduction type (e.g., computing an i16 sum from the values in an i8 array). The previous behavior was to only allow small size reductions if the source types and reduction type were the same. The change accounts for the fact that the existing sign- and zero-extend instructions in these cases should still be included in the cost model.
Differential Revision: http://reviews.llvm.org/D12770
llvm-svn: 247337
|
Revision tags: llvmorg-3.7.0 |
|
#
c94f8e29 |
| 27-Aug-2015 |
Chad Rosier <mcrosier@codeaurora.org> |
[LoopVectorize] Add Support for Small Size Reductions.
Unlike scalar operations, we can perform vector operations on element types that are smaller than the native integer types. We type-promote scalar operations if they are smaller than a native type (e.g., i8 arithmetic is promoted to i32 arithmetic on Arm targets). This patch detects and removes type-promotions within the reduction detection framework, enabling the vectorization of small size reductions.
In the legality phase, we look through the ANDs and extensions that InstCombine creates during promotion, keeping track of the smaller type. In the profitability phase, we use the smaller type and ignore the ANDs and extensions in the cost model. Finally, in the code generation phase, we truncate the result of the reduction to allow InstCombine to rewrite the entire expression in the smaller type.
This fixes PR21369. http://reviews.llvm.org/D12202
Patch by Matt Simpson <mssimpso@codeaurora.org>!
llvm-svn: 246149
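A small sketch of why the promotion can be undone safely for an add reduction: summing in 32 bits and masking back to 8 bits gives the same answer as summing directly in 8-bit lanes, because addition modulo 256 does not depend on the order or width of the intermediate steps. The function names and the lane count of 4 are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shape of the promoted scalar code: arithmetic in 32 bits, with an AND
   that masks each partial sum back to 8 bits. */
uint8_t sum_promoted(const uint8_t *v, size_t n) {
    uint32_t s = 0;
    for (size_t i = 0; i < n; ++i)
        s = (s + v[i]) & 0xFFu;
    return (uint8_t)s;
}

/* Hypothetical narrow vector form: four independent 8-bit lanes that are
   combined after the loop. */
uint8_t sum_narrow_lanes(const uint8_t *v, size_t n) {
    assert(n % 4 == 0);
    uint8_t lane[4] = {0, 0, 0, 0};
    for (size_t i = 0; i < n; i += 4)
        for (int l = 0; l < 4; ++l)
            lane[l] = (uint8_t)(lane[l] + v[i + l]);
    return (uint8_t)(lane[0] + lane[1] + lane[2] + lane[3]);
}
```

For {200, 100, 50, 25, 255, 1, 2, 3} the full sum is 636, and both functions return 124 (636 modulo 256). Working in the smaller type lets more elements fit in each vector register.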
|
#
1bbf15c5 |
| 27-Aug-2015 |
James Molloy <james.molloy@arm.com> |
[LoopVectorize] Extract InductionInfo into a helper class...
... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup.
NFC
llvm-svn: 246145
|
Revision tags: llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3 |
|
#
c5b7b555 |
| 19-Aug-2015 |
Ashutosh Nema <ashu1212@gmail.com> |
Exposed findDefsUsedOutsideOfLoop as a loop utility function
Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving it from LoopDistribute to LoopUtils.
Reviewed By: anemet
llvm-svn: 245416
|
Revision tags: studio-1.4 |
|
#
c1a86f58 |
| 10-Aug-2015 |
Tyler Nowicki <tyler.nowicki@gmail.com> |
Late evaluation of the fast-math vectorization requirement.
This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command-line options that would allow floating-point commutativity. Specifically, those are enabling fast-math or specifying a loop hint.
llvm-svn: 244489
|
Revision tags: llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
#
27b2c39e |
| 16-Jun-2015 |
Tyler Nowicki <tyler.nowicki@gmail.com> |
Refactor RecurrenceInstDesc
Moved RecurrenceInstDesc into RecurrenceDescriptor to simplify the namespaces.
llvm-svn: 239862
|
#
0a91310c |
| 16-Jun-2015 |
Tyler Nowicki <tyler.nowicki@gmail.com> |
Rename Reduction variables/structures to Recurrence.
A reduction is a special kind of recurrence. In the loop vectorizer we currently identify basic reductions. Future patches will extend this to identifying basic recurrences.
llvm-svn: 239835
|
#
b58f32f7 |
| 05-Jun-2015 |
David Majnemer <david.majnemer@gmail.com> |
[LoopVectorize] Don't crash on zero-sized types in isInductionPHI
isInductionPHI wants to calculate the stride based on the pointee size. However, this is not possible when the pointee is zero sized.
This fixes PR23763.
llvm-svn: 239143
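The failure mode can be sketched as a division guard. This is only an illustration of the kind of check involved, not the code from the patch: `element_stride` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 and stores the stride in elements when the byte step is a whole
   multiple of a non-zero element size; returns 0 otherwise. */
int element_stride(int64_t byte_step, int64_t elem_size, int64_t *stride_out) {
    assert(stride_out != 0);
    assert(elem_size >= 0);
    if (elem_size == 0)
        return 0; /* zero-sized pointee: dividing would be undefined */
    if (byte_step % elem_size != 0)
        return 0; /* step is not a whole number of elements */
    *stride_out = byte_step / elem_size;
    return 1;
}
```

A byte step of 16 over 4-byte elements yields a stride of 4, while an element size of 0 makes the function report failure instead of dividing.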
|