#
de3ea51b |
| 16-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Refine computeMaxBECountForLT to be accurate in more cases.
Allow arbitrary strides, and make sure we return the correct result when the backedge-taken count is zero.
Differential
[ScalarEvolution] Refine computeMaxBECountForLT to be accurate in more cases.
Allow arbitrary strides, and make sure we return the correct result when the backedge-taken count is zero.
Differential Revision: https://reviews.llvm.org/D106197
show more ...
|
#
4402d0d4 |
| 19-Jul-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] Add a clarifying comment in howManyLessThans
Wrap semantics are subtle when combined with multiple exits. This has caused several rounds of confusion during recent reviews, so try to documen
[SCEV] Add a clarifying comment in howManyLessThans
Wrap semantics are subtle when combined with multiple exits. This has caused several rounds of confusion during recent reviews, so try to document the subtly distinction between when wrap flags provide <u and <=u facts.
show more ...
|
#
2b17c24a |
| 18-Jul-2021 |
Nikita Popov <nikita.ppv@gmail.com> |
[SCEV] Fix unused variable warning (NFC)
|
#
28a3ad3f |
| 18-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Remove uses of PointerType::getElementType.
|
#
cbba71bf |
| 09-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Fix overflow in computeBECount.
The current implementation of computeBECount doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all lo
[ScalarEvolution] Fix overflow in computeBECount.
The current implementation of computeBECount doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere.
To deal with this, use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows.
Differential Revision: https://reviews.llvm.org/D105216
show more ...
|
#
a99d420a |
| 15-Jul-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] Fix unsound reasoning in howManyLessThans
This is split from D105216, it handles only a subset of the cases in that patch.
Specifically, the issue being fixed is that the code incorrectly as
[SCEV] Fix unsound reasoning in howManyLessThans
This is split from D105216, it handles only a subset of the cases in that patch.
Specifically, the issue being fixed is that the code incorrectly assumed that (Start-Stide) < End implied that the backedge was taken at least once. This is not true when e.g. Start = 4, Stride = 2, and End = 3. Note that we often do produce the right backedge taken count despite the flawed reasoning.
The fix chosen here is to use an alternate form of uceil (ceiling of unsigned divide) lowering which is safe when max(RHS,Start) > Start - Stride. (Note that signedness of both max expression and comparison depend on the signedness of the comparison being analyzed, and that overflow in the Start - Stride expression is allowed.) Note that this is weaker than proving the backedge is taken because it allows start - stride < end < start. Some cases which can't be proven safe are sent down the generic path, and we do end up generating less optimal expressions in a few cases.
Credit for coming up with the approach goes entirely to Eli. I just split it off, tweaked the comments a bit, and did some additional testing.
Differential Revision: https://reviews.llvm.org/D105942
show more ...
|
#
205ed009 |
| 13-Jul-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some f
[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.
Recommitting with isLoopInvariant() check.
Differential Revision: https://reviews.llvm.org/D105921
show more ...
|
#
57388196 |
| 14-Jul-2021 |
Arthur Eubanks <aeubanks@google.com> |
Revert "[SCEV] Handle zero stride correctly in howManyLessThans"
This reverts commit 4df591b5c960affd1612e330d0c9cd3076c18053.
Causes crashes, see comments on D105921.
|
#
bb8c7a98 |
| 13-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Make isKnownNonZero handle more cases.
Using an unsigned range instead of signed ranges is a bit more precise.
Differential Revision: https://reviews.llvm.org/D105941
|
#
4df591b5 |
| 13-Jul-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some f
[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.
Differential Revision: https://reviews.llvm.org/D105921
show more ...
|
#
087310c7 |
| 13-Jul-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] Strengthen inference of RHS > Start in howManyLessThans
Split off from D105216 to simplify review. Rewritten with a lambda to be easier to follow. Comments clarified.
Sorry for no test cas
[SCEV] Strengthen inference of RHS > Start in howManyLessThans
Split off from D105216 to simplify review. Rewritten with a lambda to be easier to follow. Comments clarified.
Sorry for no test case, this is tricky to exercise with the current structure of the code. It's about to be hit more frequently in a follow up patch, and the change itself is simple.
show more ...
|
#
e4b43973 |
| 13-Jul-2021 |
Philip Reames <listmail@philipreames.com> |
[ScalarEvolution] Fix overflow when computing max trip counts
This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me.
The primary point of t
[ScalarEvolution] Fix overflow when computing max trip counts
This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me.
The primary point of this patch is to add the getUDivCeilSCEV routine. I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.
show more ...
|
#
882ee7fb |
| 10-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
Fix buildbot regression from 9c4baf5.
Apparently ScalarEvolution::isImpliedCond tries to truncate a pointer in some obscure cases. Guard the code with a check for pointers.
|
#
9c4baf51 |
| 23-Jun-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Strictly enforce pointer/int type rules.
Rules:
1. SCEVUnknown is a pointer if and only if the LLVM IR value is a pointer. 2. SCEVPtrToInt is never a pointer. 3. If any other S
[ScalarEvolution] Strictly enforce pointer/int type rules.
Rules:
1. SCEVUnknown is a pointer if and only if the LLVM IR value is a pointer. 2. SCEVPtrToInt is never a pointer. 3. If any other SCEV expression has no pointer operands, the result is an integer. 4. If a SCEVAddExpr has exactly one pointer operand, the result is a pointer. 5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other pointer operands, the result is a pointer. 6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a pointer. 7. Otherwise, the SCEV expression is invalid.
I'm not sure how useful rule 6 is in practice. If we exclude it, we can guarantee that ScalarEvolution::getPointerBase always returns a SCEVUnknown, which might be a helpful property. Anyway, I'll leave that for a followup.
This is basically mop-up at this point; all the changes with significant functional effects have landed. Some of the remaining changes could be split off, but I don't see much point.
Differential Revision: https://reviews.llvm.org/D105510
show more ...
|
#
2e3f4694 |
| 09-Jul-2021 |
Nikita Popov <nikita.ppv@gmail.com> |
[IR] Add GEPOperator::indices() (NFC)
In order to mirror the GetElementPtrInst::indices() API.
Wanted to use this in the IRForTarget code, and was surprised to find that it didn't exist yet.
|
#
e479777d |
| 09-Jul-2021 |
Martin Storsjö <martin@martin.st> |
Revert "[ScalarEvolution] Fix overflow in computeBECount."
This reverts commit 5b350183cdabd83573bc760ddf513f3e1d991bcb (and also "[NFC][ScalarEvolution] Cleanup howManyLessThans.", 009436e9c1fee129
Revert "[ScalarEvolution] Fix overflow in computeBECount."
This reverts commit 5b350183cdabd83573bc760ddf513f3e1d991bcb (and also "[NFC][ScalarEvolution] Cleanup howManyLessThans.", 009436e9c1fee1290d62bc0faafe0c0295542f56, to make it apply).
See https://reviews.llvm.org/D105216 for discussion on various miscompilations caused by that commit.
show more ...
|
#
009436e9 |
| 09-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
[NFC][ScalarEvolution] Cleanup howManyLessThans.
In preparation for D104075. Some NFC cleanup, and some test coverage for planned changes.
|
#
5b350183 |
| 29-Jun-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Fix overflow in computeBECount.
There are two issues with the current implementation of computeBECount:
1. It doesn't account for the possibility that adding "Stride - 1" to Delta
[ScalarEvolution] Fix overflow in computeBECount.
There are two issues with the current implementation of computeBECount:
1. It doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. 2. It doesn't account for the possibility that Stride is zero. If Delta is zero, the backedge is never taken; the value of Stride isn't relevant. To handle this, we have to make sure that the expression returned by computeBECount evaluates to zero.
To deal with this, add two new checks:
1. Use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. 2. Use umax(Stride, 1) to handle the possibility that Stride is zero.
Differential Revision: https://reviews.llvm.org/D105216
show more ...
|
#
f5603aa0 |
| 07-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Make sure getMinusSCEV doesn't negate pointers.
Add a function removePointerBase that returns, essentially, S - getPointerBase(S). Use it in getMinusSCEV instead of actually subtr
[ScalarEvolution] Make sure getMinusSCEV doesn't negate pointers.
Add a function removePointerBase that returns, essentially, S - getPointerBase(S). Use it in getMinusSCEV instead of actually subtracting pointers.
Differential Revision: https://reviews.llvm.org/D105503
show more ...
|
#
7ac1c7be |
| 06-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any
Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers.
There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands.
The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option.
As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact.
This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better.
I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out how to repair it to test what it was actually trying to test.
Recommitting with fix to MemoryDepChecker::isDependent.
Differential Revision: https://reviews.llvm.org/D104806
show more ...
|
#
a6d081b2 |
| 06-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers."
This reverts commit 74d6ce5d5f169e9cf3fac0eb1042602e286dd2b9.
Seeing crashes on buildbots in MemoryDepChecker::isDependen
Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers."
This reverts commit 74d6ce5d5f169e9cf3fac0eb1042602e286dd2b9.
Seeing crashes on buildbots in MemoryDepChecker::isDependent.
show more ...
|
#
74d6ce5d |
| 06-Jul-2021 |
Eli Friedman <efriedma@quicinc.com> |
[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other va
[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers.
There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands.
The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option.
As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact.
This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better.
I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out how to repair it to test what it was actually trying to test.
Differential Revision: https://reviews.llvm.org/D104806
show more ...
|
#
14d8f154 |
| 30-Jun-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] Fold (0 udiv %x) to 0
We have analogous rules in instsimplify, etc.., but were missing the same in SCEV. The fold is near trivial, but came up in the context of a larger change.
|
#
8d5bf070 |
| 25-Jun-2021 |
Eli Friedman <efriedma@quicinc.com> |
[NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion
The implementation is identical, but it makes the semantics a bit more obvious.
|
#
6478f3fb |
| 25-Jun-2021 |
Florian Hahn <flo@fhahn.com> |
[SCEV] Support single-cond range check idiom in applyLoopGuards.
This patch extends applyLoopGuards to detect a single-cond range check idiom that InstCombine generates.
It extends applyLoopGuards
[SCEV] Support single-cond range check idiom in applyLoopGuards.
This patch extends applyLoopGuards to detect a single-cond range check idiom that InstCombine generates.
It extends applyLoopGuards to detect conditions of the form (-C1 + X < C2). InstCombine will create this form when combining two checks of the form (X u< C2 + C1) and (X >=u C1).
In practice, this enables us to correctly compute a tight trip count bounds for code as in the function below. InstCombine will fold the minimum iteration check created by LoopRotate with the user check (< 8).
void unsigned_check(short *pred, unsigned width) { if (width < 8) { for (int x = 0; x < width; x++) pred[x] = pred[x] * pred[x]; } }
As a consequence, LLVM creates dead vector loops for the code above, e.g. see https://godbolt.org/z/cb8eTcqET
https://alive2.llvm.org/ce/z/SHHW4d
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D104741
show more ...
|