#
4cad61ad |
| 01-Aug-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
[SCEV/IndVars] Always compute loop exiting values if the backedge count is 0
If SCEV can prove that the backedge taken count for a loop is zero, it does not need to "understand" a recursive PHI to c
[SCEV/IndVars] Always compute loop exiting values if the backedge count is 0
If SCEV can prove that the backedge taken count for a loop is zero, it does not need to "understand" a recursive PHI to compute its exiting value.
This should fix PR33885.
llvm-svn: 309758
show more ...
|
#
b5a968f6 |
| 29-Jul-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
[SCEV] Change an early exit to an assert; NFC
llvm-svn: 309480
|
#
fa496953 |
| 28-Jul-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[SCEV] Do not visit nodes twice in containsConstantSomewhere
This patch reworks the function that searches constants in Add and Mul SCEV expression chains so that now it does not visit a node more t
[SCEV] Do not visit nodes twice in containsConstantSomewhere
This patch reworks the function that searches constants in Add and Mul SCEV expression chains so that now it does not visit a node more than once, and also renames this function for better correspondence between its implementation and semantics.
Differential Revision: https://reviews.llvm.org/D35931
llvm-svn: 309367
show more ...
|
#
843ab574 |
| 28-Jul-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
Revert "[SCEV] Cache results of computeExitLimit"
This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults.
I've replied on the comm
Revert "[SCEV] Cache results of computeExitLimit"
This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults.
I've replied on the commit thread for the patch with more details.
llvm-svn: 309357
show more ...
|
Revision tags: llvmorg-5.0.0-rc1 |
|
#
f282aed4 |
| 26-Jul-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[SCEV] Cache results of computeExitLimit
This patch adds a cache for computeExitLimit to save compilation time. A lot of examples of tests that take extensive time to compile are attached to the bug
[SCEV] Cache results of computeExitLimit
This patch adds a cache for computeExitLimit to save compilation time. A lot of examples of tests that take extensive time to compile are attached to the bug 33494.
Differential Revision: https://reviews.llvm.org/D35827
llvm-svn: 309080
show more ...
|
#
469e740f |
| 26-Jul-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
[SCEV] Remove unnecessary call to forgetMemoizedResults
`SCEVUnknown::allUsesReplacedWith` does not need to call `forgetMemoizedResults` since RAUW does a value-equivalent replacement by assumption.
[SCEV] Remove unnecessary call to forgetMemoizedResults
`SCEVUnknown::allUsesReplacedWith` does not need to call `forgetMemoizedResults` since RAUW does a value-equivalent replacement by assumption. If this assumption was false then the later setValPtr(New) call would be incorrect too.
This is a non-trivial performance optimization for functions with a large number of loops since `forgetMemoizedResults` walks all loop backedge taken counts to see if any of them use the SCEVUnknown being RAUWed. However, this improvement is difficult to demonstrate without checking in an excessively large IR file.
llvm-svn: 309072
show more ...
|
#
0e9e0796 |
| 23-Jul-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[SCEV] Limit max size of AddRecExpr during evolving
When SCEV calculates product of two SCEVAddRecs from the same loop, it tries to combine them into one big AddRecExpr. If the sizes of the initial
[SCEV] Limit max size of AddRecExpr during evolving
When SCEV calculates product of two SCEVAddRecs from the same loop, it tries to combine them into one big AddRecExpr. If the sizes of the initial SCEVs were `S1` and `S2`, the size of their product is `S1 + S2 - 1`, and every operand of the resulting SCEV is combined from operands of initial SCEV and has much higher complexity than they have.
As result, if we try to calculate something like: %x1 = {a,+,b} %x2 = mul i32 %x1, %x1 %x3 = mul i32 %x2, %x1 %x4 = mul i32 %x3, %x2 ... The size of such SCEVs grows as `2^N`, and the arguments become more and more complex as we go forth. This leads to long compilation and huge memory consumption.
This patch sets a limit after which we don't try to combine two `SCEVAddRecExpr`s into one. By default, max allowed size of the resulting AddRecExpr is set to 16.
Differential Revision: https://reviews.llvm.org/D35664
llvm-svn: 308847
show more ...
|
#
ca4fd18d |
| 18-Jul-2017 |
Dorit Nuzman <dorit.nuzman@intel.com> |
PSCEV] Create AddRec for Phis in cases of possible integer overflow, using runtime checks
Extend the SCEVPredicateRewriter to work a bit harder when it encounters an UnknownSCEV for a Phi node; Try
PSCEV] Create AddRec for Phis in cases of possible integer overflow, using runtime checks
Extend the SCEVPredicateRewriter to work a bit harder when it encounters an UnknownSCEV for a Phi node; Try to build an AddRecurrence also for Phi nodes whose update chain involves casts that can be ignored under the proper runtime overflow test. This is one step towards addressing PR30654.
Differential revision: http://reviews.llvm.org/D30041
llvm-svn: 308299
show more ...
|
#
ca2c8765 |
| 06-Jul-2017 |
Craig Topper <craig.topper@intel.com> |
[Constants] Replace calls to ConstantInt::equalsInt(0)/equalsInt(1) with isZero and isOne. NFCI
llvm-svn: 307293
|
#
79ab643d |
| 06-Jul-2017 |
Craig Topper <craig.topper@intel.com> |
[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI
Going through the Constant methods
[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.
llvm-svn: 307292
show more ...
|
#
8d0322e6 |
| 30-Jun-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[SCEV] Use depth limit instead of local cache for SExt and ZExt
In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This a
[SCEV] Use depth limit instead of local cache for SExt and ZExt
In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously.
In this patch we remove the local cache for this methods and add the recursion depth limit instead, as we do for arithmetics. This gives us a guarantee that the invocation sequence is limited and reasonably short.
Differential Revision: https://reviews.llvm.org/D34273
llvm-svn: 306785
show more ...
|
#
41044876 |
| 29-Jun-2017 |
Alexandre Isoard <alexandre.isoard@gmail.com> |
Reverting r306695 while investigating failing test case.
Failing test case: Transforms/LoopVectorize.iv_outside_user.ll
llvm-svn: 306723
|
#
aa29afc7 |
| 29-Jun-2017 |
Alexandre Isoard <alexandre.isoard@gmail.com> |
ScalarEvolution: Add URem support
In LLVM IR the following code:
%r = urem <ty> %t, %b
is equivalent to:
%q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ;
ScalarEvolution: Add URem support
In LLVM IR the following code:
%r = urem <ty> %t, %b
is equivalent to:
%q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t
As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort this way.
Note: While SRem and SDiv are also related this way, SCEV does not provides SDiv yet.
llvm-svn: 306695
show more ...
|
#
01020396 |
| 24-Jun-2017 |
Craig Topper <craig.topper@gmail.com> |
[SCEV] Avoid copying ConstantRange just to get the min/max value
Summary: This patch changes getRange to getRangeRef and returns a reference to the ConstantRange object stored inside the DenseMap ca
[SCEV] Avoid copying ConstantRange just to get the min/max value
Summary: This patch changes getRange to getRangeRef and returns a reference to the ConstantRange object stored inside the DenseMap caches. We then take advantage of that to add new helper methods that can return min/max value of a signed or unsigned ConstantRange using that reference without first copying the ConstantRange.
getRangeRef calls itself recursively and I believe the reference return is fine for those calls.
I've left getSignedRange and getUnsignedRange returning a ConstantRange object so they will make a copy now. This is to ensure safety since the reference will be invalidated if the DenseMap changes.
I'm sure there are still more places that can take advantage of the reference and I'll submit future patches as I find them.
Reviewers: sanjoy, davide
Reviewed By: sanjoy
Subscribers: zzheng, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D32978
llvm-svn: 306229
show more ...
|
#
eac01d4c |
| 21-Jun-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation time
MulOpsInlineThreshold option of SCEV is defaulted to 1000, which is inadequately high. When constructing SCEVs of expressi
[SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation time
MulOpsInlineThreshold option of SCEV is defaulted to 1000, which is inadequately high. When constructing SCEVs of expressions like:
x1 = a * a x2 = x1 * x1 x3 = x2 * x2 ...
We actually have huge SCEVs with max allowed amount of operands inlined. Such expressions are easy to get from unrolling of loops looking like
x = a for (i = 0; i < n; i++) x = x * x
Or more tricky cases where big powers are involved. If some non-linear analysis tries to work with a SCEV that has 1000 operands, it may lead to excessively long compilation. The attached test does not pass within 1 minute with default threshold.
This patch decreases its default value to 32, which looks much more reasonable if we use analyzes with complexity O(N^2) or O(N^3) working with SCEV.
Differential Revision: https://reviews.llvm.org/D34397
llvm-svn: 305882
show more ...
|
#
0bcf6ec8 |
| 20-Jun-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[SCEV][NFC] Fix a misleading description of AddOpsInlineThreshold
The description of this option was copy-pasted from another one and does not correspond to reality.
Differential Revision: https://
[SCEV][NFC] Fix a misleading description of AddOpsInlineThreshold
The description of this option was copy-pasted from another one and does not correspond to reality.
Differential Revision: https://reviews.llvm.org/D34390
llvm-svn: 305782
show more ...
|
#
dc80366d |
| 15-Jun-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[ScalarEvolution] Apply Depth limit to getMulExpr
This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV.
We have
[ScalarEvolution] Apply Depth limit to getMulExpr
This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV.
We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain
getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr
can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292.
This patch applies the same limit on max expression depth for getAddExpr and getMulExpr.
Differential Revision: https://reviews.llvm.org/D33984
llvm-svn: 305463
show more ...
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
647025f9 |
| 09-Jun-2017 |
Andrew Kaylor <andrew.kaylor@intel.com> |
[InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin
Differential Revision: https://reviews.llvm.org/D33737
llvm-svn: 305132
|
#
6bda14b3 |
| 06-Jun-2017 |
Chandler Carruth <chandlerc@gmail.com> |
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
show more ...
|
#
8514dd54 |
| 31-May-2017 |
Galina Kistanova <gkistanova@gmail.com> |
Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304358
|
Revision tags: llvmorg-4.0.1-rc2 |
|
#
d8fe3eb9 |
| 30-May-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
[SCEV][NFC] Remove redundant params from isAvailableAtLoopEntry
Params DT and LI are redundant, because these values are contained in fields anyways.
Differential Revision: https://reviews.llvm.org
[SCEV][NFC] Remove redundant params from isAvailableAtLoopEntry
Params DT and LI are redundant, because these values are contained in fields anyways.
Differential Revision: https://reviews.llvm.org/D33668
llvm-svn: 304204
show more ...
|
#
e3684d0b |
| 27-May-2017 |
Tobias Grosser <tobias@grosser.es> |
[SCEV] Assume parameters coming from function calls contain IVs
The optimistic delinearization implemented in LLVM detects array sizes by looking for non-linear products between parameters and induc
[SCEV] Assume parameters coming from function calls contain IVs
The optimistic delinearization implemented in LLVM detects array sizes by looking for non-linear products between parameters and induction variables. In OpenCL code, such products often look like:
A[get_global_id(0) * N + get_global_id(1)]
Hence, the IV is hidden in the get_global_id() call and consequently delinearization would fail as no induction variable is available that helps us to identify N as array size parameter.
We now use a very simple heuristic to change this. We assume that each parameter that comes directly from a function call is a hidden induction variable. As a result, we can delinearize the access above to:
A[get_global_id(0)][get_global_id(1]
llvm-svn: 304073
show more ...
|
#
41450329 |
| 26-May-2017 |
Max Kazantsev <max.kazantsev@azul.com> |
Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on many non-X86 configs with this patch. The re
Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on many non-X86 configs with this patch. The reason of this is that the patch makes a correctless fix that changes optimizer's behavior for this test. Without the change, LSR was making an overconfident simplification basing on a wrong SCEV. Apparently it did not need the IV analysis to do this. With the change, it chose a different way to simplify (that wasn't so confident), and this way required the IV analysis. Now, following the right execution path, LSR tries to make a transformation relying on IV Users analysis. This analysis is target-dependent due to this code:
// LSR is not APInt clean, do not touch integers bigger than 64-bits. // Also avoid creating IVs of non-native types. For example, we don't want a // 64-bit IV in 32-bit code just because the loop has one 64-bit cast. uint64_t Width = SE->getTypeSizeInBits(I->getType()); if (Width > 64 || !DL.isLegalInteger(Width)) return false;
To make a proper transformation in this test case, the type i32 needs to be legal for the specified data layout. When the test runs on some non-X86 configuration (e.g. pure ARM 64), opt gets confused by the specified target and does not use it, rejecting the specified data layout as well. Instead, it uses some default layout that does not treat i32 as a legal type (currently the layout that is used when it is not specified does not have legal types at all). As result, the transformation we expect to happen does not happen for this test.
This re-enabling patch does not have any source code changes compared to the original patch rL303730. The only difference is that the failing test is moved to X86 directory and now has requirement of running on x86 only to comply with the specified target triple and data layout.
Differential Revision: https://reviews.llvm.org/D33543
llvm-svn: 303971
show more ...
|
#
8205a1a9 |
| 24-May-2017 |
Craig Topper <craig.topper@gmail.com> |
[ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object.
This continues the changes started when computeSignBit was replaced with this new
[ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object.
This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits.
Differential Revision: https://reviews.llvm.org/D33431
llvm-svn: 303773
show more ...
|
#
183863fc |
| 24-May-2017 |
Diana Picus <diana.picus@linaro.org> |
Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
This reverts commit r303730 because it broke all the buildbots.
llvm-svn: 303747
|