#
a71c4e4f |
| 17-Oct-2022 |
Sjoerd Meijer <sjoerd.meijer@gmail.com> |
Revert "[LoopFlatten] Enable it by default"
This reverts commit 233659c7ae9b83b64a9f739d340736bca39c3d2e.
I see some sanitizer build bot failures. Not sure if it is this change causing them, but let's see if a revert returns the bots to green...
|
#
233659c7 |
| 17-Oct-2022 |
Sjoerd Meijer <sjoerd.meijer@gmail.com> |
[LoopFlatten] Enable it by default
LoopFlatten has been in the code base, off by default, for years; this change enables it to run by default. Downstream it has been running for years, so it has been exposed to quite a lot of code. Around the time we switched to the NPM, several fixes went in related to updating the MemorySSA state, and we moved it to a loop pass manager; both helped avoid rerunning certain analysis passes and thus helped a bit with compile times.
About compile times: adding a pass isn't free, but this should cause only very minor increases. The pass is relatively simple and there shouldn't be anything algorithmically expensive, because all it does is look at inner/outer loops and check assumptions on loop increments and indices. If we see increases, I expect them to come mainly from invalidation of analysis info, and perhaps from subsequent passes triggering and doing more. Despite its simplicity and restrictions, it triggers in most code bases, which makes it worth enabling by default.
Differential Revision: https://reviews.llvm.org/D109958
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3 |
|
#
fdec5018 |
| 18-Aug-2022 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[CostModel] Replace getUserCost with getInstructionCost
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.
Original Patch by @samparker (Sam Parker)
Differential Revision: https://reviews.llvm.org/D79483
|
Revision tags: llvmorg-15.0.0-rc2 |
|
#
0e37ef01 |
| 08-Aug-2022 |
Kazu Hirata <kazu@google.com> |
[Transforms] Fix comment typos (NFC)
|
#
a2d45017 |
| 07-Aug-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Fix comment typos (NFC)
|
Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
#
0916d96d |
| 21-Jun-2022 |
Kazu Hirata <kazu@google.com> |
Don't use Optional::hasValue (NFC)
|
Revision tags: llvmorg-14.0.5 |
|
#
d73684e2 |
| 07-Jun-2022 |
Craig Topper <craig.topper@sifive.com> |
[LoopFlatten] Fix crash if the inner loop trip count comes from a sext instruction.
If we look through a truncate in matchLinearIVUser, it's possible we find a sext/zext instruction that didn't come from widening. This will fail the MatchedItCount->getType() == InnerInductionPHI->getType() assertion.
Fix this by checking that we did not look through a truncate already.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D127149
|
#
fdd58435 |
| 07-Jun-2022 |
Craig Topper <craig.topper@sifive.com> |
[LoopFlatten] Replace unchecked dyn_cast with cast.
Spotted while reading through the code.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D127146
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
#
59630917 |
| 02-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
#
ada6d78a |
| 24-Jan-2022 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[LoopFlatten] Address FIXME about getTripCountFromExitCount. NFC.
Together with the previous commit which mainly documents better LoopFlatten's overall strategy, this addresses a concern added as a FIXME comment in D110587; the code refactoring (NFC) introduces functions (also for the SCEV usage) to make this clearer.
|
#
f6ac8088 |
| 24-Jan-2022 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[LoopFlatten] Added comments about usage of various Loop APIs. NFC.
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
d544a89a |
| 05-Jan-2022 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[LoopFlatten] Update MemorySSA state
I would like to move LoopFlatten from LoopPass Manager LPM2 to LPM1 (D116612), but that is a LPM that is using MemorySSA and so LoopFlatten needs to preserve MemorySSA and this adds that. More specifically, LoopFlatten restructures the CFG and with this change the MSSA state is updated accordingly, where we also update the DomTree. LoopFlatten doesn't rewrite/optimise/delete load or store instructions, so I have not added any MSSA updates for that.
Differential Revision: https://reviews.llvm.org/D116660
|
#
66383038 |
| 04-Jan-2022 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[LoopFlatten] checkOverflow - use cast<> instead of dyn_cast<> to avoid dereference of nullptr.
Fix static analysis warning by using cast<> instead of dyn_cast<> as both isa<> and isGuaranteedToExecuteForEveryIteration expect a non-null Instruction pointer.
|
Revision tags: llvmorg-13.0.1-rc1 |
|
#
7f55209c |
| 11-Oct-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEV] Extend trip count to avoid overflow by default
As a brief reminder, an "exit count" is the number of times the backedge executes before some event. It can be zero if we exit before the backedge is reached. A "trip count" is the number of times the loop header is entered if we branch into the loop. In general, TC = BTC + 1 and thus a zero trip count is ill-defined.
There is a corner case which we don't handle well. Let's assume i8 for our examples to keep things simple. If BTC = 255, then the correct trip count is 256. However, 256 is not representable in i8.
In theory, code which needs to reason about trip counts is responsible for checking for this corner case, and either bailing out, or handling it correctly. Historically, we don't have a great track record about actually doing so.
When reviewing D109676, I found myself asking a basic question. Was there any good reason to preserve the current wrap-to-zero behavior when converting from backedge taken counts to trip counts? After reviewing existing code, I could not find a single case which appears to correctly and precisely handle the overflow case.
This patch changes the default behavior to extend instead of wrap. That is, if the result might be 256, we return a value of i9 type to ensure we interpret the count correctly. I did leave the legacy behavior as an option since a) loop-flatten stops triggering if I extend due to weirdly specific pattern matching I didn't understand and b) we could reasonably use the mode if we'd externally established a lack of overflow.
I want to emphasize that this change is *not* NFC. There are two call sites (one in ScalarEvolution.cpp, one in LoopCacheAnalysis.cpp) which are switched to the extend semantics. The former appears imprecise (but correct) for a constant 255 BTC. The latter appears incorrect, though I don't have a test case.
Differential Revision: https://reviews.llvm.org/D110587
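Illustration (not part of the commit): the wrap-versus-extend distinction described above can be sketched in plain C++. This is a minimal sketch, not SCEV's API; the function names `tripCountWrapping` and `tripCountExtended` are hypothetical, and because C++ has no 9-bit integer type, `std::uint16_t` stands in for the i9 result mentioned in the message.

```cpp
#include <cstdint>

// Hypothetical helper: compute BTC + 1 in the same 8-bit width.
// A backedge-taken count of 255 wraps around to a trip count of 0.
std::uint8_t tripCountWrapping(std::uint8_t BackedgeTakenCount) {
  return static_cast<std::uint8_t>(BackedgeTakenCount + 1);
}

// Hypothetical helper: widen first, then add 1, so 256 is representable.
// uint16_t is an assumption standing in for the i9 type described above.
std::uint16_t tripCountExtended(std::uint8_t BackedgeTakenCount) {
  return static_cast<std::uint16_t>(
      static_cast<std::uint16_t>(BackedgeTakenCount) + 1);
}
```

With a backedge-taken count of 255, the first helper yields 0 while the second yields 256; for smaller counts the two agree.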
|
#
e3129fb7 |
| 07-Oct-2021 |
Nikita Popov <nikita.ppv@gmail.com> |
[LoopFlatten] Mark inner loop as deleted
If a loop is flattened, the inner loop is removed and the LPM should be informed of this fact, so it can invalidate associated analyses. To support this, we relax an assertion in LPMUpdater to allow invalidating non-top-level loops when running in LoopNestMode, as the pass does not know how exactly it will get scheduled.
Differential Revision: https://reviews.llvm.org/D111350
|
#
c5245dd3 |
| 07-Oct-2021 |
Nikita Popov <nikita.ppv@gmail.com> |
[LoopFlatten] Mark loop analyses as preserved
LoopFlatten does preserve loop analyses (DT, LI and SCEV), but currently doesn't mark them as preserved in the NewPM (they are marked as preserved in the LegacyPM). I think this doesn't really have an effect in the end because the loop pass adaptor will just assume they're preserved anyway, but let's be explicit about this for the sake of clarity.
Differential Revision: https://reviews.llvm.org/D111328
|
#
367df180 |
| 29-Sep-2021 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[LoopFlatten] Bail if we can't perform flattening after IV widening
It can happen that after widening of the IV, flattening may not be possible, e.g. when it is deemed unprofitable. We were not properly checking this, which resulted in flattening being applied when it shouldn't, also leading to incorrect results (miscompilation).
This should fix PR51980 (https://bugs.llvm.org/show_bug.cgi?id=51980)
Differential Revision: https://reviews.llvm.org/D110712
|
#
0ea77502 |
| 28-Sep-2021 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[LoopFlatten] Updating Phi nodes after IV widening
In rG6a076fa9539e, a problem with updating the old/narrow phi nodes after IV widening was introduced. If after widening of the IV the transformation is *not* applied, the narrow phi node was incorrectly modified, which should only happen if flattening happens. This can be seen in the added test widen-iv2.ll, which incorrectly had 1 incoming value, but should have its original 2 incoming values, which is now restored.
Differential Revision: https://reviews.llvm.org/D110234
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
#
2d26a72f |
| 11-Sep-2021 |
Eric Christopher <echristo@gmail.com> |
nullptr initialize variables, spotted on msan bots.
|
#
6a076fa9 |
| 03-Sep-2021 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[LoopFlatten] Make the analysis more robust after IV widening
LoopFlatten wasn't triggering on this motivating case after IV widening:
void foo(int *A, int N, int M) {
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < M; ++j)
      f(A[i*M+j]);
}
The reason was that the old induction phi nodes were getting in the way. These narrow and dead induction phis are not always trivially dead, and having both the narrow and wide IVs confused the analysis and caused it to bail. This adds some extra bookkeeping for these old phis, so we can filter them out when checks on phi nodes are performed. Other clean up passes will get rid of these old phis and increment instructions.
As this was one of the motivating examples from the beginning, it was surprising this wasn't triggering from C/C++ code. It looks like the IR and CFG are just slightly different.
Differential Revision: https://reviews.llvm.org/D109309
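Illustration (not part of the commit): a hand-written C++ sketch of what flattening a loop nest of this shape amounts to. It is not the pass's output. The names `sumNested` and `sumFlattened` are hypothetical, a summation replaces the call to `f`, and the 64-bit counter is an assumption standing in for the widened IV so that `N * M` cannot overflow for 32-bit `N` and `M`.

```cpp
#include <cstddef>
#include <vector>

// Nested form, same access pattern as the motivating example: A[i*M + j].
long long sumNested(const std::vector<int> &A, int N, int M) {
  long long Sum = 0;
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < M; ++j)
      Sum += A[static_cast<std::size_t>(i) * M + j];
  return Sum;
}

// Hand-flattened form: a single loop over N*M iterations.
long long sumFlattened(const std::vector<int> &A, int N, int M) {
  // The nested loops run zero times if either bound is non-positive;
  // guard explicitly so a product of two negative bounds is not iterated.
  if (N <= 0 || M <= 0)
    return 0;
  long long Sum = 0;
  // Compute the combined trip count in 64 bits (stand-in for IV widening).
  const long long Total = static_cast<long long>(N) * M;
  for (long long k = 0; k < Total; ++k)
    Sum += A[static_cast<std::size_t>(k)];
  return Sum;
}
```

Both functions visit the same elements in the same order, which is why the rewrite is only valid when the index expression is exactly `i*M + j` and the trip-count product does not overflow.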
|
Revision tags: llvmorg-13.0.0-rc2 |
|
#
e2217247 |
| 24-Aug-2021 |
Rosie Sumpter <rosie.sumpter@arm.com> |
[LoopFlatten] Add statistic for number of loops flattened. NFC
Differential Revision: https://reviews.llvm.org/D108644
|
#
d1aa0751 |
| 17-Aug-2021 |
Rosie Sumpter <rosie.sumpter@arm.com> |
[LoopFlatten] Fix assertion failure
There is an assertion failure in computeOverflowForUnsignedMul (used in checkOverflow) due to the inner and outer trip counts having different types. This occurs when the IV has been widened, but the loop components are not successfully rediscovered. This is fixed by some refactoring of the code in findLoopComponents which identifies the trip count of the loop.
Differential Revision: https://reviews.llvm.org/D108107
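Illustration (not part of the commit): the overflow question behind checkOverflow is whether the product of the outer and inner trip counts still fits in the induction variable's type. The sketch below is a standalone C++ analogue under stated assumptions, not LLVM's computeOverflowForUnsignedMul, which reasons about IR values; the name `flattenedTripCount` is hypothetical. Note that both operands share one type, which mirrors the requirement whose violation triggered the assertion.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true and stores Outer * Inner in Result if the
// product fits in 32 bits; returns false (leaving Result untouched) otherwise.
bool flattenedTripCount(std::uint32_t Outer, std::uint32_t Inner,
                        std::uint32_t &Result) {
  // Outer * Inner overflows exactly when Outer > UINT32_MAX / Inner.
  if (Inner != 0 && Outer > std::numeric_limits<std::uint32_t>::max() / Inner)
    return false;
  Result = Outer * Inner;
  return true;
}
```

A caller would bail out of flattening (or widen the counter first) when the function returns false.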
|
#
46abd1fb |
| 09-Aug-2021 |
Rosie Sumpter <rosie.sumpter@arm.com> |
[LoopFlatten] Fix assertion failure in checkOverflow
There is an assertion failure in computeOverflowForUnsignedMul (used in checkOverflow) due to the inner and outer trip counts having different types. This occurs when the IV has been widened, but the loop components are not successfully rediscovered. This is fixed by some refactoring of the code in findLoopComponents which identifies the trip count of the loop.
|
Revision tags: llvmorg-13.0.0-rc1 |
|
#
f117ed54 |
| 30-Jul-2021 |
Rosie Sumpter <rosie.sumpter@arm.com> |
[LoopFlatten] Fix missed LoopFlatten opportunity
When the limit of the inner loop is a known integer, the InstCombine pass now causes the transformation e.g. icmp ult i32 %inc, tripcount -> icmp ult %j, tripcount-step (where %j is the inner loop induction variable and %inc is add %j, step), which is now accounted for when identifying the trip count of the loop. This is also an acceptable use of %j (provided the step is 1) so is ignored as long as the compare that it's used in is also the condition of the inner branch.
Differential Revision: https://reviews.llvm.org/D105802
|
#
fab5659c |
| 29-Jul-2021 |
Rosie Sumpter <rosie.sumpter@arm.com> |
Revert "[LoopFlatten] Fix missed LoopFlatten opportunity"
This reverts commit 2df8bf9339e43de63d8d28e07182e1d6d7ffb843.
Reverting because it causes an assertion failure.
|