#
dbf6ab5e |
| 05-Jul-2022 |
Zaara Syeda <syzaara@ca.ibm.com> |
[LSR] Fix bug for optimizing unused IVs to final values
This is a fix for a crash reported for https://reviews.llvm.org/D118808. The fix is to only consider PHINodes which are induction phis. Fixes #55529
Differential Revision: https://reviews.llvm.org/D125990
|
#
65d59b42 |
| 01-Jul-2022 |
Nikita Popov <npopov@redhat.com> |
[LoopDeletion] Fix deletion with unusual predecessor terminator (PR56266)
LoopSimplify only requires that the loop predecessor has a single successor and is safe to hoist into -- it doesn't necessarily have to be an unconditional BranchInst.
Adjust LoopDeletion to assert conditions closer to what it actually needs for correctness, namely a single successor and a side-effect-free terminator (as the terminator is getting dropped).
Fixes https://github.com/llvm/llvm-project/issues/56266.
|
#
a7938c74 |
| 26-Jun-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
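As a rough illustration of the kind of rewrite this describes, here is a minimal sketch that uses `std::optional` as a stand-in for `llvm::Optional` (an assumption made so the example is self-contained; `std::optional` spells the query `has_value`). The function names are hypothetical.

```cpp
#include <cassert>
#include <optional>

// Explicit presence query, the style being replaced.
int valueOrDefaultExplicit(const std::optional<int> &V, int Default) {
  if (V.has_value())
    return *V;
  return Default;
}

// Contextual conversion to bool inside the conditional. It tests presence,
// not the stored value, so an engaged optional holding 0 still takes the branch.
int valueOrDefaultImplicit(const std::optional<int> &V, int Default) {
  if (V)
    return *V;
  return Default;
}
```

Both functions behave identically; only the spelling of the condition differs, which is consistent with the change being marked NFC.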
|
#
3b7c3a65 |
| 25-Jun-2022 |
Kazu Hirata <kazu@google.com> |
Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
|
#
aa8feeef |
| 25-Jun-2022 |
Kazu Hirata <kazu@google.com> |
Don't use Optional::hasValue (NFC)
|
Revision tags: llvmorg-14.0.6 |
|
#
e0e687a6 |
| 20-Jun-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Don't use Optional::hasValue (NFC)
|
#
129b531c |
| 19-Jun-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Use value_or instead of getValueOr (NFC)
|
#
e9cced27 |
| 17-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
Recommit "[LAA] Initial support for runtime checks with pointer selects."
This reverts commit 7aa8a678826dea86ff3e6c7df9d2a8a6ef868f5d.
This version includes fixes to address issues uncovered after the commit landed and discussed at D11448.
Those include:
* Limit select-traversal to selects inside the loop.
* Freeze pointers resulting from looking through selects to avoid branch-on-poison.
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
#
d77f9448 |
| 10-Apr-2021 |
Nikita Popov <nikita.ppv@gmail.com> |
[LoopInfo] Add getOutermostLoop() (NFC)
This is a recurring pattern, add an API function for it.
|
#
10f41a21 |
| 25-May-2022 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Fix PR55688: Miscompile due to incorrect nuw/nsw handling.
All ReductionOps must be used when propagating flags for the reduction ops; otherwise the transformation is not correct. In addition, the nuw/nsw flags need to be dropped.
Differential Revision: https://reviews.llvm.org/D126371
|
#
b7315ffc |
| 16-May-2022 |
Florian Hahn <flo@fhahn.com> |
[LAA,LV] Add initial support for pointer-diff memory checks.
This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable.
The main idea is to only check whether the source and sink of a dependency are far enough apart that the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to `VF * UF * AccessSize`. Checking `(Sink - Src) <u VF * UF * AccessSize` is sufficient to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is no dependence preventing vectorization, hence the overflow should not matter and using ULT should be sufficient.
Note that the initial version is restricted in multiple ways:
1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information).
2. Source and sink pointers must be add-recs, with matching steps.
3. The step must be a constant.
4. abs(step) == AccessSize.
Most of those restrictions can be relaxed in the future.
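The check described above can be sketched on plain integers as follows. This is only an illustration, not the code the patch emits: `pointerDiffMayConflict` is a hypothetical helper, addresses are modelled as 64-bit unsigned values, and a `true` result is assumed to mean "the accesses may conflict, so the vector loop should not be taken".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): evaluates
//   (Sink - Src) <u VF * UF * AccessSize
// on raw addresses. Returns true when the accesses may conflict.
bool pointerDiffMayConflict(std::uint64_t Src, std::uint64_t Sink,
                            std::uint64_t VF, std::uint64_t UF,
                            std::uint64_t AccessSize) {
  assert(VF > 0 && UF > 0 && AccessSize > 0 && "factors must be non-zero");
  // Unsigned subtraction wraps when Src >u Sink, producing a very large
  // difference, so that case falls out as "no conflict" without a separate
  // comparison.
  const std::uint64_t Diff = Sink - Src;
  return Diff < VF * UF * AccessSize;
}
```

For example, with VF = 4, UF = 1 and 4-byte accesses the threshold is 16 bytes: a sink 4 bytes past the source is flagged, a sink 16 bytes past the source is not, and a sink located before the source is not flagged because the difference wraps around.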
See https://github.com/llvm/llvm-project/issues/53590.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D119078
|
#
a494ae43 |
| 01-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output: before: 1065307662 after: 1064800684
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
|
#
e00158ed |
| 15-Jan-2022 |
Florian Hahn <flo@fhahn.com> |
[LoopUtils] Use InstSimplifyFolder in addRuntimeChecks.
Use the InstSimplifyFolder introduced earlier to perform initial simplification during runtime check construction.
|
#
f632c494 |
| 17-Dec-2021 |
Philip Reames <listmail@philipreames.com> |
Extract a helper function for computing estimate trip count of an exiting branch
The plan is to use this in a following change to support estimated trip counts derived from multiple loop exits.
|
#
0d13f94c |
| 09-Dec-2021 |
Philip Reames <listmail@philipreames.com> |
[reductions] Delete another piece of dead flag handling [NFC]
The code claimed to handle nsw/nuw, but those aren't passed via builder state and the explicit IR construction just above never sets them.
The only case this bit of code is actually relevant for is FMF flags. However, dropPoisonGeneratingFlags currently doesn't know about FMF at all, so this was a no-op. It's also unneeded, as the caller explicitly configures the flags on the builder before this call, and the flags on the individual ops should be controlled by the intrinsic flags anyway. If any of the flags aren't safe to propagate, the caller needs to make that change.
|
#
b24db85c |
| 09-Dec-2021 |
Philip Reames <listmail@philipreames.com> |
[recurrence] Delete dead flag/fmf handling [NFC]
The recurrence lowering code has handling which claims to be about flag intersection, but all the callers pass empty arrays to the arguments. The sole exception is a caller of a method which has the argument, but no implementation.
I don't know what the intent was here, but it certainly doesn't actually do anything today.
|
#
2d31b025 |
| 09-Dec-2021 |
Philip Reames <listmail@philipreames.com> |
Compute estimated trip counts for multiple exit loops
This change allows us to estimate the trip count from profile metadata for all multiple exit loops. We still do the estimate only from the latch, but that's fine as it causes us to overestimate the trip count at worst.
Reviewing the uses of the API, all but one are cases where we restrict a loop transformation (unroll and vectorize, respectively) when we know the trip count is short enough. So, as a result, the change makes these passes strictly less aggressive. The test change illustrates a case where we'd previously have runtime unrolled a loop which ran fewer iterations than the unroll factor. This is definitely unprofitable.
The one case where an upper bound on the estimated trip count could drive a more aggressive transform is peeling, and I duplicated the logic being removed from the generic estimation there to keep it the same. The resulting heuristic makes no sense and should probably be immediately removed, but we can do that in a separate change.
This was noticed when analyzing regressions on D113939.
I plan to come back and incorporate estimated trip counts from other exits, but that's a minor improvement which can follow separately.
Differential Revision: https://reviews.llvm.org/D115362
|
#
c2441b6b |
| 11-Oct-2021 |
Rosie Sumpter <rosie.sumpter@arm.com> |
[LoopVectorize] Add vector reduction support for fmuladd intrinsic
Enables LoopVectorize to handle reduction patterns involving the llvm.fmuladd intrinsic.
Differential Revision: https://reviews.llvm.org/D111555
|
#
d1abf481 |
| 20-Nov-2021 |
Kazu Hirata <kazu@google.com> |
[llvm] Use range-based for loops (NFC)
|
#
0d182d9d |
| 08-Nov-2021 |
Kazu Hirata <kazu@google.com> |
[Transforms] Use make_early_inc_range (NFC)
|
#
8977bd58 |
| 20-Oct-2021 |
Florian Hahn <flo@fhahn.com> |
[IndVars] Invalidate SCEV when IR is changed in rewriteLoopExitValue.
At the moment, rewriteLoopExitValue forgets the current phi node in the loop that collects phis to rewrite. A few lines after the value is forgotten, SCEV is used again to analyze incoming values and potentially expand SCEV expressions. This means that another SCEV is created for PN before the IR is actually updated in the next loop.
This leads to accessing an invalid cached expression in combination with D71539.
PN should only be changed once the actual incoming exit value is set in the next loop. Moving invalidation there should ensure that PN is invalidated in all relevant cases.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D111495
|
#
e844f053 |
| 18-Oct-2021 |
Florian Hahn <flo@fhahn.com> |
[LoopUtils] Simplify addRuntimeCheck to return a single value.
This simplifies the return value of addRuntimeChecks from a pair of instructions to a single `Value *`.
The existing users of addRuntimeChecks were ignoring the first element of the pair, hence there is no reason to track FirstInst and return it.
Additionally all users of addRuntimeChecks use the second returned `Instruction *` just as `Value *`, so there is no need to return an `Instruction *`. Therefore there is no need to create a redundant dummy `and X, true` instruction any longer.
Effectively this change should not impact the generated code because the redundant AND will be folded by later optimizations. But it is easy to avoid creating it in the first place and it allows more accurately estimating the cost of the runtime checks.
|
#
26b7d9d6 |
| 04-Aug-2021 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve selecting an integer value based on a previous comparison. Consider the following C++ loop:
  int r = a;
  for (int i = 0; i < n; i++) {
    if (src[i] > 3) {
      r = b;
    }
    src[i] += 2;
  }
We should be able to vectorise this loop because all we are doing is selecting between two states - 'a' and 'b' - both of which are loop invariant. This just involves building a vector of values that contain either 'a' or 'b', where the final reduced value will be 'b' if any lane contains 'b'.
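The lane-wise idea in the paragraph above can be emulated in scalar C++ as a rough sketch. Everything here is illustrative rather than what LoopVectorize generates: the function names, the fixed width `VF = 4`, and the use of a per-lane boolean (instead of a vector holding 'a' or 'b') are assumptions made to keep the example small.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vector width

// Reference semantics: the original scalar loop.
int selectCmpScalar(std::vector<int> &src, int a, int b) {
  int r = a;
  for (std::size_t i = 0; i < src.size(); ++i) {
    if (src[i] > 3)
      r = b;
    src[i] += 2;
  }
  return r;
}

// Lane-wise emulation: each lane remembers whether it ever selected 'b';
// the final value is 'b' if any lane did, otherwise 'a'.
int selectCmpLanes(std::vector<int> &src, int a, int b) {
  std::array<bool, VF> pickedB{}; // all lanes start in state 'a'
  std::size_t i = 0;
  for (; i + VF <= src.size(); i += VF) {
    for (std::size_t lane = 0; lane < VF; ++lane) {
      pickedB[lane] = pickedB[lane] || (src[i + lane] > 3);
      src[i + lane] += 2;
    }
  }
  // Horizontal OR-reduction across lanes.
  bool any = false;
  for (bool p : pickedB)
    any = any || p;
  int r = any ? b : a;
  // Scalar epilogue for the remaining iterations.
  for (; i < src.size(); ++i) {
    if (src[i] > 3)
      r = b;
    src[i] += 2;
  }
  return r;
}
```

Because both 'a' and 'b' are loop invariant, the order in which lanes observe a true comparison does not matter, which is why a simple "any lane" reduction recovers the scalar result.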
The IR generated by clang typically looks like this:
  %phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ]
  ...
  %pred = icmp ugt i32 %val, 3
  %phi.update = select i1 %pred, i32 %b, i32 %phi
We already detect min/max patterns, which also involve a select + cmp. However, with the min/max patterns we are selecting loaded values (and hence loop variant) in the loop. In addition we only support certain cmp predicates. This patch adds a new pattern matching function (isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp. We only support selecting values that are integer and loop invariant; however, we can support any kind of compare - integer or float.
Tests have been added here:
  Transforms/LoopVectorize/AArch64/sve-select-cmp.ll
  Transforms/LoopVectorize/select-cmp-predicated.ll
  Transforms/LoopVectorize/select-cmp.ll
Differential Revision: https://reviews.llvm.org/D108136
|
#
0dcd2b40 |
| 06-Oct-2021 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[TTI] Remove default condition type and predicate arguments from getCmpSelInstrCost
We need to be better at exposing the comparison predicate to getCmpSelInstrCost calls as some targets (e.g. X86 SSE) have very different costs for different comparisons (PR48337), and we can't always rely on the optional Instruction argument.
This initial commit requires explicit condition type and predicate arguments. The next step will be to review a lot of the existing getCmpSelInstrCost calls which have used BAD_ICMP_PREDICATE even when the predicate is known.
Differential Revision: https://reviews.llvm.org/D111024
|
#
685f1bfd |
| 01-Oct-2021 |
Krasimir Georgiev <krasimir@google.com> |
Revert "[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns"
It appears to cause stage2 clang build failures, e.g., https://lab.llvm.org/buildbot/#/builders/74/builds/7145.
This reverts commit 1fb37334bdb3cdb028977382fbd84cebde64ebb2.
|