#
72ee6945 |
| 24-Jun-2017 |
Craig Topper <craig.topper@gmail.com> |
[Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC
llvm-svn: 306204
|
#
4ab0f491 |
| 23-Jun-2017 |
Chandler Carruth <chandlerc@gmail.com> |
[LoopSimplify] Factor the logic to form dedicated exits into a utility.
I want to use the same logic as LoopSimplify to form dedicated exits in another pass (SimpleLoopUnswitch) so I wanted to facto
[LoopSimplify] Factor the logic to form dedicated exits into a utility.
I want to use the same logic as LoopSimplify to form dedicated exits in another pass (SimpleLoopUnswitch) so I wanted to factor it out here.
I also noticed that there is a pretty significantly more efficient way to implement this than the way the code in LoopSimplify worked. We don't need to actually retain the set of unique exit blocks, we can just rewrite them as we find them and use only a set to deduplicate.
This did require changing one part of LoopSimplify to not re-use the unique set of exits, but it only used it to check that there was a single unique exit. That part of the code is about to walk the exiting blocks anyways, so it seemed better to rewrite it to use those exiting blocks to compute this property on-demand.
I also had to ditch a statistic, but it doesn't seem terribly valuable.
Differential Revision: https://reviews.llvm.org/D34049
llvm-svn: 306081
show more ...
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
6bda14b3 |
| 06-Jun-2017 |
Chandler Carruth <chandlerc@gmail.com> |
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
show more ...
|
Revision tags: llvmorg-4.0.1-rc2 |
|
#
836b0f48 |
| 10-May-2017 |
Amara Emerson <amara.emerson@arm.com> |
Add a late IR expansion pass for the experimental reduction intrinsics.
This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shuffevector sequence.
Diff
Add a late IR expansion pass for the experimental reduction intrinsics.
This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shuffevector sequence.
Differential Revision: https://reviews.llvm.org/D32245
llvm-svn: 302631
show more ...
|
#
cf9daa33 |
| 09-May-2017 |
Amara Emerson <amara.emerson@arm.com> |
Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vec
Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added.
Differential Revision: https://reviews.llvm.org/D30086
llvm-svn: 302514
show more ...
|
Revision tags: llvmorg-4.0.1-rc1 |
|
#
58ccc094 |
| 24-Apr-2017 |
Evgeniy Stepanov <eugeni.stepanov@gmail.com> |
Revert "Compute safety information in a much finer granularity."
Use-after-free in llvm::isGuaranteedToExecute.
llvm-svn: 301214
|
#
a266923d |
| 24-Apr-2017 |
Xin Tong <trent.xin.tong@gmail.com> |
Compute safety information in a much finer granularity.
Summary: Instead of keeping a variable indicating whether there are early exits in the loop. We keep all the early exits. This improves LICM'
Compute safety information in a much finer granularity.
Summary: Instead of keeping a variable indicating whether there are early exits in the loop. We keep all the early exits. This improves LICM's ability to move instructions out of the loop based on is-guaranteed-to-execute.
I am going to update compilation time as well soon.
Reviewers: hfinkel, sanjoy, efriedma, mkuper
Reviewed By: hfinkel
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D32433
llvm-svn: 301196
show more ...
|
#
dcdb325f |
| 13-Apr-2017 |
Anna Thomas <anna@azul.com> |
[LV] Fix the vector code generation for first order recurrence
Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the secon
[LV] Fix the vector code generation for first order recurrence
Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396.
Reviewers: mssimpso, mkuper, anemet
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D31979
llvm-svn: 300238
show more ...
|
#
00dc1b74 |
| 11-Apr-2017 |
Anna Thomas <anna@azul.com> |
[LV] Avoid vectorizing first order recurrence when phi uses are outside loop
In the vectorization of first order recurrence, we vectorize such that the last element in the vector will be the one ext
[LV] Avoid vectorizing first order recurrence when phi uses are outside loop
In the vectorization of first order recurrence, we vectorize such that the last element in the vector will be the one extracted to pass into the scalar remainder loop. However, this is not true when there is a phi (other than the primary induction variable) is used outside the loop. In such a case, we need the value from the second last iteration (i.e. the phi value), not the last iteration (which would be the phi update). I've added a test case for this. Also see PR32396.
A follow up patch would generate the correct code gen for such cases, and turn this vectorization on.
Differential Revision: https://reviews.llvm.org/D31910
Reviewers: mssimpso llvm-svn: 299985
show more ...
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2 |
|
#
0de990da |
| 18-Jan-2017 |
Michael Kuperstein <mkuper@google.com> |
Fix up a comment. NFC.
llvm-svn: 292425
|
#
7cefb409 |
| 18-Jan-2017 |
Michael Kuperstein <mkuper@google.com> |
[LV] Allow reductions that have several uses outside the loop
We currently check whether a reduction has a single outside user. We don't really need to require that - we just need to make sure a sin
[LV] Allow reductions that have several uses outside the loop
We currently check whether a reduction has a single outside user. We don't really need to require that - we just need to make sure a single value is used externally. The number of external users of that value shouldn't actually matter.
Differential Revision: https://reviews.llvm.org/D28830
llvm-svn: 292424
show more ...
|
Revision tags: llvmorg-4.0.0-rc1 |
|
#
ee31cbe3 |
| 10-Jan-2017 |
Michael Kuperstein <mkuper@google.com> |
[LV] Don't panic when encountering the IV of an outer loop.
Bail out instead of asserting when we encounter this situation, which can actually happen.
The reason the test uses the new PM is that th
[LV] Don't panic when encountering the IV of an outer loop.
Bail out instead of asserting when we encounter this situation, which can actually happen.
The reason the test uses the new PM is that the "bad" phi, incidentally, gets cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve loop simplify form, so the cleanup has no chance to run.
This fixes PR31190. We may want to solve this in a less conservative manner, since this phi is actually uniform within the inner loop (or we may want LICM to output a cleaner promotion to begin with).
Differential Revision: https://reviews.llvm.org/D28490
llvm-svn: 291589
show more ...
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3 |
|
#
997dac87 |
| 03-Dec-2016 |
Michael Kuperstein <mkuper@google.com> |
Remove stale comment. NFC.
llvm-svn: 288572
|
Revision tags: llvmorg-3.9.1-rc2 |
|
#
b151a641 |
| 30-Nov-2016 |
Michael Kuperstein <mkuper@google.com> |
[LoopUnroll] Implement profile-based loop peeling
This implements PGO-driven loop peeling.
The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we
[LoopUnroll] Implement profile-based loop peeling
This implements PGO-driven loop peeling.
The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we can expect a performance win by peeling off the first several iterations of that loop. Unlike unrolling based on a known trip count, or a trip count multiple, this doesn't save us the conditional check and branch on each iteration. However, it does allow us to simplify the straight-line code we get (constant-folding, etc.). This is important given that we know that we will usually only hit this code, and not the actual loop.
This is currently disabled by default.
Differential Revision: https://reviews.llvm.org/D25963
llvm-svn: 288274
show more ...
|
Revision tags: llvmorg-3.9.1-rc1 |
|
#
41d72a86 |
| 17-Nov-2016 |
Dehao Chen <dehao@google.com> |
Use profile info to adjust loop unroll threshold.
Summary: For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold. For hot loop, w
Use profile info to adjust loop unroll threshold.
Summary: For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold. For hot loop, we set a higher unroll threshold and allows expensive tripcount computation to allow more aggressive unrolling.
Reviewers: davidxl, mzolotukhin
Subscribers: sanjoy, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D26527
llvm-svn: 287186
show more ...
|
#
c3ccf5d7 |
| 28-Oct-2016 |
Igor Laevsky <igmyrj@gmail.com> |
[LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop which was processed on the current iteration.
Differenti
[LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop which was processed on the current iteration.
Differential Revision: https://reviews.llvm.org/D25873
llvm-svn: 285394
show more ...
|
#
4f155b6e |
| 26-Aug-2016 |
Adam Nemet <anemet@apple.com> |
[LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass
We can't mark ORE (a function pass) preserved as required by the loop passes because that is how we ensure that the requ
[LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass
We can't mark ORE (a function pass) preserved as required by the loop passes because that is how we ensure that the required passes like LazyBFI are all available any time ORE is used. See the new comments in the patch.
Instead we use it directly just like the inliner does in D22694.
As expected there is some additional overhead after removing the caching provided by analysis passes. The worst case, I measured was LNT/CINT2006_ref/401.bzip2 which regresses by 12%. As before, this only affects -Rpass-with-hotness and not default compilation.
llvm-svn: 279829
show more ...
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2 |
|
#
42531260 |
| 12-Aug-2016 |
David Majnemer <david.majnemer@gmail.com> |
Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just use is_contained instead.
No functionality change is inten
Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just use is_contained instead.
No functionality change is intended.
llvm-svn: 278469
show more ...
|
#
0a16c228 |
| 11-Aug-2016 |
David Majnemer <david.majnemer@gmail.com> |
Use range algorithms instead of unpacking begin/end
No functionality change is intended.
llvm-svn: 278417
|
Revision tags: llvmorg-3.9.0-rc1 |
|
#
12937c36 |
| 29-Jul-2016 |
Adam Nemet <anemet@apple.com> |
[LoopUnroll] Include hotness of region in opt remark
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter is added to the common function analysis passes that loop passes depend o
[LoopUnroll] Include hotness of region in opt remark
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter is added to the common function analysis passes that loop passes depend on.
The BFI and indirectly BPI used in this pass is computed lazily so no overhead should be observed unless -pass-remarks-with-hotness is used.
This is how the patch affects the O3 pipeline:
Dominator Tree Construction Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Rotate Loops Loop Invariant Code Motion Unswitch loops Simplify the CFG Dominator Tree Construction Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Combine redundant instructions Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Induction Variable Simplification Recognize loop idioms Delete dead loops Unroll loops ...
llvm-svn: 277203
show more ...
|
#
2f2bd8ca |
| 26-Jul-2016 |
Adam Nemet <anemet@apple.com> |
[LoopUtils] Sort headers
llvm-svn: 276776
|
#
376a18bd |
| 24-Jul-2016 |
Elena Demikhovsky <elena.demikhovsky@intel.com> |
[Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this: float *A; float x = init; for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc;
[Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this: float *A; float x = init; for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc; }
The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.
Differential Revision: https://reviews.llvm.org/D21330
llvm-svn: 276554
show more ...
|
#
f1da33e4 |
| 11-Jun-2016 |
Eli Friedman <eli.friedman@gmail.com> |
[LICM] Make isGuaranteedToExecute more accurate.
Summary: Make isGuaranteedToExecute use the isGuaranteedToTransferExecutionToSuccessor helper, and make that helper a bit more accurate.
There's a p
[LICM] Make isGuaranteedToExecute more accurate.
Summary: Make isGuaranteedToExecute use the isGuaranteedToTransferExecutionToSuccessor helper, and make that helper a bit more accurate.
There's a potential performance impact here from assuming that arbitrary calls might not return. This probably has little impact on loads and stores to a pointer because most things alias analysis can reason about are dereferenceable anyway. The other impacts, like less aggressive hoisting of sdiv by a variable and less aggressive hoisting around volatile memory operations, are unlikely to matter for real code.
This also impacts SCEV, which uses the same helper. It's a minor improvement there because we can tell that, for example, memcpy always returns normally. Strictly speaking, it's also introducing a bug, but it's not any worse than everywhere else we assume readonly functions terminate.
Fixes http://llvm.org/PR27857.
Reviewers: hfinkel, reames, chandlerc, sanjoy
Subscribers: broune, llvm-commits
Differential Revision: http://reviews.llvm.org/D21167
llvm-svn: 272489
show more ...
|
#
122f984a |
| 10-Jun-2016 |
Evgeniy Stepanov <eugeni.stepanov@gmail.com> |
Move isGuaranteedToExecute out of LICM.
Also rename LICMSafetyInfo to LoopSafetyInfo. Both will be used in LoopUnswitch in a separate change.
llvm-svn: 272420
|
#
e12c487b |
| 09-Jun-2016 |
Easwaran Raman <eraman@google.com> |
[PM] Port LCSSA to the new PM.
Differential Revision: http://reviews.llvm.org/D21090
llvm-svn: 272294
|