#
bbdcc821 |
| 02-Aug-2019 |
Serguei Katkov <serguei.katkov@azul.com> |
[Loop Peeling] Do not close further unroll/peel if profile based peeling was not used.
Current peeling cost model can decide to peel off not all iterations but only some of them to eliminate conditi
[Loop Peeling] Do not close further unroll/peel if profile based peeling was not used.
Current peeling cost model can decide to peel off not all iterations but only some of them to eliminate conditions on phi. At the same time if any peeling happens the door for further unroll/peel optimizations on that loop closes because the part of the code thinks that if peeling happened it is profile based peeling and all iterations are peeled off.
To resolve this inconsistency the patch provides the flag which states whether the full peeling basing on profile is enabled or not and peeling cost model is able to modify this field like it does not PeelCount.
In a separate patch I will introduce an option to allow/disallow peeling basing on profile.
To avoid infinite loop peeling the patch tracks the total number of peeled iteration through llvm.loop.peeled.count loop metadata.
Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D64972
llvm-svn: 367647
show more ...
|
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2 |
|
#
d82ddfa7 |
| 23-May-2019 |
Alina Sbirlea <asbirlea@google.com> |
[NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.
Reviewers: chandlerc
Subscribers: mehdi_amini, jlebar,
[NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.
Reviewers: chandlerc
Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61612
llvm-svn: 361560
show more ...
|
Revision tags: llvmorg-8.0.1-rc1 |
|
#
f31eba64 |
| 08-May-2019 |
Alina Sbirlea <asbirlea@google.com> |
[MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary: Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available. Do not preserve it in the new pass manager.
[MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary: Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available. Do not preserve it in the new pass manager. Update tests.
Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60833
llvm-svn: 360270
show more ...
|
#
da0f71af |
| 18-Apr-2019 |
Alina Sbirlea <asbirlea@google.com> |
[LoopUnroll] Move list of params into a struct [NFCI].
Summary: Cleanup suggested in review of r358304.
Reviewers: sanjoy, efriedma
Subscribers: jlebar, zzheng, dmgreen, llvm-commits
Tags: #llvm
[LoopUnroll] Move list of params into a struct [NFCI].
Summary: Cleanup suggested in review of r358304.
Reviewers: sanjoy, efriedma
Subscribers: jlebar, zzheng, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60638
llvm-svn: 358723
show more ...
|
#
893aea58 |
| 17-Apr-2019 |
Florian Hahn <flo@fhahn.com> |
[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.
Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip count
[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.
Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling.
We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general.
Reviewers: vsk, efriedma, dmgreen, paquette
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D60265
llvm-svn: 358586
show more ...
|
#
09e539fc |
| 15-Apr-2019 |
Hiroshi Yamauchi <yamauchi@google.com> |
[PGO] Profile guided code size optimization.
Summary: Enable some of the existing size optimizations for cold code under PGO.
A ~5% code size saving in big internal app under PGO.
The way it gets
[PGO] Profile guided code size optimization.
Summary: Enable some of the existing size optimizations for cold code under PGO.
A ~5% code size saving in big internal app under PGO.
The way it gets BFI/PSI is discussed in the RFC thread
http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html
Note it doesn't currently touch loop passes.
Reviewers: davidxl, eraman
Reviewed By: eraman
Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59514
llvm-svn: 358422
show more ...
|
#
2312a06c |
| 12-Apr-2019 |
Alina Sbirlea <asbirlea@google.com> |
[SCEV] Add option to forget everything in SCEV.
Summary: Create a method to forget everything in SCEV. Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll.
Motivation: Certain Hal
[SCEV] Add option to forget everything in SCEV.
Summary: Create a method to forget everything in SCEV. Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll.
Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch. Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build. Testcase showcasing this cannot be opensourced and is fairly large.
The option disabled by default, but it may be desirable to enable by default. Evidence in favor (two difference runs on different days/ToT state):
File Before (s) After (s) clang-9.bc 7267.91 6639.14 llvm-as.bc 194.12 194.12 llvm-dis.bc 62.50 62.50 opt.bc 1855.85 1857.53
File Before (s) After (s) clang-9.bc 8588.70 7812.83 llvm-as.bc 196.20 194.78 llvm-dis.bc 61.55 61.97 opt.bc 1739.78 1886.26
Reviewers: sanjoy
Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60144
llvm-svn: 358304
show more ...
|
#
85bd3978 |
| 04-Apr-2019 |
Evandro Menezes <e.menezes@samsung.com> |
[IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.
Differential revision: https://reviews.llvm.org/D60287
llvm-svn: 357731
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
#
3284775b |
| 18-Dec-2018 |
Michael Kruse <llvm@meinersbur.de> |
[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.
When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not not added to the (legacy) pass pip
[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.
When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll or WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics.
This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses.
The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply.
Differential Revision: https://reviews.llvm.org/D55716
llvm-svn: 349509
show more ...
|
#
72448525 |
| 12-Dec-2018 |
Michael Kruse <llvm@meinersbur.de> |
[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.
When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of thei
[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.
When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.
#pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable)
is the same as
#pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable)
and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.
This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,
!0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"}
defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.
Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.
For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.
Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.
To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.
With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).
Reviewed By: hfinkel, dmgreen
Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288
llvm-svn: 348944
show more ...
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
#
412ed347 |
| 31-Oct-2018 |
Fedor Sergeev <fedor.sergeev@azul.com> |
[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll
Unlike its legacy counterpart new pass manager's LoopUnrollPass does not provide any means to select which flavors of unro
[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll
Unlike its legacy counterpart new pass manager's LoopUnrollPass does not provide any means to select which flavors of unroll to run (runtime, peeling, partial), relying on global defaults.
In some cases having ability to run a restricted LoopUnroll that does more than LoopFullUnroll is needed.
Introduced LoopUnrollOptions to select optional unroll behaviors. Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.
Reviewers: chandlerc, tejohnson Differential Revision: https://reviews.llvm.org/D53440
llvm-svn: 345723
show more ...
|
#
edb12a83 |
| 15-Oct-2018 |
Chandler Carruth <chandlerc@gmail.com> |
[TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`.
This is the biggest remaining chunk of the usage of `getTerm
[TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`.
This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked).
llvm-svn: 344502
show more ...
|
#
d2261f61 |
| 05-Oct-2018 |
Neil Henning <neil.henning@amd.com> |
Add missing period to comment to match style of file.
This is a test commit to show that my commit access is working.
llvm-svn: 343842
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1 |
|
#
f78650a8 |
| 30-Jul-2018 |
Fangrui Song <maskray@google.com> |
Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}
llvm-svn: 338293
|
#
963401d2 |
| 01-Jul-2018 |
David Green <david.green@arm.com> |
[UnrollAndJam] New Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop optimisation.
The basic idea is that we take an outer loop of the form:
for i.. Fo
[UnrollAndJam] New Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop optimisation.
The basic idea is that we take an outer loop of the form:
for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i)
Instead of doing normal inner or outer unrolling, we unroll as follows:
for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop
So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops.
To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j).
Differential Revision: https://reviews.llvm.org/D41953
llvm-svn: 336062
show more ...
|
#
2c1a570a |
| 26-Jun-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
LoopUnroll: Allow analyzing intrinsic call costs
I'm not sure why the code here is skipping calls since TTI does try to do something for general calls, but it at least should allow intrinsics.
Skip
LoopUnroll: Allow analyzing intrinsic call costs
I'm not sure why the code here is skipping calls since TTI does try to do something for general calls, but it at least should allow intrinsics.
Skip intrinsics that should not be omitted as calls, which is by far the most common case on AMDGPU.
llvm-svn: 335645
show more ...
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3 |
|
#
f209649d |
| 14-Jun-2018 |
Hiroshi Inoue <inouehrs@jp.ibm.com> |
[NFC] fix trivial typos in comments
llvm-svn: 334687
|
Revision tags: llvmorg-6.0.1-rc2 |
|
#
aee7ad0c |
| 27-May-2018 |
David Green <david.green@arm.com> |
Revert 333358 as it's failing on some builders.
I'm guessing the tests reply on the ARM backend being built.
llvm-svn: 333359
|
#
3034281b |
| 27-May-2018 |
David Green <david.green@arm.com> |
[UnrollAndJam] Add a new Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop optimisation.
The basic idea is that we take an outer loop of the form:
for i..
[UnrollAndJam] Add a new Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop optimisation.
The basic idea is that we take an outer loop of the form:
for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i)
Instead of doing normal inner or outer unrolling, we unroll as follows:
for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder
So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops.
To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j).
Differential Revision: https://reviews.llvm.org/D41953
llvm-svn: 333358
show more ...
|
#
d34e60ca |
| 14-May-2018 |
Nicola Zaghen <nicola.zaghen@imgtec.com> |
Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/
Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
show more ...
|
#
5f8f34e4 |
| 01-May-2018 |
Adrian Prantl <aprantl@apple.com> |
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they ar
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
show more ...
|
Revision tags: llvmorg-6.0.1-rc1 |
|
#
1376d934 |
| 03-Apr-2018 |
Ikhlas Ajbar <iajbar@codeaurora.org> |
[Hexagon] peel loops with runtime small trip counts
Move the check canPeel() to Hexagon Target before setting PeelCount.
Differential Revision: https://reviews.llvm.org/D44880
llvm-svn: 329129
|
#
b7322e8a |
| 03-Apr-2018 |
Ikhlas Ajbar <iajbar@codeaurora.org> |
peel loops with runtime small trip counts
For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount se
peel loops with runtime small trip counts
For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count.
Differential Revision: https://reviews.llvm.org/D44880
llvm-svn: 329042
show more ...
|
Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2 |
|
#
a373d18e |
| 28-Mar-2018 |
David Blaikie <dblaikie@gmail.com> |
Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, bec
Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils.
llvm-svn: 328717
show more ...
|