History log of /llvm-project/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (Results 101 – 125 of 369)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# bbdcc821 02-Aug-2019 Serguei Katkov <serguei.katkov@azul.com>

[Loop Peeling] Do not close further unroll/peel if profile based peeling was not used.

Current peeling cost model can decide to peel off not all iterations
but only some of them to eliminate conditi

[Loop Peeling] Do not close further unroll/peel if profile based peeling was not used.

Current peeling cost model can decide to peel off not all iterations
but only some of them to eliminate conditions on phi. At the same time
if any peeling happens the door for further unroll/peel optimizations on that
loop closes because the part of the code thinks that if peeling happened
it is profile based peeling and all iterations are peeled off.

To resolve this inconsistency the patch provides the flag which states whether
the full peeling basing on profile is enabled or not and peeling cost model
is able to modify this field like it does not PeelCount.

In a separate patch I will introduce an option to allow/disallow peeling basing
on profile.

To avoid infinite loop peeling the patch tracks the total number of peeled iteration
through llvm.loop.peeled.count loop metadata.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64972

llvm-svn: 367647

show more ...


Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2
# d82ddfa7 23-May-2019 Alina Sbirlea <asbirlea@google.com>

[NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].

Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar,

[NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].

Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61612

llvm-svn: 361560

show more ...


Revision tags: llvmorg-8.0.1-rc1
# f31eba64 08-May-2019 Alina Sbirlea <asbirlea@google.com>

[MemorySSA] Teach LoopSimplify to preserve MemorySSA.

Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.

[MemorySSA] Teach LoopSimplify to preserve MemorySSA.

Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.

Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60833

llvm-svn: 360270

show more ...


# da0f71af 18-Apr-2019 Alina Sbirlea <asbirlea@google.com>

[LoopUnroll] Move list of params into a struct [NFCI].

Summary: Cleanup suggested in review of r358304.

Reviewers: sanjoy, efriedma

Subscribers: jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

[LoopUnroll] Move list of params into a struct [NFCI].

Summary: Cleanup suggested in review of r358304.

Reviewers: sanjoy, efriedma

Subscribers: jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60638

llvm-svn: 358723

show more ...


# 893aea58 17-Apr-2019 Florian Hahn <flo@fhahn.com>

[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.

Summary:
In the following cases, unrolling can be beneficial, even when
optimizing for code size:
1) very low trip count

[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.

Summary:
In the following cases, unrolling can be beneficial, even when
optimizing for code size:
1) very low trip counts
2) potential to constant fold most instructions after fully unrolling.

We can unroll in those cases, by setting the unrolling threshold to the
loop size. This might highlight some cost modeling issues and fixing
them will have a positive impact in general.

Reviewers: vsk, efriedma, dmgreen, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D60265

llvm-svn: 358586

show more ...


# 09e539fc 15-Apr-2019 Hiroshi Yamauchi <yamauchi@google.com>

[PGO] Profile guided code size optimization.

Summary:
Enable some of the existing size optimizations for cold code under PGO.

A ~5% code size saving in big internal app under PGO.

The way it gets

[PGO] Profile guided code size optimization.

Summary:
Enable some of the existing size optimizations for cold code under PGO.

A ~5% code size saving in big internal app under PGO.

The way it gets BFI/PSI is discussed in the RFC thread

http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html

Note it doesn't currently touch loop passes.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59514

llvm-svn: 358422

show more ...


# 2312a06c 12-Apr-2019 Alina Sbirlea <asbirlea@google.com>

[SCEV] Add option to forget everything in SCEV.

Summary:
Create a method to forget everything in SCEV.
Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll.

Motivation: Certain Hal

[SCEV] Add option to forget everything in SCEV.

Summary:
Create a method to forget everything in SCEV.
Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll.

Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch.
Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build.
Testcase showcasing this cannot be opensourced and is fairly large.

The option disabled by default, but it may be desirable to enable by
default. Evidence in favor (two difference runs on different days/ToT state):

File Before (s) After (s)
clang-9.bc 7267.91 6639.14
llvm-as.bc 194.12 194.12
llvm-dis.bc 62.50 62.50
opt.bc 1855.85 1857.53

File Before (s) After (s)
clang-9.bc 8588.70 7812.83
llvm-as.bc 196.20 194.78
llvm-dis.bc 61.55 61.97
opt.bc 1739.78 1886.26

Reviewers: sanjoy

Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60144

llvm-svn: 358304

show more ...


# 85bd3978 04-Apr-2019 Evandro Menezes <e.menezes@samsung.com>

[IR] Refactor attribute methods in Function class (NFC)

Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

llvm-svn: 357731


Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the ne

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636

show more ...


# 3284775b 18-Dec-2018 Michael Kruse <llvm@meinersbur.de>

[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.

When using clang with `-fno-unroll-loops` (implicitly added with `-O1`),
the LoopUnrollPass is not not added to the (legacy) pass pip

[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.

When using clang with `-fno-unroll-loops` (implicitly added with `-O1`),
the LoopUnrollPass is not not added to the (legacy) pass pipeline. This
also means that it will not process any loop metadata such as
llvm.loop.unroll.enable (which is generated by #pragma unroll or
WarnMissedTransformationsPass emits a warning that a forced
transformation has not been applied (see
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html).
Such explicit transformations should take precedence over disabling
heuristics.

This patch unconditionally adds LoopUnrollPass to the optimizing
pipeline (that is, it is still not added with `-O0`), but passes a flag
indicating whether automatic unrolling is dis-/enabled. This is the same
approach as LoopVectorize uses.

The new pass manager's pipeline builder has no option to disable
unrolling, hence the problem does not apply.

Differential Revision: https://reviews.llvm.org/D55716

llvm-svn: 349509

show more ...


# 72448525 12-Dec-2018 Michael Kruse <llvm@meinersbur.de>

[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.

When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of thei

[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.

When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.

#pragma clang loop unroll_and_jam(enable)
#pragma clang loop distribute(enable)

is the same as

#pragma clang loop distribute(enable)
#pragma clang loop unroll_and_jam(enable)

and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

!0 = !{!0, !1, !2}
!1 = !{!"llvm.loop.unroll_and_jam.enable"}
!2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
!3 = !{!"llvm.loop.distribute.enable"}

defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.

For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.

Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

Reviewed By: hfinkel, dmgreen

Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288

llvm-svn: 348944

show more ...


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# 412ed347 31-Oct-2018 Fedor Sergeev <fedor.sergeev@azul.com>

[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll

Unlike its legacy counterpart new pass manager's LoopUnrollPass does
not provide any means to select which flavors of unro

[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll

Unlike its legacy counterpart new pass manager's LoopUnrollPass does
not provide any means to select which flavors of unroll to run
(runtime, peeling, partial), relying on global defaults.

In some cases having ability to run a restricted LoopUnroll that
does more than LoopFullUnroll is needed.

Introduced LoopUnrollOptions to select optional unroll behaviors.
Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.

Reviewers: chandlerc, tejohnson
Differential Revision: https://reviews.llvm.org/D53440

llvm-svn: 345723

show more ...


# edb12a83 15-Oct-2018 Chandler Carruth <chandlerc@gmail.com>

[TI removal] Make variables declared as `TerminatorInst` and initialized
by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerm

[TI removal] Make variables declared as `TerminatorInst` and initialized
by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).

llvm-svn: 344502

show more ...


# d2261f61 05-Oct-2018 Neil Henning <neil.henning@amd.com>

Add missing period to comment to match style of file.

This is a test commit to show that my commit access is working.

llvm-svn: 343842


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1
# f78650a8 30-Jul-2018 Fangrui Song <maskray@google.com>

Remove trailing space

sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293


# 963401d2 01-Jul-2018 David Green <david.green@arm.com>

[UnrollAndJam] New Unroll and Jam pass

This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..
Fo

[UnrollAndJam] New Unroll and Jam pass

This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..
ForeBlocks(i)
for j..
SubLoopBlocks(i, j)
AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

for i... i+=2
ForeBlocks(i)
ForeBlocks(i+1)
for j..
SubLoopBlocks(i, j)
SubLoopBlocks(i+1, j)
AftBlocks(i)
AftBlocks(i+1)
Remainder Loop

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 336062

show more ...


# 2c1a570a 26-Jun-2018 Matt Arsenault <Matthew.Arsenault@amd.com>

LoopUnroll: Allow analyzing intrinsic call costs

I'm not sure why the code here is skipping calls since
TTI does try to do something for general calls, but it
at least should allow intrinsics.

Skip

LoopUnroll: Allow analyzing intrinsic call costs

I'm not sure why the code here is skipping calls since
TTI does try to do something for general calls, but it
at least should allow intrinsics.

Skip intrinsics that should not be omitted as calls, which
is by far the most common case on AMDGPU.

llvm-svn: 335645

show more ...


Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3
# f209649d 14-Jun-2018 Hiroshi Inoue <inouehrs@jp.ibm.com>

[NFC] fix trivial typos in comments

llvm-svn: 334687


Revision tags: llvmorg-6.0.1-rc2
# aee7ad0c 27-May-2018 David Green <david.green@arm.com>

Revert 333358 as it's failing on some builders.

I'm guessing the tests reply on the ARM backend being built.

llvm-svn: 333359


# 3034281b 27-May-2018 David Green <david.green@arm.com>

[UnrollAndJam] Add a new Unroll and Jam pass

This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..

[UnrollAndJam] Add a new Unroll and Jam pass

This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..
ForeBlocks(i)
for j..
SubLoopBlocks(i, j)
AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

for i... i+=2
ForeBlocks(i)
ForeBlocks(i+1)
for j..
SubLoopBlocks(i, j)
SubLoopBlocks(i+1, j)
AftBlocks(i)
AftBlocks(i+1)
Remainder

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now-jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 333358

show more ...


# d34e60ca 14-May-2018 Nicola Zaghen <nicola.zaghen@imgtec.com>

Rename DEBUG macro to LLVM_DEBUG.

The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/

Rename DEBUG macro to LLVM_DEBUG.

The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240

show more ...


# 5f8f34e4 01-May-2018 Adrian Prantl <aprantl@apple.com>

Remove \brief commands from doxygen comments.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they ar

Remove \brief commands from doxygen comments.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272

show more ...


Revision tags: llvmorg-6.0.1-rc1
# 1376d934 03-Apr-2018 Ikhlas Ajbar <iajbar@codeaurora.org>

[Hexagon] peel loops with runtime small trip counts

Move the check canPeel() to Hexagon Target before setting PeelCount.

Differential Revision: https://reviews.llvm.org/D44880

llvm-svn: 329129


# b7322e8a 03-Apr-2018 Ikhlas Ajbar <iajbar@codeaurora.org>

peel loops with runtime small trip counts

For Hexagon, peeling loops with small runtime trip count is beneficial for our
benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount se

peel loops with runtime small trip counts

For Hexagon, peeling loops with small runtime trip count is beneficial for our
benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set
by the target for computing the desired peel count.

Differential Revision: https://reviews.llvm.org/D44880

llvm-svn: 329042

show more ...


Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2
# a373d18e 28-Mar-2018 David Blaikie <dblaikie@gmail.com>

Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h

Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, bec

Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h

Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.

llvm-svn: 328717

show more ...


12345678910>>...15