History log of /llvm-project/llvm/lib/CodeGen/HardwareLoops.cpp (Results 1 – 25 of 47)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 735ab61a 13-Nov-2024 Kazu Hirata <kazu@google.com>

[CodeGen] Remove unused includes (NFC) (#115996)

Identified with misc-include-cleaner.


Revision tags: llvmorg-19.1.3
# 85c17e40 17-Oct-2024 Jay Foad <jay.foad@amd.com>

[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)

Convert many instances of:
Fn = Intrinsic::getOrInsertDeclaration(...);
CreateCall(Fn, ...)
to the equivalent CreateIntrinsi

[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)

Convert many instances of:
Fn = Intrinsic::getOrInsertDeclaration(...);
CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.

show more ...


Revision tags: llvmorg-19.1.2
# fa789dff 11-Oct-2024 Rahul Joshi <rjoshi@nvidia.com>

[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)

Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is a

[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)

Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).

show more ...


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 74deadf1 29-Jun-2024 Nikita Popov <llvm@npopov.com>

[IRBuilder] Don't include Module.h (NFC) (#97159)

This used to be necessary to fetch the DataLayout, but isn't anymore.


# 9df71d76 28-Jun-2024 Nikita Popov <npopov@redhat.com>

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, re

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.

show more ...


# d75f9dd1 24-Jun-2024 Stephen Tozer <stephen.tozer@sony.com>

Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"

Reverts the above commit, as it updates a common header function and
did not update all callsites:

https://lab.llvm.org/buildbot

Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"

Reverts the above commit, as it updates a common header function and
did not update all callsites:

https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.

show more ...


# 6481dc57 24-Jun-2024 Stephen Tozer <stephen.tozer@sony.com>

[IR][NFC] Update IRBuilder to use InsertPosition (#96497)

Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock

[IR][NFC] Update IRBuilder to use InsertPosition (#96497)

Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2
# 15104739 14-Mar-2024 Stephen Tozer <stephen.tozer@sony.com>

[RemoveDIs] Insert PHIs before debug records in hardware loops (#85288)

Fixes: https://github.com/llvm/llvm-project/issues/85254

Hardware loops inserts PHIs at the position `getFirstNonPhi()`, wh

[RemoveDIs] Insert PHIs before debug records in hardware loops (#85288)

Fixes: https://github.com/llvm/llvm-project/issues/85254

Hardware loops inserts PHIs at the position `getFirstNonPhi()`, which is
incorrect - instead, `getFirstNonPhiIt()` is required to not insert the
PHI after any debug records that immediately follow the last existing
PHI.

show more ...


# 702e2da1 11-Mar-2024 Kevin P. Neal <52762977+kpneal@users.noreply.github.com>

[HardwareLoops] Add support for strictfp functions. (#84531)

This pass was adding new function calls without adding the strictfp
attribute as required by the rules laid out in the langref. With thi

[HardwareLoops] Add support for strictfp functions. (#84531)

This pass was adding new function calls without adding the strictfp
attribute as required by the rules laid out in the langref. With this
change a make check has 4-5 fewer failing tests with the Verifier
changes in D146845.

LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics

Test failures found with "https://reviews.llvm.org/D146845".

show more ...


Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3
# 2a58be42 13-Feb-2023 Samuel Parker <sam.parker@arm.com>

[HardwareLoops] NewPM support.

With the NPM, we're now defaulting to preserving LCSSA, so a couple
of tests have changed slightly.

Differential Revision: https://reviews.llvm.org/D140982


Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 9e6d1f4b 17-Jul-2022 Kazu Hirata <kazu@google.com>

[CodeGen] Qualify auto variables in for loops (NFC)


# dcf4b733 13-Jul-2022 Nikita Popov <npopov@redhat.com>

[SCEVExpander] Make CanonicalMode handing in isSafeToExpand() more robust (PR50506)

isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one calle

[SCEVExpander] Make CanonicalMode handing in isSafeToExpand() more robust (PR50506)

isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one caller currently gets
this wrong, resulting in PR50506.

Fix this by a) making the CanonicalMode argument on the freestanding
functions required and b) adding member functions on SCEVExpander
that automatically take the SCEVExpander mode into account. We can
use the latter variant nearly everywhere, and thus make sure that
there is no chance of CanonicalMode mismatch.

Fixes https://github.com/llvm/llvm-project/issues/50506.

Differential Revision: https://reviews.llvm.org/D129630

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 989f1c72 15-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b 10-Mar-2022 Nico Weber <thakis@chromium.org>

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https:/

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169

show more ...


# 7f230fee 07-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

after: 1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# c98a8a09 16-Sep-2021 Sam Parker <sam.parker@arm.com>

[HardwareLoops] Loop guard intrinsic to recognise zext

If a loop count was initially represented by a 32b unsigned int in C
then the hardware-loop pass can recognise the loop guard and insert
the ll

[HardwareLoops] Loop guard intrinsic to recognise zext

If a loop count was initially represented by a 32b unsigned int in C
then the hardware-loop pass can recognise the loop guard and insert
the llvm.test.set.loop.iterations intrinsic. If this was instead a
unsigned short/char then clang inserts a zext instruction to expand
the loop count to an i32. This patch adds the necessary pattern
matching to enable the use of lvm.test.set.loop.iterations in those
cases.

Patch by: sherwin-dc

Differential Revision: https://reviews.llvm.org/D109631

show more ...


Revision tags: llvmorg-13.0.0-rc3
# 34badc40 03-Sep-2021 Chen Zheng <czhengsz@cn.ibm.com>

Revert "[HardwareLoops] Change order of SCEV expression construction for InitLoopCount."

This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and
is not a right patch according to comments in D91

Revert "[HardwareLoops] Change order of SCEV expression construction for InitLoopCount."

This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and
is not a right patch according to comments in D91724

This reverts commit 42eaf4fe0adef3344adfd9fbccd49f325cb549ef.

show more ...


Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# e00d67dc 27-Jul-2021 David Green <david.green@arm.com>

[NFC] Reflow some debug messages.


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# fad70c30 11-Mar-2021 David Green <david.green@arm.com>

[ARM] Improve WLS lowering

Recently we improved the lowering of low overhead loops and tail
predicated loops, but concentrated first on the DLS do style loops. This
extends those improvements over t

[ARM] Improve WLS lowering

Recently we improved the lowering of low overhead loops and tail
predicated loops, but concentrated first on the DLS do style loops. This
extends those improvements over to the WLS while loops, improving the
chance of lowering them successfully. To do this the lowering has to
change a little as the instructions are terminators that produce a value
- something that needs to be treated carefully.

Lowering starts at the Hardware Loop pass, inserting a new
llvm.test.start.loop.iterations that produces both an i1 to control the
loop entry and an i32 similar to the llvm.start.loop.iterations
intrinsic added for do loops. This feeds into the loop phi, properly
gluing the values together:

%wls = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %div)
%wls0 = extractvalue { i32, i1 } %wls, 0
%wls1 = extractvalue { i32, i1 } %wls, 1
br i1 %wls1, label %loop.ph, label %loop.exit
...
loop:
%lsr.iv = phi i32 [ %wls0, %loop.ph ], [ %iv.next, %loop ]
..
%iv.next = call i32 @llvm.loop.decrement.reg.i32(i32 %lsr.iv, i32 1)
%cmp = icmp ne i32 %iv.next, 0
br i1 %cmp, label %loop, label %loop.exit

The llvm.test.start.loop.iterations need to be lowered through ISel
lowering as a pair of WLS and WLSSETUP nodes, which each get converted
to t2WhileLoopSetup and t2WhileLoopStart Pseudos. This helps prevent
t2WhileLoopStart from being a terminator that produces a value,
something difficult to control at that stage in the pipeline. Instead
the t2WhileLoopSetup produces the value of LR (essentially acting as a
lr = subs rn, 0), t2WhileLoopStart consumes that lr value (the Bcc).

These are then converted into a single t2WhileLoopStartLR at the same
point as t2DoLoopStartTP and t2LoopEndDec. Otherwise we revert the loop
to prevent them from progressing further in the pipeline. The
t2WhileLoopStartLR is a single instruction that takes a GPR and produces
LR, similar to the WLS instruction.

%1:gprlr = t2WhileLoopStartLR %0:rgpr, %bb.3
t2B %bb.1
...
bb.2.loop:
%2:gprlr = PHI %1:gprlr, %bb.1, %3:gprlr, %bb.2
...
%3:gprlr = t2LoopEndDec %2:gprlr, %bb.2
t2B %bb.3

The t2WhileLoopStartLR can then be treated similar to the other low
overhead loop pseudos, eventually being lowered to a WLS providing the
branches are within range.

Differential Revision: https://reviews.llvm.org/D97729

show more ...


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# 905cf88d 13-Feb-2021 Kazu Hirata <kazu@google.com>

[CodeGen] Use range-based for loops (NFC)


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 42eaf4fe 24-Nov-2020 Janek van Oirschot <janekvo@graphcore.ai>

[HardwareLoops] Change order of SCEV expression construction for InitLoopCount.

Putting the +1 before the zero-extend will allow scalar evolution to fold the expression in some cases such as the one

[HardwareLoops] Change order of SCEV expression construction for InitLoopCount.

Putting the +1 before the zero-extend will allow scalar evolution to fold the expression in some cases such as the one shown in PowerPC's `shrink-wrap.ll` test.

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D91724

show more ...


# b2ac9681 10-Nov-2020 David Green <david.green@arm.com>

[ARM] Alter t2DoLoopStart to define lr

This changes the definition of t2DoLoopStart from
t2DoLoopStart rGPR
to
GPRlr = t2DoLoopStart rGPR

This will hopefully mean that low overhead loops are more t

[ARM] Alter t2DoLoopStart to define lr

This changes the definition of t2DoLoopStart from
t2DoLoopStart rGPR
to
GPRlr = t2DoLoopStart rGPR

This will hopefully mean that low overhead loops are more tied together,
and we can more reliably generate loops without reverting or being at
the whims of the register allocator.

This is a fairly simple change in itself, but leads to a number of other
required alterations.

- The hardware loop pass, if UsePhi is set, now generates loops of the
form:
%start = llvm.start.loop.iterations(%N)
loop:
%p = phi [%start], [%dec]
%dec = llvm.loop.decrement.reg(%p, 1)
%c = icmp ne %dec, 0
br %c, loop, exit
- For this a new llvm.start.loop.iterations intrinsic was added, identical
to llvm.set.loop.iterations but produces a value as seen above, gluing
the loop together more through def-use chains.
- This new instrinsic conceptually produces the same output as input,
which is taught to SCEV so that the checks in MVETailPredication are not
affected.
- Some minor changes are needed to the ARMLowOverheadLoop pass, but it has
been left mostly as before. We should now more reliably be able to tell
that the t2DoLoopStart is correct without having to prove it, but
t2WhileLoopStart and tail-predicated loops will remain the same.
- And all the tests have been updated. There are a lot of them!

This patch on it's own might cause more trouble that it helps, with more
tail-predicated loops being reverted, but some additional patches can
hopefully improve upon that to get to something that is better overall.

Differential Revision: https://reviews.llvm.org/D89881

show more ...


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4
# 89c1e35f 22-Sep-2020 Stefanos Baziotis <sdi1600105@di.uoa.gr>

[LoopInfo] empty() -> isInnermost(), add isOutermost()

Differential Revision: https://reviews.llvm.org/D82895


Revision tags: llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1
# 6c348e40 16-Jul-2020 Sam Tebbs <samuel.tebbs@arm.com>

[HWLoops] Stop converting to a while loop when it would be unsafe to

There were cases where a do-while loop would be converted to a while
loop before finding out that it would be unsafe to expand th

[HWLoops] Stop converting to a while loop when it would be unsafe to

There were cases where a do-while loop would be converted to a while
loop before finding out that it would be unsafe to expand the SCEV in
this situation and then bailing out of hardware loop conversion.

This patch checks if it would be unsafe to expand the SCEV and if so stops converting the do-while into a while, allowing conversion to a hardware loop.

Differential Revision: https://reviews.llvm.org/D83953

show more ...


Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 9e03547c 03-Jul-2020 David Green <david.green@arm.com>

[ARM][HWLoops] Create hardware loops for sibling loops

Given a loop with two subloops, it should be possible for both to be
converted to hardware loops. That's what this patch does, simply enough.
I

[ARM][HWLoops] Create hardware loops for sibling loops

Given a loop with two subloops, it should be possible for both to be
converted to hardware loops. That's what this patch does, simply enough.
It slightly alters the loop iterating order to try and convert all
subloops. If one (or more) succeeds, it stops as before.

Differential Revision: https://reviews.llvm.org/D78502

show more ...


12