History log of /llvm-project/llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp (Results 51 – 75 of 148)
Revision Date Author Comments
# d9bf6245 07-Dec-2020 David Green <david.green@arm.com>

[ARM] Revert low overhead loops with calls before registry allocation.

This adds code to revert low overhead loops with calls in them before
register allocation. Ideally we would not create low overhead loops with
calls in them to begin with, but that can be difficult to always get
correct. If we want to try and glue together t2LoopDec and t2LoopEnd
into a single instruction, we need to ensure that no instructions use LR
in the loop. (Technically the final code can be better too, as it
doesn't need to use the same registers but that has not been optimized
for here, as reverting loops with calls is expected to be very rare).

It also adds a MVETailPredUtils.h header to share the revert code
between different passes, and provides a place to expand upon, with
RevertLoopWithCall becoming a place to perform other low overhead loop
alterations like removing copies or combining LoopDec and End into a
single instruction.

Differential Revision: https://reviews.llvm.org/D91273



Revision tags: llvmorg-11.0.1-rc1
# 8ecb015e 19-Nov-2020 Sam Tebbs <samuel.tebbs@arm.com>

[ARM][LowOverheadLoops] Convert intermediate vpr use assertion to condition

This converts the intermediate VPR use assertion to a condition in the if-statement to protect against assertion failures in case behaviour is changed.

This is a follow-up to https://reviews.llvm.org/D90935 and implements the post-approval comments.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91790



# f45c052c 18-Nov-2020 Mikhail Goncharov <goncharov.mikhail@gmail.com>

Fix unused variables in release build

Differential Revision: https://reviews.llvm.org/D91705


# da2e4728 06-Nov-2020 Sam Tebbs <samuel.tebbs@arm.com>

[ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks

This patch adds support for combining a VPST with a dangling VCMP from a
previous VPT block.

Differential Revision: https://reviews.llvm.org/D90935



# 898a81df 11-Nov-2020 Sam Parker <sam.parker@arm.com>

[NFC][ARM] Replace lambda with any_of


# 08d1c2d4 10-Nov-2020 David Green <david.green@arm.com>

[ARM] Introduce t2DoLoopStartTP

This introduces a new pseudo instruction, almost identical to a
t2DoLoopStart but taking 2 parameters - the original loop iteration
count needed for a low overhead loop, plus the VCTP element count needed
for a DLSTP instruction setting up a tail predicated loop. The idea is
that the instruction holds both values and the backend
ARMLowOverheadLoops pass can pick between the two, depending on whether
it creates a tail predicated loop or falls back to a low overhead loop.

To do that there needs to be something that converts a t2DoLoopStart to
a t2DoLoopStartTP, for which this patch repurposes the
MVEVPTOptimisationsPass as a "tail predication and vpt optimisation"
pass. The extra operand for the t2DoLoopStartTP is chosen based on the
operands of VCTPs in the loop, and the instruction is moved as late in
the block as possible to attempt to increase the likelihood of making
tail predicated loops.

Differential Revision: https://reviews.llvm.org/D90591



# dbe1bf63 10-Nov-2020 David Green <david.green@arm.com>

[ARM] Cleanup for ARMLowOverheadLoops. NFC


# b2ac9681 10-Nov-2020 David Green <david.green@arm.com>

[ARM] Alter t2DoLoopStart to define lr

This changes the definition of t2DoLoopStart from
    t2DoLoopStart rGPR
to
    GPRlr = t2DoLoopStart rGPR

This will hopefully mean that low overhead loops are more tied together,
and we can more reliably generate loops without reverting or being at
the whims of the register allocator.

This is a fairly simple change in itself, but leads to a number of other
required alterations.

- The hardware loop pass, if UsePhi is set, now generates loops of the
  form:
      %start = llvm.start.loop.iterations(%N)
    loop:
      %p = phi [%start], [%dec]
      %dec = llvm.loop.decrement.reg(%p, 1)
      %c = icmp ne %dec, 0
      br %c, loop, exit
- For this a new llvm.start.loop.iterations intrinsic was added, identical
  to llvm.set.loop.iterations except that it produces a value as seen
  above, gluing the loop together more through def-use chains.
- This new intrinsic conceptually produces the same output as input,
  which is taught to SCEV so that the checks in MVETailPredication are not
  affected.
- Some minor changes are needed to the ARMLowOverheadLoops pass, but it has
  been left mostly as before. We should now more reliably be able to tell
  that the t2DoLoopStart is correct without having to prove it, but
  t2WhileLoopStart and tail-predicated loops will remain the same.
- And all the tests have been updated. There are a lot of them!

This patch on its own might cause more trouble than it helps, with more
tail-predicated loops being reverted, but some additional patches can
hopefully improve upon that to get to something that is better overall.

Differential Revision: https://reviews.llvm.org/D89881
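
As a rough illustration of the loop shape described in b2ac9681, the following C++ sketch models the def-use chain from the start value through the phi and the decrement. It is not LLVM code: start_loop_iterations, loop_decrement_reg and run_loop are hypothetical stand-ins for the two intrinsics and the loop itself, and the trip count is assumed to be at least 1 because the loop is do-while shaped.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm.start.loop.iterations: conceptually it
// returns its input, which is what lets later analyses see through it.
static std::uint32_t start_loop_iterations(std::uint32_t n) { return n; }

// Hypothetical stand-in for llvm.loop.decrement.reg.
static std::uint32_t loop_decrement_reg(std::uint32_t p, std::uint32_t step) {
  return p - step;
}

// Counts how often the loop body runs for trip count n (assumed >= 1).
std::uint32_t run_loop(std::uint32_t n) {
  std::uint32_t executed = 0;
  std::uint32_t p = start_loop_iterations(n);      // %start feeds the phi on entry
  for (;;) {
    ++executed;                                    // loop body
    std::uint32_t dec = loop_decrement_reg(p, 1);  // %dec
    if (dec == 0)                                  // %c = icmp ne %dec, 0
      break;                                       // br %c, loop, exit
    p = dec;                                       // phi takes %dec on the back edge
  }
  return executed;
}
```

Every value in the loop is derived from the start value, so the counter is tied together by data flow rather than by an implicit side effect.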



# 40a3f7e4 30-Oct-2020 Sam Tebbs <samuel.tebbs@arm.com>

[ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT

There were cases where a VCMP and a VPST were merged even if the VCMP
didn't have the same defs of its operands as the VPST. This is fixed by
adding RDA checks for the defs. This however gave rise to cases where
the new VPST created would precede the un-merged VCMP and so would fail
a predicate mask assertion since the VCMP wasn't predicated. This was
solved by converting the VCMP to a VPT instead of inserting the new
VPST.

Differential Revision: https://reviews.llvm.org/D90461



# e24537d4 21-Oct-2020 Mircea Trofin <mtrofin@google.com>

[NFC][MC] Use MCRegister for ReachingDefAnalysis APIs

Also updated the users of the APIs; and a drive-by small change to
RDFRegister.cpp

Differential Revision: https://reviews.llvm.org/D89912


# 6dcbc323 20-Oct-2020 David Green <david.green@arm.com>

Revert "[ARM][LowOverheadLoops] Adjust Start insertion."

This reverts commit 38f625d0d1360b035271422bab922d22ed04d79a.

This commit contains some holes in its logic and has been causing
issues since it was committed. The idea sounds OK but some cases were not
handled correctly. Instead of trying to fix that up later it is probably
simpler to revert it and work to reimplement it in a more reliable way.



# cb27006a 10-Oct-2020 David Green <david.green@arm.com>

[ARM] Attempt to make Tail predication / RDA more resilient to empty blocks

There are a number of places in RDA where we assume the block will not
be empty. This isn't necessarily true for tail predicated loops where we
have removed instructions. This attempts to make the pass more resilient
to empty blocks, not casting pointers to machine instructions where they
would be invalid.

The test contains a case that was previously failing, but has recently been
hidden on trunk. It contains an empty block to begin with to show a
similar error.

Differential Revision: https://reviews.llvm.org/D88926



Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6
# 7e02bc81 01-Oct-2020 Sam Parker <sam.parker@arm.com>

[NFC][ARM] LowOverheadLoop DEBUG statements


# 38f625d0 01-Oct-2020 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Adjust Start insertion.

Try to move the insertion point to become the terminator of the
block, usually the preheader.

Differential Revision: https://reviews.llvm.org/D88638


Revision tags: llvmorg-11.0.0-rc5
# 6ec5f324 30-Sep-2020 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Iteration count liveness

Before deciding to insert a [W|D]LSTP, check that defining LR with
the element count won't affect any other instructions that should be
taking the iteration count.

Differential Revision: https://reviews.llvm.org/D88549



# 7b90516d 30-Sep-2020 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Start insertion point

If possible, try not to move the start position earlier than it
already is.

Differential Revision: https://reviews.llvm.org/D88542


# dfa2c14b 30-Sep-2020 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Use iterator for InsertPt.

Use a MachineBasicBlock::iterator instead of a MachineInstr* for the
position of our LoopStart instruction. NFCish, as it changes debug
info.


# 779a8a02 30-Sep-2020 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] TryRemove helper.

Make a helper function that wraps around RDA::isSafeToRemove and
utilises the existing DCE IT block checks.


Revision tags: llvmorg-11.0.0-rc4
# 195c22f2 24-Sep-2020 Sam Parker <sam.parker@arm.com>

[ARM] Change VPT state assertion

Just because we haven't encountered an instruction setting the VPR,
it doesn't mean we can't create a VPT block - the VPR may be a
live-in.

Differential Revision: https://reviews.llvm.org/D88224



# 4c19b89b 29-Sep-2020 Sam Parker <sam.parker@arm.com>

[NFC][ARM] Comments and lambdas

Add some comments in LowOverheadLoops and make some lambda variables
explicit arguments instead of capturing.


# e82a0084 25-Sep-2020 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Cleanup and re-arrange

Rename and reorganise how we decide where to put the LoopStart
instruction.


# 3d1d0891 28-Sep-2020 Sam Parker <sam.parker@arm.com>

[NFC][ARM] Factor out some logic for LoLoops.

Create a DCE function that accepts an instruction.


# e4b9867c 28-Sep-2020 David Green <david.green@arm.com>

[ARM] Expand cannotInsertWDLSTPBetween to the last instruction

9d9a11c7be037 added this check for predicatable instructions between the
D/WLSTP and the loop's start, but it was missing the last instruction in
the block. Change it to use some iterators instead.

Differential Revision: https://reviews.llvm.org/D88354



Revision tags: llvmorg-11.0.0-rc3
# a399d188 22-Sep-2020 Sam Parker <sam.parker@arm.com>

[ARM] Find VPT implicitly predicated by VCTP

On failing to find a VCTP in the list of instructions that explicitly
predicate the entry of a VPT block, inspect whether the block is
controlled via a VPT which is implicitly predicated due to its
predicated operand(s).

Differential Revision: https://reviews.llvm.org/D87819



# 00ee52ae 24-Sep-2020 Sam Parker <sam.parker@arm.com>

[NFC][ARM] Remove dead loop.

Remove a loop that just calculated a couple of values that were no
longer needed.

