History log of /llvm-project/llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp (Results 126 – 148 of 148)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# e93e0d41 09-Jan-2020 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Update liveness info

After expanding the pseudo instructions, update the liveness info.
We do this in a post-order traversal of the loop, including its
exit blocks and prehea

[ARM][LowOverheadLoops] Update liveness info

After expanding the pseudo instructions, update the liveness info.
We do this in a post-order traversal of the loop, including its
exit blocks and preheader(s).

Differential Revision: https://reviews.llvm.org/D72131

show more ...


# 0efc9e5a 06-Jan-2020 Sjoerd Meijer <sjoerd.meijer@arm.com>

[ARM][MVE] More MVETailPredication debug messages. NFC.

I've added a few more debug messages to MVETailPredication because I wanted to
trace better which instructions are added/removed. And while I

[ARM][MVE] More MVETailPredication debug messages. NFC.

I've added a few more debug messages to MVETailPredication because I wanted to
trace better which instructions are added/removed. And while I was at it, I
factored out one function which I thought was clearer, and have added some
comments to describe better the flow between MVETailPredication and
ARMLowOverheadLoops.

Differential Revision: https://reviews.llvm.org/D71549

show more ...


# 8f6a6763 03-Jan-2020 Sam Parker <sam.parker@arm.com>

[ARM][NFC] Move tail predication checks

Extract the tail predication validation checks out into their own
LowOverHeadLoop method.


# acbc9aed 20-Dec-2019 Sam Parker <sam.parker@arm.com>

[ARM][MVE] Fixes for tail predication.

1) Fix an issue with the incorrect value being used for the number of
elements being passed to [d|w]lstp. We were trying to check that
the value was avai

[ARM][MVE] Fixes for tail predication.

1) Fix an issue with the incorrect value being used for the number of
elements being passed to [d|w]lstp. We were trying to check that
the value was available at LoopStart, but this doesn't consider
that the last instruction in the block could also define the
register. Two helpers have been added to RDA for this.
2) Insert some code to now try to move the element count def or the
insertion point so that we can perform more tail predication.
3) Related to (1), the same off-by-one could prevent us from
generating a low-overhead loop when a mov lr could have been
the last instruction in the block.
4) Fix up some instruction attributes so that not all the
low-overhead loop instructions are labelled as branches and
terminators - as this is not true for dls/dlstp.

Differential Revision: https://reviews.llvm.org/D71609

show more ...


# 40425183 20-Dec-2019 Sam Parker <sam.parker@arm.com>

[ARM][MVE] Tail predicate in the presence of vcmp

Record the discovered VPT blocks while checking for validity and, for
now, only handle blocks that begin with VPST and not VPT. We're now
allowing m

[ARM][MVE] Tail predicate in the presence of vcmp

Record the discovered VPT blocks while checking for validity and, for
now, only handle blocks that begin with VPST and not VPT. We're now
allowing more than one instruction to define vpr, but each block must
somehow be predicated using the vctp. This leaves us with several
scenarios which need fixing up:
1) A VPT block with is only predicated by the vctp and has no
internal vpr defs.
2) A VPT block which is only predicated by the vctp but has an
internal vpr def.
3) A VPT block which is predicated upon the vctp as well as another
vpr def.
4) A VPT block which is not predicated upon a vctp, but contains it
and all instructions within the block are predicated upon in.

The changes needed are, for:
1) The easy one, just remove the vpst and unpredicate the
instructions in the block.
2) Remove the vpst and unpredicate the instructions up to the
internal vpr def. Need insert a new vpst to predicate the
remaining instructions.
3) No nothing.
4) The vctp will be inside a vpt and the instruction will be removed,
so adjust the size of the mask on the vpst.

Differential Revision: https://reviews.llvm.org/D71107

show more ...


# 049f9672 16-Dec-2019 Sjoerd Meijer <sjoerd.meijer@arm.com>

[ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC.

In ARMLowOverheadLoops.cpp, MVETailPredication.cpp, and MVEVPTBlock.cpp we have
quite a few helper functions all looking at the opcod

[ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC.

In ARMLowOverheadLoops.cpp, MVETailPredication.cpp, and MVEVPTBlock.cpp we have
quite a few helper functions all looking at the opcodes of MVE instructions.
This moves all these utility functions to ARMBaseInstrInfo.

Diferential Revision: https://reviews.llvm.org/D71426

show more ...


Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3
# d97cf1f8 11-Dec-2019 Sjoerd Meijer <sjoerd.meijer@arm.com>

[ARM][LowOverheadLoops] Remove dead loop update instructions.

After creating a low-overhead loop, the loop update instruction was still
lingering around hurting performance. This removes dead loop u

[ARM][LowOverheadLoops] Remove dead loop update instructions.

After creating a low-overhead loop, the loop update instruction was still
lingering around hurting performance. This removes dead loop update
instructions, which in our case are mostly SUBS instructions.

To support this, some helper functions were added to MachineLoopUtils and
ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses
before a particular loop instruction, respectively.

This is a first version that removes a SUBS instruction when there are no other
uses inside and outside the loop block, but there are some more interesting
cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which
shows that there is room for improvement. For example, we can't handle this
case yet:

..
dlstp.32 lr, r2
.LBB0_1:
mov r3, r2
subs r2, #4
vldrh.u32 q2, [r1], #8
vmov q1, q0
vmla.u32 q0, q2, r0
letp lr, .LBB0_1
@ %bb.2:
vctp.32 r3
..

which is a lot more tricky because r2 is not only used by the subs, but also by
the mov to r3, which is used outside the low-overhead loop by the vctp
instruction, and that requires a bit of a different approach, and I will follow
up on this.

Differential Revision: https://reviews.llvm.org/D71007

show more ...


Revision tags: llvmorg-9.0.1-rc2
# 28166816 26-Nov-2019 Sam Parker <sam.parker@arm.com>

[ARM][ReachingDefs] Remove dead code in loloops.

Add some more helper functions to ReachingDefs to query the uses of
a given MachineInstr and also to query whether two MachineInstrs use
the same def

[ARM][ReachingDefs] Remove dead code in loloops.

Add some more helper functions to ReachingDefs to query the uses of
a given MachineInstr and also to query whether two MachineInstrs use
the same def of a register.

For Arm, while tail-predicating, these helpers are used in the
low-overhead loops to remove the dead code that calculates the number
of loop iterations.

Differential Revision: https://reviews.llvm.org/D70240

show more ...


# cced971f 26-Nov-2019 Sam Parker <sam.parker@arm.com>

[ARM][ReachingDefs] RDA in LoLoops

Add several new methods to ReachingDefAnalysis:
- getReachingMIDef, instead of returning an integer, return the
MachineInstr that produces the def.
- getInstFrom

[ARM][ReachingDefs] RDA in LoLoops

Add several new methods to ReachingDefAnalysis:
- getReachingMIDef, instead of returning an integer, return the
MachineInstr that produces the def.
- getInstFromId, return a MachineInstr for which the given integer
corresponds to.
- hasSameReachingDef, return whether two MachineInstr use the same
def of a register.
- isRegUsedAfter, return whether a register is used after a given
MachineInstr.

These methods have been used in ARMLowOverhead to replace searching
for uses/defs.

Differential Revision: https://reviews.llvm.org/D70009

show more ...


Revision tags: llvmorg-9.0.1-rc1
# 8978c12b 18-Nov-2019 Sam Parker <sam.parker@arm.com>

[ARM][MVE] Tail predication conversion

This patch modifies ARMLowOverheadLoops to convert a predicated
vector low-overhead loop into a tail-predicatd one. This is currently
a very basic conversion,

[ARM][MVE] Tail predication conversion

This patch modifies ARMLowOverheadLoops to convert a predicated
vector low-overhead loop into a tail-predicatd one. This is currently
a very basic conversion, with the following restrictions:
- Operates only on single block loops.
- The loop can only contain a single vctp instruction.
- No other instructions can write to the vpr.
- We only allow a subset of the mve instructions in the loop.

TODO: Pass the number of elements, not the number of iterations to
dlstp/wlstp.

Differential Revision: https://reviews.llvm.org/D69945

show more ...


# 4ba6d0de 23-Sep-2019 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Use subs during revert.

Check whether there are any uses or defs between the LoopDec and
LoopEnd. If there's not, then we can use a subs to set the cpsr and
skip generating a

[ARM][LowOverheadLoops] Use subs during revert.

Check whether there are any uses or defs between the LoopDec and
LoopEnd. If there's not, then we can use a subs to set the cpsr and
skip generating a cmp.

Differential Revision: https://reviews.llvm.org/D67801

llvm-svn: 372560

show more ...


# 566127e3 23-Sep-2019 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Use tBcc when reverting

Check the branch target ranges and use a tBcc instead of t2Bcc when
we can.

Differential Revision: https://reviews.llvm.org/D67796

llvm-svn: 372557


Revision tags: llvmorg-9.0.0
# 36c92227 17-Sep-2019 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Add LR def safety check

Converting the *LoopStart pseudo instructions into DLS/WLS results in
LR being defined. These instructions were inserted on the assumption
that LR wou

[ARM][LowOverheadLoops] Add LR def safety check

Converting the *LoopStart pseudo instructions into DLS/WLS results in
LR being defined. These instructions were inserted on the assumption
that LR would already contain the loop counter because a mov is
introduced during ISel as the the consumers in the loop can only use
LR. That assumption proved wrong!

So perform a safety check, finding an appropriate place to insert the
DLS/WLS instructions or revert if this isn't possible.

Differential Revision: https://reviews.llvm.org/D67539

llvm-svn: 372111

show more ...


Revision tags: llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3
# 9b9a3084 15-Aug-2019 Eli Friedman <efriedma@quicinc.com>

[ARM][LowOverheadLoops] Fix generated code for "revert".

Two issues:

1. t2CMPri shouldn't use CPSR if it isn't predicated. This doesn't
really have any visible effect at the moment, but it might ma

[ARM][LowOverheadLoops] Fix generated code for "revert".

Two issues:

1. t2CMPri shouldn't use CPSR if it isn't predicated. This doesn't
really have any visible effect at the moment, but it might matter in the
future.
2. The t2CMPri generated for t2WhileLoopStart might need to use a
register that isn't LR.

My team found this because we have a patch to track register liveness
late in the pass pipeline. I'll look into upstreaming it to help catch
issues like this earlier.

Differential Revision: https://reviews.llvm.org/D66243

llvm-svn: 369069

show more ...


Revision tags: llvmorg-9.0.0-rc2
# 173de037 07-Aug-2019 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Revert after read/write

Currently we check whether LR is stored/loaded to/from inbetween the
loop decrement and loop end pseudo instructions. There's two problems
here:
-

[ARM][LowOverheadLoops] Revert after read/write

Currently we check whether LR is stored/loaded to/from inbetween the
loop decrement and loop end pseudo instructions. There's two problems
here:
- It relies on all load/store instructions being labelled as such in
tablegen.
- Actually any use of loop decrement is troublesome because the value
doesn't exist!

So we need to check for any read/write of LR that occurs between the
two instructions and revert if we find anything.

Differential Revision: https://reviews.llvm.org/D65792

llvm-svn: 368130

show more ...


# ed2ea3e4 30-Jul-2019 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Revert non-header LE target

Revert the hardware loop upon finding a LoopEnd that doesn't target
the loop header, instead of asserting a failure.

Differential Revision: h

[ARM][LowOverheadLoops] Revert non-header LE target

Revert the hardware loop upon finding a LoopEnd that doesn't target
the loop header, instead of asserting a failure.

Differential Revision: https://reviews.llvm.org/D65268

llvm-svn: 367296

show more ...


Revision tags: llvmorg-9.0.0-rc1
# a19f5a76 24-Jul-2019 Sjoerd Meijer <sjoerd.meijer@arm.com>

Test commit. NFC.

Removed 2 trailing whitespaces in 2 files that used to be in different
repos to test my new github monorepo workflow.

llvm-svn: 366904


# 4379a400 22-Jul-2019 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Revert remaining pseudos

ARMLowOverheadLoops would assert a failure if it did not find all the
pseudo instructions that comprise the hardware loop. Instead of doing
this, ite

[ARM][LowOverheadLoops] Revert remaining pseudos

ARMLowOverheadLoops would assert a failure if it did not find all the
pseudo instructions that comprise the hardware loop. Instead of doing
this, iterate through all the instructions of the function and revert
any remaining pseudo instructions that haven't been converted.

Differential Revision: https://reviews.llvm.org/D65080

llvm-svn: 366691

show more ...


Revision tags: llvmorg-10-init
# 08b4a8da 11-Jul-2019 Sam Parker <sam.parker@arm.com>

[ARM][LowOverheadLoops] Correct offset checking

This patch addresses a couple of problems:
1) The maximum supported offset of LE is -4094.
2) The offset of WLS also needs to be checked, this use

[ARM][LowOverheadLoops] Correct offset checking

This patch addresses a couple of problems:
1) The maximum supported offset of LE is -4094.
2) The offset of WLS also needs to be checked, this uses a
maximum positive offset of 4094.

The use of BasicBlockUtils has been changed because the block offsets
weren't being initialised, but the isBBInRange checks both positive
and negative offsets.

ARMISelLowering has been tweaked because the test case presented
another pattern that we weren't supporting.

llvm-svn: 365749

show more ...


# 775b2f59 10-Jul-2019 Sam Parker <sam.parker@arm.com>

[NFC][ARM] Convert lambdas to static helpers

Break up and convert some of the lambdas in ARMLowOverheadLoops into
static functions.

llvm-svn: 365623


Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4
# 98722691 01-Jul-2019 Sam Parker <sam.parker@arm.com>

[ARM] WLS/LE Code Generation

Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
to generate intrinsics th

[ARM] WLS/LE Code Generation

Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
to generate intrinsics that guard the loop entry, as well as setting
the loop trip count.
2) Lower the BRCOND that uses said intrinsic to an Arm specific node:
ARMWLS.
3) ISelDAGToDAG the node to a new pseudo instruction:
t2WhileLoopStart.
4) Add support in ArmLowOverheadLoops to handle the new pseudo
instruction.

Differential Revision: https://reviews.llvm.org/D63816

llvm-svn: 364733

show more ...


Revision tags: llvmorg-8.0.1-rc3
# bcf0eb7a 25-Jun-2019 Sam Parker <sam.parker@arm.com>

[ARM] Fix for DLS/LE CodeGen

The expensive buildbots highlighted the mir tests were broken, which
I've now updated and added --verify-machineinstrs to them. This also
uncovered a couple of bugs in t

[ARM] Fix for DLS/LE CodeGen

The expensive buildbots highlighted the mir tests were broken, which
I've now updated and added --verify-machineinstrs to them. This also
uncovered a couple of bugs in the backend pass, so these have also
been fixed.

llvm-svn: 364323

show more ...


# a6fd919c 25-Jun-2019 Sam Parker <sam.parker@arm.com>

[ARM] DLS/LE low-overhead loop code generation

Introduce three pseudo instructions to be used during DAG ISel to
represent v8.1-m low-overhead loops. One maps to set_loop_iterations
while loop_decre

[ARM] DLS/LE low-overhead loop code generation

Introduce three pseudo instructions to be used during DAG ISel to
represent v8.1-m low-overhead loops. One maps to set_loop_iterations
while loop_decrement_reg is lowered to two, so that we can separate
the decrement and branching operations. The pseudo instructions are
expanded pre-emission, where we can still decide whether we actually
want to generate a low-overhead loop, in a new pass:
ARMLowOverheadLoops. The pass currently bails, reverting to an sub,
icmp and br, in the cases where a call or stack spill/restore happens
between the decrement and branching instructions, or if the loop is
too large.

Differential Revision: https://reviews.llvm.org/D63476

llvm-svn: 364288

show more ...


123456