History log of /llvm-project/llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp (Results 26 – 50 of 356)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-14.0.5
# ad73ce31 26-May-2022 Zongwei Lan <lanzongwei541@gmail.com>

[Target] use getSubtarget<> instead of static_cast<>(getSubtarget())

Differential Revision: https://reviews.llvm.org/D125391


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 989f1c72 15-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# ed98c1b3 09-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup includes: DebugInfo & CodeGen

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332


# a278250b 10-Mar-2022 Nico Weber <thakis@chromium.org>

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https:/

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169

show more ...


# 7f230fee 07-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

after: 1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 48349967 12-Dec-2021 Kazu Hirata <kazu@google.com>

[Target] Use llvm::reverse (NFC)


# 63eb7ff4 07-Dec-2021 Ties Stuij <ties.stuij@arm.com>

[ARM] Implement PAC return address signing mechanism for PACBTI-M

This patch implements PAC return address signing for armv8-m. This patch roughly
accomplishes the following things:

- PAC and AUT i

[ARM] Implement PAC return address signing mechanism for PACBTI-M

This patch implements PAC return address signing for armv8-m. This patch roughly
accomplishes the following things:

- PAC and AUT instructions are generated.
- They're part of the stack frame setup, so that shrink-wrapping can move them
inwards to cover only part of a function
- The auth code generated by PAC is saved across subroutine calls so that AUT
can find it again to check
- PAC is emitted before stacking registers (so that the SP it signs is the one
on function entry).
- The new pseudo-register ra_auth_code is mentioned in the DWARF frame data
- With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT
validates the corresponding value of SP
- Emit correct unwind information when PAC is replaced by PACBTI
- Handle tail calls correctly

Some notes:

We make the assembler accept the `.save {ra_auth_code}` directive that is
emitted by the compiler when it saves a register that contains a
return address authentication code.

For EHABI we need to have the `FrameSetup` flag on the instruction and
handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit
`.save {ra_auth_code}`, instead of `.save {r12}`.

For PACBTI-M, the instruction which computes return address PAC should use SP
value before adjustment for the argument registers save are (used for variadic
functions and when a parameter is is split between stack and register), but at
the same it should be after the instruction that saves FPCXT when compiling a
CMSE entry function.

This patch moves the varargs SP adjustment after the FPCXT save (they are never
enabled at the same time), so in a following patch handling of the `PAC`
instruction can be placed between them.

Epilogue emission code adjusted in a similar manner.

PACBTI-M code generation should not emit any instructions for architectures
v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases
is handled separately by a future ticket.

note on tail calls:

If the called function has four arguments that occupy registers `r0`-`r3`, the
only option for holding the function pointer itself is `r12`, but this register
is used to keep the PAC during function/prologue epilogue and clobbers the
function pointer.

When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to
keep six values - the four function arguments, the function pointer and the PAC,
which is obviously impossible.

One option would be to authenticate the return address before all callee-saved
registers are restored, so we have a scratch register to temporarily keep the
value of `r12`. The issue with this approach is that it violates a fundamental
invariant that PAC is computed using CFA as a modifier. It would also mean using
separate instructions to pop `lr` and the rest of the callee-saved registers,
which would offset the advantages of doing a tail call.

Instead, this patch disables indirect tail calls when the called function take
four or more arguments and the return address sign and authentication is enabled
for the caller function, conservatively assuming the caller function would spill
LR.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Momchil Velikov
- Ties Stuij

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D112429

show more ...


# b8f1ccb0 02-Dec-2021 David Green <david.green@arm.com>

[ARM] Introduce i8neg and i8pos addressing modes

Some instructions with i8 immediate ranges can only hold negative values
(like t2LDRHi8), only hold positive values (like t2STRT) or hold +/-
dependi

[ARM] Introduce i8neg and i8pos addressing modes

Some instructions with i8 immediate ranges can only hold negative values
(like t2LDRHi8), only hold positive values (like t2STRT) or hold +/-
depending on the U bit (like the pre/post inc instructions. e.g
t2LDRH_POST). This patch splits the AddrModeT2_i8 into AddrModeT2_i8,
AddrModeT2_i8pos and AddrModeT2_i8neg to make this clear.

This allows us to get the offset ranges of t2LDRHi8 correct in the
load/store optimizer, fixing issues where we could end up creating
instructions with positive offsets (which may then be encoded as ldrht).

Differential Revision: https://reviews.llvm.org/D114638

show more ...


# 387927bb 27-Nov-2021 Kazu Hirata <kazu@google.com>

[Target] Use range-based for loops (NFC)


# 562356d6 26-Nov-2021 Kazu Hirata <kazu@google.com>

[Target] Use range-based for loops (NFC)


Revision tags: llvmorg-13.0.1-rc1
# d45cb1d7 23-Nov-2021 Kazu Hirata <kazu@google.com>

[llvm] Use range-based for loops (NFC)


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 84b07c9b 19-Sep-2021 Kazu Hirata <kazu@google.com>

[llvm] Use pop_back_val (NFC)


Revision tags: llvmorg-13.0.0-rc3
# c9fca53a 10-Sep-2021 Kazu Hirata <kazu@google.com>

[CodeGen, Target] Use pred_empty and succ_empty (NFC)


# 9cb8f4d1 02-Sep-2021 David Green <david.green@arm.com>

[ARM] Add a tail-predication loop predicate register

The semantics of tail predication loops means that the value of LR as an
instruction is executed determines the predicate. In other words:

mov r

[ARM] Add a tail-predication loop predicate register

The semantics of tail predication loops means that the value of LR as an
instruction is executed determines the predicate. In other words:

mov r3, #3
DLSTP lr, r3 // Start tail predication, lr==3
VADD.s32 q0, q1, q2 // Lanes 0,1 and 2 are updated in q0.
mov lr, #1
VADD.s32 q0, q1, q2 // Only first lane is updated.

This means that the value of lr cannot be spilled and re-used in tail
predication regions without potentially altering the behaviour of the
program. More lanes than required could be stored, for example, and in
the case of a gather those lanes might not have been setup, leading to
alignment exceptions.

This patch adds a new lr predicate operand to MVE instructions in order
to keep a reference to the lr that they use as a tail predicate. It will
usually hold the zeroreg meaning not predicated, being set to the LR phi
value in the MVETPAndVPTOptimisationsPass. This will prevent it from
being spilled anywhere that it needs to be used.

A lot of tests needed updating.

Differential Revision: https://reviews.llvm.org/D107638

show more ...


Revision tags: llvmorg-13.0.0-rc2
# c140ff49 10-Aug-2021 David Green <david.green@arm.com>

[ARM] Change a couple of instances of LiveRegs.contains to !LiveRegs.available

This changes a couple of calls to LiveRegs.contains to
!LiveRegs.available, one in Thumb1FrameLoweringInfo (which modif

[ARM] Change a couple of instances of LiveRegs.contains to !LiveRegs.available

This changes a couple of calls to LiveRegs.contains to
!LiveRegs.available, one in Thumb1FrameLoweringInfo (which modifies a
test to look more correct to me, given r7 should be the frame pointer so
is not available), and another in the ARMLoadStoreOptimizer, that I
don't have a test for, it was just found by inspection.

Differential Revision: https://reviews.llvm.org/D107454

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init
# 010f8e30 26-Jul-2021 David Green <david.green@arm.com>

[ARM] Ensure correct regclass in distributing postinc

The register class required for some MVE loads/stores is more
constrained than the register we use when creating postinc. Make sure we
constrain

[ARM] Ensure correct regclass in distributing postinc

The register class required for some MVE loads/stores is more
constrained than the register we use when creating postinc. Make sure we
constrain the register class to keep the code correct.

show more ...


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3
# 03892a27 24-Feb-2021 David Green <david.green@arm.com>

[ARM] Expand the range of allowed post-incs in load/store optimizer

Currently the load/store optimizer will only fold in increments of the
same size as the load/store. This patch expands that to any

[ARM] Expand the range of allowed post-incs in load/store optimizer

Currently the load/store optimizer will only fold in increments of the
same size as the load/store. This patch expands that to any legal
immediate for the post-inc instruction.

This is a recommit of 3b34b06fc5908b with correctness fixes and extra
tests.

Differential Revision: https://reviews.llvm.org/D95885

show more ...


Revision tags: llvmorg-12.0.0-rc2
# 7a5c26e9 19-Feb-2021 David Green <david.green@arm.com>

Revert "[ARM] Expand the range of allowed post-incs in load/store optimizer"

This reverts commit 3b34b06fc5908b4f7dc720c0655d5756bd8e2a28 as runtime
errors were reported.


# 3b34b06f 18-Feb-2021 David Green <david.green@arm.com>

[ARM] Expand the range of allowed post-incs in load/store optimizer

Currently the load/store optimizer will only fold in increments of the
same size as the load/store. This patch expands that to any

[ARM] Expand the range of allowed post-incs in load/store optimizer

Currently the load/store optimizer will only fold in increments of the
same size as the load/store. This patch expands that to any legal
immediate for the post-inc instruction.

Differential Revision: https://reviews.llvm.org/D95885

show more ...


# a838a4f6 15-Feb-2021 David Green <david.green@arm.com>

[ARM] Extend search for increment in load/store optimizer

Currently the findIncDecAfter will only look at the next instruction for
post-inc candidates in the load/store optimizer. This extends that

[ARM] Extend search for increment in load/store optimizer

Currently the findIncDecAfter will only look at the next instruction for
post-inc candidates in the load/store optimizer. This extends that to a
search through the current BB, until an instruction that modifies or
uses the increment reg is found. This allows more post-inc load/stores
and ldm/stm's to be created, especially in cases where a schedule might
move instructions further apart.

We make sure not to look any further for an SP, as that might invalidate
stack slots that are still in use.

Differential Revision: https://reviews.llvm.org/D95881

show more ...


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 2e17d9c0 11-Jan-2021 Stephan Herhut <herhut@google.com>

[ARM] Add uses for locals introduced for debug messages. NFC.

This adds uses for locals introduced for new debug messages for the load store optimizer. Those locals are only used on debug statements

[ARM] Add uses for locals introduced for debug messages. NFC.

This adds uses for locals introduced for new debug messages for the load store optimizer. Those locals are only used on debug statements and otherwise create unused variable warnings.

Differential Revision: https://reviews.llvm.org/D94398

show more ...


# 8165a034 11-Jan-2021 David Green <david.green@arm.com>

[ARM] Add debug messages for the load store optimizer. NFC


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 7f7993e0 17-Sep-2020 David Green <david.green@arm.com>

[ARM] Expand distributing increments to also handle existing pre/post inc instructions.

This extends the distributing postinc code in load/store optimizer to
also handle the case where there is an e

[ARM] Expand distributing increments to also handle existing pre/post inc instructions.

This extends the distributing postinc code in load/store optimizer to
also handle the case where there is an existing pre/post inc instruction,
where subsequent instructions can be modified to use the adjusted
offset from the increment. This can save us having to keep the old
register live past the increment instruction.

Differential Revision: https://reviews.llvm.org/D83377

show more ...


Revision tags: llvmorg-11.0.0-rc2
# fd69df62 01-Aug-2020 David Green <david.green@arm.com>

[ARM] Distribute post-inc for Thumb2 sign/zero extending loads/stores

This adds sign/zero extending scalar loads/stores to the MVE
instructions added in D77813, allowing us to create up more post-in

[ARM] Distribute post-inc for Thumb2 sign/zero extending loads/stores

This adds sign/zero extending scalar loads/stores to the MVE
instructions added in D77813, allowing us to create up more post-inc
instructions. These are comparatively simple, compared to LDR/STR (which
may be better turned into an LDRD/LDM), but still require some additions
over MVE instructions. Because there are i12 and i8 variants of the
offset loads/stores dealing with different signs, we may need to convert
an i12 address to a i8 negative instruction. t2LDRBi12 can also be
shrunk to a tLDRi under the right conditions, so we need to be careful
with codesize too.

Differential Revision: https://reviews.llvm.org/D78625

show more ...


Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# f33e86df 22-Apr-2020 Haojian Wu <hokein.wu@gmail.com>

Fix -Wunused-variable error.


12345678910>>...15