History log of /llvm-project/llvm/test/CodeGen/Thumb2/LowOverheadLoops/memcall.ll (Results 1 – 19 of 19)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# d69033d2 12-Jul-2023 Nikita Popov <npopov@redhat.com>

[SCEVExpander] Fix GEP IV inc reuse logic for opaque pointers

Instead of checking the pointer type, check the element type of
the GEP.

Previously we ended up reusing GEP increments that were not in

[SCEVExpander] Fix GEP IV inc reuse logic for opaque pointers

Instead of checking the pointer type, check the element type of
the GEP.

Previously we ended up reusing GEP increments that were not in
expanded form, thus not respecting LSRs choice of representation.

The change in 2011-10-06-ReusePhi.ll recovers a regression that
appeared when converting that test to opaque pointers.

Changes in various Thumb tests now compute the step outside the
loop instead of using add.w inside the loop, which is LSR's
preferred representation for this target.

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5
# c4a60c9d 25-May-2023 sgokhale <sgokhale@nvidia.com>

[CodeGen][ShrinkWrap] Enable PostShrinkWrap by default

This is an attempt to reland D42600 and enabling this optimisation by default.

This also resolves the issue pointed out in the context of PGO

[CodeGen][ShrinkWrap] Enable PostShrinkWrap by default

This is an attempt to reland D42600 and enabling this optimisation by default.

This also resolves the issue pointed out in the context of PGO build.

Differential Revision: https://reviews.llvm.org/D42600

show more ...


Revision tags: llvmorg-16.0.4
# f4999d35 08-May-2023 Alan Zhao <ayzhao@google.com>

Revert "[CodeGen][ShrinkWrap] Split restore point"

This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4.

The original commit causes a Chrome build assertion failure with
ThinLTO: https://cr

Revert "[CodeGen][ShrinkWrap] Split restore point"

This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4.

The original commit causes a Chrome build assertion failure with
ThinLTO: https://crbug.com/1443635

show more ...


# 1ddfd1c8 08-May-2023 sgokhale <sgokhale@nvidia.com>

[CodeGen][ShrinkWrap] Split restore point

Try to reland D42600

Differential Revision: https://reviews.llvm.org/D42600


Revision tags: llvmorg-16.0.3, llvmorg-16.0.2
# bb5befef 13-Apr-2023 sgokhale <sgokhale@nvidia.com>

Revert "[CodeGen][ShrinkWrap] Split restore point"

This reverts commit 5f0bccc3d1a74111458c71f009817c9995f4bf83.

An issue has been reported here: https://github.com/ClangBuiltLinux/linux/issues/1833


# 5f0bccc3 11-Apr-2023 sgokhale <sgokhale@nvidia.com>

[CodeGen][ShrinkWrap] Split restore point

This patch splits a restore point to allow it to only post-dominate blocks reachable by use
or def of CSRs(Callee Saved Registers)/FI(Frame Index).

Benchma

[CodeGen][ShrinkWrap] Split restore point

This patch splits a restore point to allow it to only post-dominate blocks reachable by use
or def of CSRs(Callee Saved Registers)/FI(Frame Index).

Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change
for others.

Co-authored-by: junbuml

Differential Revision: https://reviews.llvm.org/D42600

show more ...


Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# 6da3cfc3 30-Jan-2023 Sergei Barannikov <barannikov88@gmail.com>

[Thumb2] Convert some tests to opaque pointers (NFC)


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 9ed2f14c 14-Dec-2022 Nikita Popov <npopov@redhat.com>

[AsmParser] Remove typed pointer auto-detection

IR is now always parsed in opaque pointer mode, unless
-opaque-pointers=0 is explicitly given. There is no automatic
detection of typed pointers anymo

[AsmParser] Remove typed pointer auto-detection

IR is now always parsed in opaque pointer mode, unless
-opaque-pointers=0 is explicitly given. There is no automatic
detection of typed pointers anymore.

The -opaque-pointers=0 option is added to any remaining IR tests
that haven't been migrated yet.

Differential Revision: https://reviews.llvm.org/D141912

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4
# 88ac25b3 27-Oct-2022 John Brawn <john.brawn@arm.com>

[MachineCSE] Allow PRE of instructions that read physical registers

Currently MachineCSE forbids PRE when the instruction reads a physical
register. Relax this so that it's allowed when the value be

[MachineCSE] Allow PRE of instructions that read physical registers

Currently MachineCSE forbids PRE when the instruction reads a physical
register. Relax this so that it's allowed when the value being read is
the same as what would be read in the place the instruction would be
hoisted to.

This is being done in preparation for adding FPCR handling to the
AArch64 backend, in order to prevent it to from worsening the
generated code, but for targets that already have a similar register
it should improve things.

This patch affects code generation in several tests. The new code
looks better except for in Thumb2/LowOverheadLoops/memcall.ll where
we perform PRE but the LowOverheadLoops transformation then undoes
it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse,
but actually the function as a whole is better (as a MOV is PRE'd).

Differential Revision: https://reviews.llvm.org/D136675

show more ...


# 7a7b36e9 28-Oct-2022 John Brawn <john.brawn@arm.com>

Revert "[MachineCSE] Allow PRE of instructions that read physical registers"

This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a.

This is causing a miscompile in ffmpeg when compiled for a

Revert "[MachineCSE] Allow PRE of instructions that read physical registers"

This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a.

This is causing a miscompile in ffmpeg when compiled for armv7.

show more ...


# 628467e5 27-Oct-2022 John Brawn <john.brawn@arm.com>

[MachineCSE] Allow PRE of instructions that read physical registers

Currently MachineCSE forbids PRE when the instruction reads a physical
register. Relax this so that it's allowed when the value be

[MachineCSE] Allow PRE of instructions that read physical registers

Currently MachineCSE forbids PRE when the instruction reads a physical
register. Relax this so that it's allowed when the value being read is
the same as what would be read in the place the instruction would be
hoisted to.

This is being done in preparation for adding FPCR handling to the
AArch64 backend, in order to prevent it to from worsening the
generated code, but for targets that already have a similar register
it should improve things.

This patch affects code generation in several tests. The new code
looks better except for in Thumb2/LowOverheadLoops/memcall.ll where
we perform PRE but the LowOverheadLoops transformation then undoes
it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse,
but actually the function as a whole is better (as a MOV is PRE'd).

Differential Revision: https://reviews.llvm.org/D136675

show more ...


Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# bee2f618 13-Jun-2021 David Green <david.green@arm.com>

[ARM] Introduce t2WhileLoopStartTP

This adds t2WhileLoopStartTP, similar to the t2DoLoopStartTP added in
D90591. It keeps a reference to both the tripcount register and the
element count register, s

[ARM] Introduce t2WhileLoopStartTP

This adds t2WhileLoopStartTP, similar to the t2DoLoopStartTP added in
D90591. It keeps a reference to both the tripcount register and the
element count register, so that the ARMLowOverheadLoops pass in the
backend can pick the correct one without having to search for it from
the operand of a VCTP.

Differential Revision: https://reviews.llvm.org/D103236

show more ...


Revision tags: llvmorg-12.0.1-rc1
# ce76093c 14-May-2021 David Green <david.green@arm.com>

[ARM] Expand predecessor search to multiple blocks when reverting WhileLoopStarts

We were previously only searching a single preheader for call
instructions when reverting WhileLoopStarts to DoLoopS

[ARM] Expand predecessor search to multiple blocks when reverting WhileLoopStarts

We were previously only searching a single preheader for call
instructions when reverting WhileLoopStarts to DoLoopStarts. This
extends that to multiple blocks that can come up when, for example a
loop is expanded from a memcpy. It also expends the instructions from
just Call's to also include other LoopStarts, to catch other low
overhead loops in the preheader.

Differential Revision: https://reviews.llvm.org/D102269

show more ...


# dfe3ffaa 06-May-2021 Malhar Jajoo <malhar.jajoo@arm.com>

[ARM] Transforming memset to Tail predicated Loop

This patch converts llvm.memset intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

[ARM] Transforming memset to Tail predicated Loop

This patch converts llvm.memset intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

The llvm.memset is converted to a TP loop for both
constant and non-constant input sizes (of llvm.memset).

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100435

show more ...


# 9ff38e2d 06-May-2021 Malhar Jajoo <malhar.jajoo@arm.com>

[ARM] Transforming memcpy to Tail predicated Loop

This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

[ARM] Transforming memcpy to Tail predicated Loop

This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

From an implementation point of view, the patch

- adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered to, during first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
on matching the above node.
- Adds a custom inserter function that expands the pseudo instruction into MIR suitable
to be (by later passes) into a WLSTP loop.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99723

show more ...


# fc690777 06-May-2021 Malhar Jajoo <malhar.jajoo@arm.com>

Revert "[ARM] Transforming memcpy to Tail predicated Loop"

Reverting commit since it causes failure (10462).
This reverts commit b856f4a232cbd43476e9b9f75c80aacfc6f5c152.


# b856f4a2 06-May-2021 Malhar Jajoo <malhar.jajoo@arm.com>

[ARM] Transforming memcpy to Tail predicated Loop

This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

[ARM] Transforming memcpy to Tail predicated Loop

This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

From an implementation point of view, the patch

- adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered to, during first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
on matching the above node.
- Adds a custom inserter function that expands the pseudo instruction into MIR suitable
to be (by later passes) into a WLSTP loop.

Note: A cli option is used to control the conversion of memcpy to TP
loop and this option is currently disabled by default. It may be enabled
in the future after further downstream testing.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99723

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# e4744994 03-Nov-2020 David Green <david.green@arm.com>

[ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops

If an instruction will be lowered to a call there is no advantage of
using a low overhead loop as the LR register will n

[ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops

If an instruction will be lowered to a call there is no advantage of
using a low overhead loop as the LR register will need to be spilled and
reloaded around the call, and the low overhead will end up being
reverted. This teaches our hardware loop lowering that these memory
intrinsics will be calls under certain situations.

Differential Revision: https://reviews.llvm.org/D90439

show more ...


# 785080e3 03-Nov-2020 David Green <david.green@arm.com>

[ARM] Low overhead loop memcpy lowering test. NFC