memcall.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/Thumb2/LowOverheadLoops/memcall.ll

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# d69033d2	12-Jul-2023	Nikita Popov <npopov@redhat.com>	[SCEVExpander] Fix GEP IV inc reuse logic for opaque pointers Instead of checking the pointer type, check the element type of the GEP. Previously we ended up reusing GEP increments that were not in [SCEVExpander] Fix GEP IV inc reuse logic for opaque pointers Instead of checking the pointer type, check the element type of the GEP. Previously we ended up reusing GEP increments that were not in expanded form, thus not respecting LSRs choice of representation. The change in 2011-10-06-ReusePhi.ll recovers a regression that appeared when converting that test to opaque pointers. Changes in various Thumb tests now compute the step outside the loop instead of using add.w inside the loop, which is LSR's preferred representation for this target. show more ...
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5
# c4a60c9d	25-May-2023	sgokhale <sgokhale@nvidia.com>	[CodeGen][ShrinkWrap] Enable PostShrinkWrap by default This is an attempt to reland D42600 and enabling this optimisation by default. This also resolves the issue pointed out in the context of PGO [CodeGen][ShrinkWrap] Enable PostShrinkWrap by default This is an attempt to reland D42600 and enabling this optimisation by default. This also resolves the issue pointed out in the context of PGO build. Differential Revision: https://reviews.llvm.org/D42600 show more ...
Revision tags: llvmorg-16.0.4
# f4999d35	08-May-2023	Alan Zhao <ayzhao@google.com>	Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4. The original commit causes a Chrome build assertion failure with ThinLTO: https://cr Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4. The original commit causes a Chrome build assertion failure with ThinLTO: https://crbug.com/1443635 show more ...
# 1ddfd1c8	08-May-2023	sgokhale <sgokhale@nvidia.com>	[CodeGen][ShrinkWrap] Split restore point Try to reland D42600 Differential Revision: https://reviews.llvm.org/D42600
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2
# bb5befef	13-Apr-2023	sgokhale <sgokhale@nvidia.com>	Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 5f0bccc3d1a74111458c71f009817c9995f4bf83. An issue has been reported here: https://github.com/ClangBuiltLinux/linux/issues/1833
# 5f0bccc3	11-Apr-2023	sgokhale <sgokhale@nvidia.com>	[CodeGen][ShrinkWrap] Split restore point This patch splits a restore point to allow it to only post-dominate blocks reachable by use or def of CSRs(Callee Saved Registers)/FI(Frame Index). Benchma [CodeGen][ShrinkWrap] Split restore point This patch splits a restore point to allow it to only post-dominate blocks reachable by use or def of CSRs(Callee Saved Registers)/FI(Frame Index). Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change for others. Co-authored-by: junbuml Differential Revision: https://reviews.llvm.org/D42600 show more ...
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# 6da3cfc3	30-Jan-2023	Sergei Barannikov <barannikov88@gmail.com>	[Thumb2] Convert some tests to opaque pointers (NFC)
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 9ed2f14c	14-Dec-2022	Nikita Popov <npopov@redhat.com>	[AsmParser] Remove typed pointer auto-detection IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymo [AsmParser] Remove typed pointer auto-detection IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymore. The -opaque-pointers=0 option is added to any remaining IR tests that haven't been migrated yet. Differential Revision: https://reviews.llvm.org/D141912 show more ...
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4
# 88ac25b3	27-Oct-2022	John Brawn <john.brawn@arm.com>	[MachineCSE] Allow PRE of instructions that read physical registers Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value be [MachineCSE] Allow PRE of instructions that read physical registers Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value being read is the same as what would be read in the place the instruction would be hoisted to. This is being done in preparation for adding FPCR handling to the AArch64 backend, in order to prevent it to from worsening the generated code, but for targets that already have a similar register it should improve things. This patch affects code generation in several tests. The new code looks better except for in Thumb2/LowOverheadLoops/memcall.ll where we perform PRE but the LowOverheadLoops transformation then undoes it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse, but actually the function as a whole is better (as a MOV is PRE'd). Differential Revision: https://reviews.llvm.org/D136675 show more ...
# 7a7b36e9	28-Oct-2022	John Brawn <john.brawn@arm.com>	Revert "[MachineCSE] Allow PRE of instructions that read physical registers" This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a. This is causing a miscompile in ffmpeg when compiled for a Revert "[MachineCSE] Allow PRE of instructions that read physical registers" This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a. This is causing a miscompile in ffmpeg when compiled for armv7. show more ...
# 628467e5	27-Oct-2022	John Brawn <john.brawn@arm.com>	[MachineCSE] Allow PRE of instructions that read physical registers Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value be [MachineCSE] Allow PRE of instructions that read physical registers Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value being read is the same as what would be read in the place the instruction would be hoisted to. This is being done in preparation for adding FPCR handling to the AArch64 backend, in order to prevent it to from worsening the generated code, but for targets that already have a similar register it should improve things. This patch affects code generation in several tests. The new code looks better except for in Thumb2/LowOverheadLoops/memcall.ll where we perform PRE but the LowOverheadLoops transformation then undoes it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse, but actually the function as a whole is better (as a MOV is PRE'd). Differential Revision: https://reviews.llvm.org/D136675 show more ...
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# bee2f618	13-Jun-2021	David Green <david.green@arm.com>	[ARM] Introduce t2WhileLoopStartTP This adds t2WhileLoopStartTP, similar to the t2DoLoopStartTP added in D90591. It keeps a reference to both the tripcount register and the element count register, s [ARM] Introduce t2WhileLoopStartTP This adds t2WhileLoopStartTP, similar to the t2DoLoopStartTP added in D90591. It keeps a reference to both the tripcount register and the element count register, so that the ARMLowOverheadLoops pass in the backend can pick the correct one without having to search for it from the operand of a VCTP. Differential Revision: https://reviews.llvm.org/D103236 show more ...
Revision tags: llvmorg-12.0.1-rc1
# ce76093c	14-May-2021	David Green <david.green@arm.com>	[ARM] Expand predecessor search to multiple blocks when reverting WhileLoopStarts We were previously only searching a single preheader for call instructions when reverting WhileLoopStarts to DoLoopS [ARM] Expand predecessor search to multiple blocks when reverting WhileLoopStarts We were previously only searching a single preheader for call instructions when reverting WhileLoopStarts to DoLoopStarts. This extends that to multiple blocks that can come up when, for example a loop is expanded from a memcpy. It also expends the instructions from just Call's to also include other LoopStarts, to catch other low overhead loops in the preheader. Differential Revision: https://reviews.llvm.org/D102269 show more ...
# dfe3ffaa	06-May-2021	Malhar Jajoo <malhar.jajoo@arm.com>	[ARM] Transforming memset to Tail predicated Loop This patch converts llvm.memset intrinsic into Tail Predicated Hardware loops for a target that supports the Arm M-profile Vector Extension (MVE). [ARM] Transforming memset to Tail predicated Loop This patch converts llvm.memset intrinsic into Tail Predicated Hardware loops for a target that supports the Arm M-profile Vector Extension (MVE). The llvm.memset is converted to a TP loop for both constant and non-constant input sizes (of llvm.memset). Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100435 show more ...
# 9ff38e2d	06-May-2021	Malhar Jajoo <malhar.jajoo@arm.com>	[ARM] Transforming memcpy to Tail predicated Loop This patch converts llvm.memcpy intrinsic into Tail Predicated Hardware loops for a target that supports the Arm M-profile Vector Extension (MVE). [ARM] Transforming memcpy to Tail predicated Loop This patch converts llvm.memcpy intrinsic into Tail Predicated Hardware loops for a target that supports the Arm M-profile Vector Extension (MVE). From an implementation point of view, the patch - adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered to, during first phase of ISel) - adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter, on matching the above node. - Adds a custom inserter function that expands the pseudo instruction into MIR suitable to be (by later passes) into a WLSTP loop. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D99723 show more ...
# fc690777	06-May-2021	Malhar Jajoo <malhar.jajoo@arm.com>	Revert "[ARM] Transforming memcpy to Tail predicated Loop" Reverting commit since it causes failure (10462). This reverts commit b856f4a232cbd43476e9b9f75c80aacfc6f5c152.
# b856f4a2	06-May-2021	Malhar Jajoo <malhar.jajoo@arm.com>	[ARM] Transforming memcpy to Tail predicated Loop This patch converts llvm.memcpy intrinsic into Tail Predicated Hardware loops for a target that supports the Arm M-profile Vector Extension (MVE). [ARM] Transforming memcpy to Tail predicated Loop This patch converts llvm.memcpy intrinsic into Tail Predicated Hardware loops for a target that supports the Arm M-profile Vector Extension (MVE). From an implementation point of view, the patch - adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered to, during first phase of ISel) - adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter, on matching the above node. - Adds a custom inserter function that expands the pseudo instruction into MIR suitable to be (by later passes) into a WLSTP loop. Note: A cli option is used to control the conversion of memcpy to TP loop and this option is currently disabled by default. It may be enabled in the future after further downstream testing. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D99723 show more ...
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# e4744994	03-Nov-2020	David Green <david.green@arm.com>	[ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops If an instruction will be lowered to a call there is no advantage of using a low overhead loop as the LR register will n [ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops If an instruction will be lowered to a call there is no advantage of using a low overhead loop as the LR register will need to be spilled and reloaded around the call, and the low overhead will end up being reverted. This teaches our hardware loop lowering that these memory intrinsics will be calls under certain situations. Differential Revision: https://reviews.llvm.org/D90439 show more ...
# 785080e3	03-Nov-2020	David Green <david.green@arm.com>	[ARM] Low overhead loop memcpy lowering test. NFC