memory-legalizer-local-singlethread.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/AMDGPU/memory-legalizer-local-singlethread.ll

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 6548b635	09-Nov-2024	Shilei Tian <i@tianshilei.me>	Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
# ca33649a	08-Nov-2024	Shilei Tian <i@tianshilei.me>	Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.
# e215a1e2	08-Nov-2024	Shilei Tian <i@tianshilei.me>	[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0
# 86627149	04-Sep-2024	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Mitigate GFX12 VALU read SGPR hazard (#100067) Any SGPR read by a VALU can potentially obscure SALU writes to the same register. Insert s_wait_alu instructions to mitigate the hazard on a [AMDGPU] Mitigate GFX12 VALU read SGPR hazard (#100067) Any SGPR read by a VALU can potentially obscure SALU writes to the same register. Insert s_wait_alu instructions to mitigate the hazard on affected paths. Compute a global cache of SGPRs with any VALU reads and use this to avoid inserting mitigation for SGPRs never accessed by VALUs. To avoid excessive search when compile time is priority implement secondary mode where all SALU writes are mitigated. Co-authored-by: Shilei Tian <shilei.tian@amd.com> show more ...
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5
# 0b21b25e	01-May-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Do not optimize away pre-existing waitcnt instructions at -O0 (#90716) The autogenerated memory legalizer tests use -O0 so this allows us to see the exact waitcnts that were inserted by th [AMDGPU] Do not optimize away pre-existing waitcnt instructions at -O0 (#90716) The autogenerated memory legalizer tests use -O0 so this allows us to see the exact waitcnts that were inserted by the memory legalizer without them being optimized away. show more ...
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 982e9022	04-Mar-2024	Mirko Brkušanin <Mirko.Brkusanin@amd.com>	[AMDGPU] Add GFX12 memory legalizer tests (#83814)
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# ac296b69	22-Jan-2024	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[AMDGPU] Drop verify from SIMemoryLegalizer tests (#78697) SIMemoryLegalizer tests were slow, with most of them taking 4.5 to 5.3s to complete and that's on a fast machine. I also recall seeing the [AMDGPU] Drop verify from SIMemoryLegalizer tests (#78697) SIMemoryLegalizer tests were slow, with most of them taking 4.5 to 5.3s to complete and that's on a fast machine. I also recall seeing them in the slowest tests list on build bots. This removes the verify-machineinstrs option from these tests to speed them up, bringing the slowest test down to +-2s. Verifier still runs in EXPENSIVE_CHECKS builds. show more ...
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 8dce4333	05-Dec-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Bulk update memory legalizer tests to use opaque pointers
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1
# 4c4db816	30-Jul-2022	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Extend SILoadStoreOptimizer to s_load instructions Apply merging to s_load as is done for s_buffer_load. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D130742
Revision tags: llvmorg-16-init
# d1af09ad	23-Jun-2022	Joe Nash <Joseph.Nash@amd.com>	[AMDGPU] gfx11 Generate VOPD Instructions We form VOPD instructions in the GCNCreateVOPD pass by combining back-to-back component instructions. There are strict register constraints for creating a [AMDGPU] gfx11 Generate VOPD Instructions We form VOPD instructions in the GCNCreateVOPD pass by combining back-to-back component instructions. There are strict register constraints for creating a legal VOPD, namely that the matching operands (e.g. src0x and src0y, src1x and src1y) must be in different register banks. We add a PostRA scheduler mutation to put possible VOPD components back-to-back. Depends on D128442, D128270 Reviewed By: #amdgpu, rampitec Differential Revision: https://reviews.llvm.org/D128656 show more ...
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5
# a3fc8adb	09-Jun-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Add GFX11 test coverage for the memory legalizer
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 47bac63d	08-Mar-2022	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] gfx940 memory model Differential Revision: https://reviews.llvm.org/D121242
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3
# 89c447e4	16-Jan-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Stop reserving 36-bytes before kernel arguments for amdpal This was inheriting the mesa behavior, and as far as I know nobody is using opencl kernels with amdpal. The isMesaKernel check was AMDGPU: Stop reserving 36-bytes before kernel arguments for amdpal This was inheriting the mesa behavior, and as far as I know nobody is using opencl kernels with amdpal. The isMesaKernel check was irrelevant because this property needs to be held for all functions. show more ...
Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# da067ed5	10-Nov-2021	Austin Kerbow <Austin.Kerbow@amd.com>	[AMDGPU] Set most sched model resource's BufferSize to one Using a BufferSize of one for memory ProcResources will result in better ILP since it more accurately models the dependencies between memor [AMDGPU] Set most sched model resource's BufferSize to one Using a BufferSize of one for memory ProcResources will result in better ILP since it more accurately models the dependencies between memory ops and their consumers on an in-order processor. After this change, the scheduler will treat the data edges from loads as blocking so that stalls are guaranteed when waiting for data to be retreaved from memory. Since we don't actually track waitcnt here, this should do a better job at modeling their behavior. Practically, this means that the scheduler will trigger the 'STALL' heuristic more often. This type of change needs to be evaluated experimentally. Preliminary results are positive. Fixes: SWDEV-282962 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D114777 show more ...
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 53eb4691	23-Jul-2021	Tony Tye <Tony.Tye@amd.com>	[AMDGPU] Support non-strictly stronger memory orderings in SIMemoryLegalizer C++20 no longer requires the failure memory ordering to be no stronger than the success memory ordering. Adjust assert in [AMDGPU] Support non-strictly stronger memory orderings in SIMemoryLegalizer C++20 no longer requires the failure memory ordering to be no stronger than the success memory ordering. Adjust assert in AMD GPU SIMemoryLegalizer, and merge instruction memory orderings Add common operation to merge memory orders that allows non strict memory orderings to be combined. Use it in SIMemoryLegalizer and MachineMemOperand::getMergedOrdering. Reviewed By: efriedma, rampitec Differential Revision: https://reviews.llvm.org/D106729 show more ...
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# a8d9d507	17-Feb-2021	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 61177943	24-Dec-2020	Praveen Velliengiri <Praveen.Velliengiri@amd.com>	[AMDGPU] Use MUBUF instructions for global address space access Currently, the compiler crashes in instruction selection of global load/stores in gfx600 due to the lack of FLAT instructions. This pa [AMDGPU] Use MUBUF instructions for global address space access Currently, the compiler crashes in instruction selection of global load/stores in gfx600 due to the lack of FLAT instructions. This patch fix the crash by selecting MUBUF instructions for global load/stores in gfx600. Authored-by: Praveen Velliengiri <Praveen.Velliengiri@amd.com> Reviewed by: t-tye Differential revision: https://reviews.llvm.org/D92483 show more ...
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2
# f6b9afae	30-Nov-2020	Scott Linder <Scott.Linder@amd.com>	[AMDGPU] Extend and reorganize memory legalizer tests * Rename some tests to try to make a convention (where all components are optional) of: <addrspace>_<syncscope>_<memory-orders>_<operatio [AMDGPU] Extend and reorganize memory legalizer tests * Rename some tests to try to make a convention (where all components are optional) of: <addrspace>_<syncscope>_<memory-orders>_<operation> * Split up at a level of granularity appropriate for the different RUN lines (i.e. split on addrspace so GFX6 can avoid FLAT) and that makes running a specific test reasonable in terms of wall time taken. This also means when run as part of the test suite the testing is not one serial bottleneck. * Auto-generate check lines with `update_llc_test_checks.py` to make future maintenance more tractable. Reviewed By: rampitec, t-tye Differential Revision: https://reviews.llvm.org/D91545 show more ...