stack-realign-kernel.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/AMDGPU/stack-realign-kernel.ll

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# b1bcb7ca	15-Jul-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commit adaff46d087799 Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commit adaff46d087799072438dd744b038e6fd50a2d78. Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive. show more ...
# adaff46d	15-Jul-2024	dyung <douglas.yung@sony.com>	Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commits 677cc15e0ff2e0 Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and 78bc1b64a6dc3fb6191355a5e1b502be8b3668e7. The test CodeGenHIP/default-attributes.hip is failing on multiple bots even after the attempted fix including the following: - https://lab.llvm.org/buildbot/#/builders/3/builds/1473 - https://lab.llvm.org/buildbot/#/builders/65/builds/1380 - https://lab.llvm.org/buildbot/#/builders/161/builds/595 - https://lab.llvm.org/buildbot/#/builders/154/builds/1372 - https://lab.llvm.org/buildbot/#/builders/133/builds/1547 - https://lab.llvm.org/buildbot/#/builders/81/builds/755 - https://lab.llvm.org/buildbot/#/builders/40/builds/570 - https://lab.llvm.org/buildbot/#/builders/13/builds/748 - https://lab.llvm.org/buildbot/#/builders/12/builds/1845 - https://lab.llvm.org/buildbot/#/builders/11/builds/1695 - https://lab.llvm.org/buildbot/#/builders/190/builds/1829 - https://lab.llvm.org/buildbot/#/builders/193/builds/962 - https://lab.llvm.org/buildbot/#/builders/23/builds/991 - https://lab.llvm.org/buildbot/#/builders/144/builds/2256 - https://lab.llvm.org/buildbot/#/builders/46/builds/1614 These bots have been broken for a day, so reverting to get everything back to green. show more ...
# 78bc1b64	14-Jul-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Move attributor into optimization pipeline (#83131) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. AMDGPU: Move attributor into optimization pipeline (#83131) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. Mostly mechanical, but there are some creative test updates. I preferred to take the changes as-is in tests where the ABI isn't relevant. In cases where it's more relevant, or the optimize out logic was too ingrained in the test, I pre-run the optimization. Some cases manually add attributes to disable inputs. show more ...
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 4490003a	06-Mar-2024	Emma Pilkington <emma.pilkington95@gmail.com>	[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905) The previous name 'amdgpu_code_object_version', was misleading since this is really a property of the HSA OS. The new spelling [AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905) The previous name 'amdgpu_code_object_version', was misleading since this is really a property of the HSA OS. The new spelling also matches the asm directive I added in bc82cfb. show more ...
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# bc6955f1	09-Feb-2024	Diana Picus <Diana-Magda.Picus@amd.com>	[AMDGPU] Don't fix the scavenge slot at offset 0 (#79136) At the moment, the emergency spill slot is a fixed object for entry functions and chain functions, and a regular stack object otherwise. T [AMDGPU] Don't fix the scavenge slot at offset 0 (#79136) At the moment, the emergency spill slot is a fixed object for entry functions and chain functions, and a regular stack object otherwise. This patch adopts the latter behaviour for entry/chain functions too. It seems this was always the intention [1] and it will also save us a bit of stack space in cases where the first stack object has a large alignment. [1] https://github.com/llvm/llvm-project/commit/34c8b835b16fb3879f1b9770e91df21883356bb6 show more ...
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# b629b866	09-Jan-2024	Shilei Tian <i@tianshilei.me>	[AMDGPU][MC] Use normal ELF syntax for section switching (#77267) For some reasons `SunStyleELFSectionSwitchSyntax` is set to `true` for AMDGPU, but according to https://github.com/llvm/llvm-proje [AMDGPU][MC] Use normal ELF syntax for section switching (#77267) For some reasons `SunStyleELFSectionSwitchSyntax` is set to `true` for AMDGPU, but according to https://github.com/llvm/llvm-project/issues/64862#issuecomment-1880419239 that syntax is only limited to Sun system. Fix #64862. show more ...
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3
# 4428b01f	16-Oct-2023	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	Reland: [AMDGPU] Remove Code Object V3 (#67118) V3 has been deprecated for a while as well, so it can safely be removed like V2 was removed. - [Clang] Set minimum code object version to 4 - [lld] F Reland: [AMDGPU] Remove Code Object V3 (#67118) V3 has been deprecated for a while as well, so it can safely be removed like V2 was removed. - [Clang] Set minimum code object version to 4 - [lld] Fix tests using code object v3 - Remove code object V3 from the AMDGPU backend, and delete or port v3 tests to v4. - Update docs to make it clear V3 can no longer be emitted. show more ...
# 868abf09	18-Oct-2023	pvanhout <pierre.vanhoutryve@amd.com>	Revert "[AMDGPU] Remove Code Object V3 (#67118)" This reverts commit 544d91280c26fd5f7acd70eac4d667863562f4cc.
# 544d9128	16-Oct-2023	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[AMDGPU] Remove Code Object V3 (#67118) V3 has been deprecated for a while as well, so it can safely be removed like V2 was removed. - [Clang] Set minimum code object version to 4 - [lld] Fix t [AMDGPU] Remove Code Object V3 (#67118) V3 has been deprecated for a while as well, so it can safely be removed like V2 was removed. - [Clang] Set minimum code object version to 4 - [lld] Fix tests using code object v3 - Remove code object V3 from the AMDGPU backend, and delete or port v3 tests to v4. - Update docs to make it clear V3 can no longer be emitted. show more ...
Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# 54cf69c9	03-Feb-2023	Changpeng Fang <changpeng.fang@amd.com>	AMDGPU: Use module flag to get code object version at IR level Summary: This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command l AMDGPU: Use module flag to get code object version at IR level Summary: This patch introduces a mechanism to check the code object version from the module flag, This avoids checking from command line. In case the module flag is missing, we use the current default code object version supported in the compiler. For tools whose inputs are not IR, we may need other approach (directive, for example) to check the code object version, That will be in a separate patch later. For LIT tests update, we directly add module flag if there is only a single code object version associated with all checks in one file. In cause of multiple code object version in one file, we use the "sed" method to "clone" the checks to achieve the goal. Reviewer: arsenm Differential Revision: https://reviews.llvm.org/D14313 show more ...
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 75a3f22a	02-Dec-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Convert some stack tests to opaque pointers
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2
# 3d9f011a	27-Sep-2022	Abinav Puthan Purayil <abinavpp@gmail.com>	[AMDGPU] Make the uses_dynamic_stack field in the kernel descriptor and the metadata map specific to code object v5 and later Unfortunately, we have a broken handling of this in the runtime of rocm [AMDGPU] Make the uses_dynamic_stack field in the kernel descriptor and the metadata map specific to code object v5 and later Unfortunately, we have a broken handling of this in the runtime of rocm 5.3. The runtime is expected to handle this correctly when v5 becomes the default. Differential Revision: https://reviews.llvm.org/D134714 show more ...
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# d96361d7	17-Jun-2022	Abinav Puthan Purayil <abinavpp@gmail.com>	[AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map This change introduces the dynamic stack boolean field to code-object-v3 and above under the code prope [AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map This change introduces the dynamic stack boolean field to code-object-v3 and above under the code properties of the kernel descriptor and under the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to the is_dynamic_callstack field of amd_kernel_code_t. Differential Revision: https://reviews.llvm.org/D128344 show more ...
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init
# e6564f39	21-Jan-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Emit user sgpr count directives in text asm We were emitting these in the object file but not printing them.
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 4359b870	14-Jul-2021	Sebastian Neubauer <sebastian.neubauer@amd.com>	[AMDGPU] Init scratch only if necessary If no scratch or flat instructions are used, we do not need to initialize the flat scratch hardware register. Differential Revision: https://reviews.llvm.org [AMDGPU] Init scratch only if necessary If no scratch or flat instructions are used, we do not need to initialize the flat scratch hardware register. Differential Revision: https://reviews.llvm.org/D105920 show more ...
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 96e1fcb1	07-Jun-2021	Sebastian Neubauer <sebastian.neubauer@amd.com>	[AMDGPU] Use s_add_i32 for address additions This allows to convert the add instruction to s_addk_i32 and v_add_nc_u32 instead of needing v_add_co_u32 when converting to a VALU instruction. Differe [AMDGPU] Use s_add_i32 for address additions This allows to convert the add instruction to s_addk_i32 and v_add_nc_u32 instead of needing v_add_co_u32 when converting to a VALU instruction. Differential Revision: https://reviews.llvm.org/D103322 show more ...
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# f4ace637	24-Mar-2021	Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>	AMDGPU: Add target id and code object v4 support - Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDG AMDGPU: Add target id and code object v4 support - Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object) - Add kernarg_size to kernel descriptor - Change trap handler ABI to no longer move queue pointer into s[0:1] - Cleanup ELF definitions - Add V2, V3, V4 suffixes to make a clear distinction for code object version - Consolidate note names Differential Revision: https://reviews.llvm.org/D95638 show more ...
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 2f499b9a	19-Dec-2020	Tony <Tony.Tye@amd.com>	[AMDGPU] Add volatile support to SIMemoryLegalizer Treat a non-atomic volatile load and store as a relaxed atomic at system scope for the address spaces accessed. This will ensure all relevant cache [AMDGPU] Add volatile support to SIMemoryLegalizer Treat a non-atomic volatile load and store as a relaxed atomic at system scope for the address spaces accessed. This will ensure all relevant caches will be bypassed. A volatile atomic is not changed and still only bypasses caches upto the level specified by the SyncScope operand. Differential Revision: https://reviews.llvm.org/D94214 show more ...
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 75cf3091	01-Nov-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Assume f32 denormals are enabled by default This will likely introduce catastrophic performance regressions on older subtargets, but should be correct. A follow up change will remove the old AMDGPU: Assume f32 denormals are enabled by default This will likely introduce catastrophic performance regressions on older subtargets, but should be correct. A follow up change will remove the old fp32-denormals subtarget features, and switch to using the new denormal-fp-math/denormal-fp-math-f32 attributes. Frontends should be making sure to add the denormal-fp-math-f32 attribute when appropriate to avoid performance regressions. show more ...
# 60b1967c	21-Jan-2020	Scott Linder <Scott.Linder@amd.com>	[AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in the entry function prologue. This allows us t [AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in the entry function prologue. This allows us to removes the scratch wave offset register from the calling convention ABI. As part of this change, allow the use of an inline constant zero for the SOffset of MUBUF instructions accessing the stack in entry functions when a frame pointer is not requested/required. Entry functions with calls still need to set up the calling convention ABI stack pointer register, and reference it in order to address arguments of called functions. The ABI stack pointer register remains unswizzled, but is now wave-relative instead of queue-relative. Non-entry functions also use an inline constant zero SOffset for wave-relative scratch access, but continue to use the stack and frame pointers as before. When the stack or frame pointer is converted to a swizzled offset it is now scaled directly, as the scratch wave offset no longer needs to be subtracted first. Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling convention. Tags: #llvm Differential Revision: https://reviews.llvm.org/D75138 show more ...
# 2c378339	24-Oct-2019	Scott Linder <scott@scottlinder.com>	[AMDGPU] Clean up update_llc_test_checks CodeGen tests Summary: Some tests have been hand edited without removing the update_llc_test_checks header, some have slightly outdated CHECK lines which sti [AMDGPU] Clean up update_llc_test_checks CodeGen tests Summary: Some tests have been hand edited without removing the update_llc_test_checks header, some have slightly outdated CHECK lines which still pass, and some have additional comments which update_llc_test_checks pushes towards the function body. Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69402 show more ...
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2
# 0ceda9fb	03-Jun-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Disable stack realignment for kernels This is something of a workaround, and the state of stack realignment controls is kind of a mess. Ideally, we would be able to specify the stack is infi AMDGPU: Disable stack realignment for kernels This is something of a workaround, and the state of stack realignment controls is kind of a mess. Ideally, we would be able to specify the stack is infinitely aligned on entry to a kernel. TargetFrameLowering provides multiple controls which apply at different points. The StackRealignable field is used during SelectionDAG, and for some reason distinct from this hook. StackAlignment is a single field not dependent on the function. It would probably be better to make that dependent on the calling convention, and the maximum value for kernels. Currently this doesn't really change anything, since the frame lowering mostly does its own thing. This helps avoid regressions in a future change which will rely more heavily on hasFP. llvm-svn: 362447 show more ...