kill-infinite-loop.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/AMDGPU/kill-infinite-loop.ll

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# b1bcb7ca	15-Jul-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commit adaff46d087799 Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commit adaff46d087799072438dd744b038e6fd50a2d78. Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive. show more ...
# adaff46d	15-Jul-2024	dyung <douglas.yung@sony.com>	Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commits 677cc15e0ff2e0 Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and 78bc1b64a6dc3fb6191355a5e1b502be8b3668e7. The test CodeGenHIP/default-attributes.hip is failing on multiple bots even after the attempted fix including the following: - https://lab.llvm.org/buildbot/#/builders/3/builds/1473 - https://lab.llvm.org/buildbot/#/builders/65/builds/1380 - https://lab.llvm.org/buildbot/#/builders/161/builds/595 - https://lab.llvm.org/buildbot/#/builders/154/builds/1372 - https://lab.llvm.org/buildbot/#/builders/133/builds/1547 - https://lab.llvm.org/buildbot/#/builders/81/builds/755 - https://lab.llvm.org/buildbot/#/builders/40/builds/570 - https://lab.llvm.org/buildbot/#/builders/13/builds/748 - https://lab.llvm.org/buildbot/#/builders/12/builds/1845 - https://lab.llvm.org/buildbot/#/builders/11/builds/1695 - https://lab.llvm.org/buildbot/#/builders/190/builds/1829 - https://lab.llvm.org/buildbot/#/builders/193/builds/962 - https://lab.llvm.org/buildbot/#/builders/23/builds/991 - https://lab.llvm.org/buildbot/#/builders/144/builds/2256 - https://lab.llvm.org/buildbot/#/builders/46/builds/1614 These bots have been broken for a day, so reverting to get everything back to green. show more ...
# 78bc1b64	14-Jul-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Move attributor into optimization pipeline (#83131) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. AMDGPU: Move attributor into optimization pipeline (#83131) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. Mostly mechanical, but there are some creative test updates. I preferred to take the changes as-is in tests where the ABI isn't relevant. In cases where it's more relevant, or the optimize out logic was too ingrained in the test, I pre-run the optimization. Some cases manually add attributes to disable inputs. show more ...
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 9e9907f1	17-Jan-2024	Fangrui Song <i@maskray.me>	[AMDGPU,test] Change llc -march= to -mtriple= (#75982) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while [AMDGPU,test] Change llc -march= to -mtriple= (#75982) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ``` show more ...
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 4bbcbdae	04-Jan-2023	Anshil Gandhi <gandhi21299@gmail.com>	[AMDGPU] Unify divergent nodes if the PostDom tree has one root This patch allows AMDGPUUnifyDivergenceExitNodes pass to transform a function whose PDT has exactly one root and ends in a branch inst [AMDGPU] Unify divergent nodes if the PostDom tree has one root This patch allows AMDGPUUnifyDivergenceExitNodes pass to transform a function whose PDT has exactly one root and ends in a branch instruction. Fixes https://github.com/llvm/llvm-project/issues/58861. Reviewed By: ruiling, arsenm Differential Revision: https://reviews.llvm.org/D139780 show more ...
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1
# 9b8b1ac4	07-Sep-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Add a comment for a missing fold
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# fd64a857	29-Jun-2022	Thomas Symalla <thomas.symalla@amd.com>	[AMDGPU] Combine s_or_saveexec, s_xor instructions. This patch merges a consecutive sequence of s_or_saveexec s_o, s_i s_xor exec, exec, s_o into a single s_andn2_saveexec s_o, s_i instruction. T [AMDGPU] Combine s_or_saveexec, s_xor instructions. This patch merges a consecutive sequence of s_or_saveexec s_o, s_i s_xor exec, exec, s_o into a single s_andn2_saveexec s_o, s_i instruction. This patch also cleans up the SIOptimizeExecMasking pass a bit. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D129073 show more ...
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 18f93512	19-Nov-2021	RamNalamothu <VenkataRamanaiah.Nalamothu@amd.com>	[AMDGPU] Do not generate ELF symbols for the local branch target labels The compiler was generating symbols in the final code object for local branch target labels. This bloats the code object, slow [AMDGPU] Do not generate ELF symbols for the local branch target labels The compiler was generating symbols in the final code object for local branch target labels. This bloats the code object, slows down the loader, and is only used to simplify disassembly. Use '--symbolize-operands' with llvm-objdump to improve readability of the branch target operands in disassembly. Fixes: SWDEV-312223 Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D114273 show more ...
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# d9b9fdd9	08-Jul-2021	Ruiling Song <ruiling.song@amd.com>	[AMDGPU] Don't handle export done when unify exit nodes This patch aims to revert the changes introduced by D70781 D71192 D76364 D70781 was introduced to fix hardware hang where we do not insert ex [AMDGPU] Don't handle export done when unify exit nodes This patch aims to revert the changes introduced by D70781 D71192 D76364 D70781 was introduced to fix hardware hang where we do not insert exp- null-done for a kill inside infinit loop. At that time we have not added exp-null-done for kill early termination, but I believe as for now, we will always add the exp-null-done for early termination case in LaterBranchLowering. D71192 was introduced to handle the only_kill case, which is also been handled by the kill early termination work. D76364 was used to fix a regression by D71192, where we cleared the done bit of the export in the existing program and not let the normal return block branching to the new unified return block. With this change, we just trust frontends have setup exp-done correctly which is true for all existing frontends. The backend only inserts exp-null-done for the kill cases which is handled in SILateBranchLowering.cpp. Reviewed by: critson Differential Revision: https://reviews.llvm.org/D105610 show more ...
# 1d9585c8	08-Jul-2021	Ruiling Song <ruiling.song@amd.com>	[NFC][AMDGPU] autogenerate kill-infinite-loop.ll checks This would help us to track the assembly changes to these tests. Reviewed by: foad Differential Revision: https://reviews.llvm.org/D105609
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# fe5f4c39	20-Mar-2021	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Rename SIInsertSkips Pass Pass no longer handles skips. Pass now removes unnecessary unconditional branches and lowers early termination branches. Hence rename to SILateBranchLowering. Mo [AMDGPU] Rename SIInsertSkips Pass Pass no longer handles skips. Pass now removes unnecessary unconditional branches and lowers early termination branches. Hence rename to SILateBranchLowering. Move code to handle returns to epilog from SIPreEmitPeephole into SILateBranchLowering. This means SIPreEmitPeephole only contains optional optimisations, and all required transforms are in SILateBranchLowering. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D98915 show more ...
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 4b806473	01-Jan-2021	Roman Lebedev <lebedev.ri@gmail.com>	[AMDGPU][SimplifyCFG] Teach AMDGPUUnifyDivergentExitNodes to preserve {,Post}DomTree This is a (last big?) part of the patch series to make SimplifyCFG preserve DomTree. Currently, it still does not [AMDGPU][SimplifyCFG] Teach AMDGPUUnifyDivergentExitNodes to preserve {,Post}DomTree This is a (last big?) part of the patch series to make SimplifyCFG preserve DomTree. Currently, it still does not actually preserve it, even thought it is pretty much fully updated to preserve it. Once the default is flipped, a valid DomTree must be passed into simplifyCFG, which means that whatever pass calls simplifyCFG, should also be smart about DomTree's. As far as i can see from `check-llvm` with default flipped, this is the last LLVM test batch (other than bugpoint tests) that needed fixes to not break with default flipped. The changes here are boringly identical to the ones i did over 42+ times/commits recently already, so while AMDGPU is outside of my normal ecosystem, i'm going to go for post-commit review here, like in all the other 42+ changes. Note that while the pass is taught to preserve {,Post}DomTree, it still doesn't do that by default, because simplifycfg still doesn't do that by default, and flipping default in this pass will implicitly flip the default for simplifycfg. That will happen, but not right now. show more ...
# b23b1bcc	01-Jan-2021	Roman Lebedev <lebedev.ri@gmail.com>	[NFC][CodeGen][Tests] Mark all tests that fail to preserve DomTree for SimplifyCFG as such These tests start to fail when the SimplifyCFG's default regarding DomTree updating is switched on, so mark [NFC][CodeGen][Tests] Mark all tests that fail to preserve DomTree for SimplifyCFG as such These tests start to fail when the SimplifyCFG's default regarding DomTree updating is switched on, so mark them as needing changes. show more ...
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 42ca2070	03-Jul-2020	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Insert PS early exit at end of control flow Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to e [AMDGPU] Insert PS early exit at end of control flow Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to exits during the insert skips pass. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D82737 show more ...
# 7ec6927b	03-Jul-2020	Carl Ritson <carl.ritson@amd.com>	Revert "[AMDGPU] Insert PS early exit at end of control flow" This reverts commit 2bfcacf0ad362956277a1c2c9ba00ddc453a42ce. There appears to be an issue to analysis preservation.
# 2bfcacf0	03-Jul-2020	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Insert PS early exit at end of control flow Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to e [AMDGPU] Insert PS early exit at end of control flow Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to exits during the insert skips pass. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D82737 show more ...
Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3
# ce06d507	09-Dec-2019	Connor Abbott <cwabbott0@gmail.com>	AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns Summary: The code was assuming in a few places that if there was only one exit from the function that it was a normal return, which i AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns Summary: The code was assuming in a few places that if there was only one exit from the function that it was a normal return, which is invalid. It could be an infinite loop, in which case we still need to insert the usual fake edge so that the null export happens. This fixes shaders that end with an infinite loop that discards. Reviewers: arsenm, nhaehnle, critson Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71192 show more ...
Revision tags: llvmorg-9.0.1-rc2
# 87d98c14	27-Nov-2019	Connor Abbott <cwabbott0@gmail.com>	AMDGPU: Fix handling of infinite loops in fragment shaders Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provab AMDGPU: Fix handling of infinite loops in fragment shaders Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provably infinite loops that are actually supposed to end successfully. The AMDGPUUnifyDivergentExitNodes pass breaks up these loops, but because there's no obvious place to make the loop branch to, it just makes it return immediately, which skips the exports that are supposed to happen at the end and hangs the GPU if all the threads end up being killed. While it would be nice if the fact that kill terminates the thread were modeled in the IR, I think that the structurizer as-is would make a mess if we did that when the kill is inside control flow. For now, we just add a null export at the end to make sure that it always exports something, which fixes the immediate problem without penalizing the more common case. This means that we sometimes do two "done" exports when only some of the threads enter the discard loop, but from tests the hardware seems ok with that. This fixes dEQP-VK.graphicsfuzz.while-inside-switch with radv. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70781 show more ...
# 13ab22ab	29-Jan-2020	Connor Abbott <cwabbott0@gmail.com>	Revert "AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns" This reverts commit 323bfde20c5f3e63db3d6b385b394ed38542abe6.
# 323bfde2	09-Dec-2019	Connor Abbott <cwabbott0@gmail.com>	AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns Summary: The code was assuming in a few places that if there was only one exit from the function that it was a normal return, which i AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns Summary: The code was assuming in a few places that if there was only one exit from the function that it was a normal return, which is invalid. It could be an infinite loop, in which case we still need to insert the usual fake edge so that the null export happens. This fixes shaders that end with an infinite loop that discards. Reviewers: arsenm, nhaehnle, critson Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71192 show more ...
# 0994c485	27-Nov-2019	Connor Abbott <cwabbott0@gmail.com>	AMDGPU: Fix handling of infinite loops in fragment shaders Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provab AMDGPU: Fix handling of infinite loops in fragment shaders Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provably infinite loops that are actually supposed to end successfully. The AMDGPUUnifyDivergentExitNodes pass breaks up these loops, but because there's no obvious place to make the loop branch to, it just makes it return immediately, which skips the exports that are supposed to happen at the end and hangs the GPU if all the threads end up being killed. While it would be nice if the fact that kill terminates the thread were modeled in the IR, I think that the structurizer as-is would make a mess if we did that when the kill is inside control flow. For now, we just add a null export at the end to make sure that it always exports something, which fixes the immediate problem without penalizing the more common case. This means that we sometimes do two "done" exports when only some of the threads enter the discard loop, but from tests the hardware seems ok with that. This fixes dEQP-VK.graphicsfuzz.while-inside-switch with radv. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70781 show more ...