SILowerControlFlow.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# 2bfcacf0	03-Jul-2020	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Insert PS early exit at end of control flow Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to e [AMDGPU] Insert PS early exit at end of control flow Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to exits during the insert skips pass. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D82737 show more ...
Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 12a32439	07-Apr-2020	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] Limit endcf-collapase to simple if We can only collapse adjacent SI_END_CF if outer statement belongs to a simple SI_IF, otherwise correct mask is not in the register we expect, but is an a [AMDGPU] Limit endcf-collapase to simple if We can only collapse adjacent SI_END_CF if outer statement belongs to a simple SI_IF, otherwise correct mask is not in the register we expect, but is an argument of an S_XOR instruction. Even if SI_IF is simple it might be lowered using S_XOR because lowering is dependent on a basic block layout. It is not considered simple if instruction consuming its output is not an SI_END_CF. Since that SI_END_CF might have already been lowered to an S_OR isSimpleIf() check may return false. This situation is an opportunity for a further optimization of SI_IF lowering, but that is a separate optimization. In the meanwhile move SI_END_CF post the lowering when we already know how the rest of the CFG was lowered since a non-simple SI_IF case still needs to be handled. Differential Revision: https://reviews.llvm.org/D77610 show more ...
# ddd2f4b9	06-Apr-2020	Jay Foad <jay.foad@amd.com>	[AMDGPU] Fix inaccurate comments
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5
# c262b69d	13-Mar-2020	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] Fix endcf collapse Only collapse inner endcf if the outer one belongs to SI_IF. If it does belong to SI_ELSE then mask being restored in fact a partial inverse of what we need. Differentia [AMDGPU] Fix endcf collapse Only collapse inner endcf if the outer one belongs to SI_IF. If it does belong to SI_ELSE then mask being restored in fact a partial inverse of what we need. Differential Revision: https://reviews.llvm.org/D76154 show more ...
# 32e90cbc	13-Mar-2020	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] Disable endcf collapse There are some functional regressions and I suspect our scopes are not as perfectly enclosed as I expected. Disable it for now. Differential Revision: https://review [AMDGPU] Disable endcf collapse There are some functional regressions and I suspect our scopes are not as perfectly enclosed as I expected. Disable it for now. Differential Revision: https://reviews.llvm.org/D76148 show more ...
Revision tags: llvmorg-10.0.0-rc4
# 360aff04	11-Mar-2020	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] Simplify nested SI_END_CF This is to replace the optimization from the SIOptimizeExecMaskingPreRA. We have less opportunities in the control flow lowering because many VGPR copies are still [AMDGPU] Simplify nested SI_END_CF This is to replace the optimization from the SIOptimizeExecMaskingPreRA. We have less opportunities in the control flow lowering because many VGPR copies are still in place and will be removed later, but we know for sure an instruction is SI_END_CF and not just an arbitrary S_OR_B64 with EXEC. The subsequent change needs to convert s_and_saveexec into s_and and address new TODO lines in tests, then code block guarded by the -amdgpu-remove-redundant-endcf option in the pre-RA exec mask optimizer will be removed. Differential Revision: https://reviews.llvm.org/D76033 show more ...
Revision tags: llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init
# 6e177082	27-Dec-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses Reverts part of 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f. Since that commit, the expansion was ignoring the actual save exec reg AMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses Reverts part of 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f. Since that commit, the expansion was ignoring the actual save exec register produced by the instruction, and looking at other instructions. I do not understand why it was looking at other instructions, but relying on this scan was wrong. Fixes verifier errors after SI_IF is tail duplicated, which should be correct to do. The results were fed into a phi, which was lowered to the S_MOV_B64_term instructions. show more ...
# e53a9d96	22-Jan-2020	cdevadas <cdevadas@amd.com>	Resubmit: [AMDGPU] Invert the handling of skip insertion. The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to h Resubmit: [AMDGPU] Invert the handling of skip insertion. The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a sideeffect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip Pass. Differential revision: https://reviews.llvm.org/D68092 show more ...
# a80291ce	21-Jan-2020	Nicolai Hähnle <nicolai.haehnle@amd.com>	Revert "[AMDGPU] Invert the handling of skip insertion." This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd. The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for Me Revert "[AMDGPU] Invert the handling of skip insertion." This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd. The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for Mesa. show more ...
# 0dc6c249	10-Jan-2020	cdevadas <cdevadas@amd.com>	[AMDGPU] Invert the handling of skip insertion. The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an opt [AMDGPU] Invert the handling of skip insertion. The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a sideeffect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip Pass. Differential revision: https://reviews.llvm.org/D68092 show more ...
Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 008e65a7	18-Nov-2019	vpykhtin <valery.pykhtin@gmail.com>	[AMDGPU] Fix emitIfBreak CF lowering: use temp reg to make register coalescer life easier. Differential revision: https://reviews.llvm.org/D70405
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6
# 6524a7a2	17-Sep-2019	Alexander Timofeev <Alexander.Timofeev@amd.com>	[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed Defferential Revision: https://reviews.llvm.org/D67101 Reviewers: rampitec, vpykhtin llvm-svn: 372086
# 9ff70132	13-Sep-2019	Alexander Timofeev <Alexander.Timofeev@amd.com>	Revert for: [AMDGPU]: PHI Elimination hooks added for custom COPY insertion. llvm-svn: 371873
Revision tags: llvmorg-9.0.0-rc5
# c2d292f8	10-Sep-2019	Alexander Timofeev <Alexander.Timofeev@amd.com>	[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Reviewers: rampitec, vpykhtin Differential Revision: https://reviews.llvm.org/D67101 llvm-svn: 371508
Revision tags: llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3
# 4b7fc85c	20-Aug-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	Revert "AMDGPU: Fix iterator error when lowering SI_END_CF" This reverts r367500 and r369203. This is causing various test failures. llvm-svn: 369417
# 479f3bdb	18-Aug-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix iterator error when lowering SI_END_CF If the instruction is the last in the block, there is no next instruction but the iteration still needs to look at the new block. llvm-svn: 369203
# 0c476111	15-Aug-2019	Daniel Sanders <daniel_l_sanders@apple.com>	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Re Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041 show more ...
Revision tags: llvmorg-9.0.0-rc2
# 2bea69bf	01-Aug-2019	Daniel Sanders <daniel_l_sanders@apple.com>	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633
# d48324ff	01-Aug-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	Reapply "AMDGPU: Split block for si_end_cf" This reverts commit r359363, reapplying r357634 llvm-svn: 367500
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# e3a676e9	24-Jun-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set neede CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191 show more ...
# 52500216	16-Jun-2019	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] gfx10 conditional registers handling This is cpp source part of wave32 support, excluding overriden getRegClass(). Differential Revision: https://reviews.llvm.org/D63351 llvm-svn: 363513
Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 76c5b629	27-Apr-2019	Mark Searles <m.c.searles@gmail.com>	Revert "AMDGPU: Split block for si_end_cf" This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4. We discovered some internal test failures, so reverting for now. Differential Revision: htt Revert "AMDGPU: Split block for si_end_cf" This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4. We discovered some internal test failures, so reverting for now. Differential Revision: https://reviews.llvm.org/D61213 llvm-svn: 359363 show more ...
# 396653f8	03-Apr-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Split block for si_end_cf Relying on no spill or other code being inserted before this was precarious. It relied on code diligently checking isBasicBlockPrologue which is likely to be forgot AMDGPU: Split block for si_end_cf Relying on no spill or other code being inserted before this was precarious. It relied on code diligently checking isBasicBlockPrologue which is likely to be forgotten. Ideally this could be done earlier, but this doesn't work because of phis. Any other instruction can't be placed before them, so we have to accept the position being incorrect during SSA. This avoids regressions in the fast register allocator rewrite from inverting the direction. llvm-svn: 357634 show more ...
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4
# 87039773	05-Mar-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Preserve undef flag when expanding SI_IF Fixes undefined value verifier error. llvm-svn: 355426
Revision tags: llvmorg-8.0.0-rc3
# 476e26b5	22-Feb-2019	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Use removeAllRegUnitsForPhysReg llvm-svn: 354686
1 234 5 6