Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
#
c8e5aef1 |
| 04-May-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Use standard MachineBasicBlock::getFallThrough method. NFCI.
Differential Revision: https://reviews.llvm.org/D101825
|
#
a129932b |
| 18-Oct-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add link to bug
|
#
36deb9a6 |
| 15-Oct-2021 |
Jay Foad <jay.foad@amd.com> |
Add new MachineFunction property FailsVerification
TargetPassConfig::addPass takes a "bool verifyAfter" argument which lets you skip machine verification after a particular pass. Unfortunately this
Add new MachineFunction property FailsVerification
TargetPassConfig::addPass takes a "bool verifyAfter" argument which lets you skip machine verification after a particular pass. Unfortunately this is used in generic code in TargetPassConfig itself to skip verification after a generic pass, only because some previous target- specific pass damaged the MIR on that specific target. This is bad because problems in one target cause lack of verification for all targets.
This patch replaces that mechanism with a new MachineFunction property called "FailsVerification" which can be set by (usually target-specific) passes that are known to introduce problems. Later passes can reset it again if they are known to clean up the previous problems.
Differential Revision: https://reviews.llvm.org/D111397
show more ...
|
#
bacddf47 |
| 14-Oct-2021 |
Michael Liao <michael.hliao@gmail.com> |
[amdgpu] Fix a crash case when preserving MDT in SILowerControlFlow
- When a redundant MBB is being erased from MDT, check whether its single successor is dominiated by it. If yes, update that suc
[amdgpu] Fix a crash case when preserving MDT in SILowerControlFlow
- When a redundant MBB is being erased from MDT, check whether its single successor is dominiated by it. If yes, update that successor's idom before erasing MBB; otherwise, it implies MBB is a leaf node and could be erased directly.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D111831
show more ...
|
#
e996cf7d |
| 07-Oct-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Preserve MachineDominatorTree in SILowerControlFlow
Updating the MachineDominatorTree is easy since SILowerControlFlow only splits and removes basic blocks. This should save a bit of compil
[AMDGPU] Preserve MachineDominatorTree in SILowerControlFlow
Updating the MachineDominatorTree is easy since SILowerControlFlow only splits and removes basic blocks. This should save a bit of compile time because previously we would recompute the dominator tree from scratch after this pass.
Another reason for doing this is that SILowerControlFlow preserves LiveIntervals which transitively requires MachineDominatorTree. I think that means that SILowerControlFlow is obliged to preserve MachineDominatorTree too as explained here: https://lists.llvm.org/pipermail/llvm-dev/2020-November/146923.html although it does not seem to have caused any problems in practice yet.
Differential Revision: https://reviews.llvm.org/D111313
show more ...
|
#
9d72c0ad |
| 06-Jul-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Mark waterfall loops as SI_WATERFALL_LOOP
This way, they can be detected later, e.g. by the SIOptimizeVGPRLiveRange pass.
Differential Revision: https://reviews.llvm.org/D105467
|
#
c9d747e9 |
| 06-Jul-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove outdated comment and tidy up. NFC.
This was left over from D94746.
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
#
aef781b4 |
| 14-Feb-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Add llvm.amdgcn.wqm.demote intrinsic
Add intrinsic which demotes all active lanes to helper lanes. This is used to implement demote to helper Vulkan extension.
In practice demoting a lane
[AMDGPU] Add llvm.amdgcn.wqm.demote intrinsic
Add intrinsic which demotes all active lanes to helper lanes. This is used to implement demote to helper Vulkan extension.
In practice demoting a lane to helper simply means removing it from the mask of live lanes used for WQM/WWM/Exact mode. Where the shader does not use WQM, demotes just become kills.
Additionally add llvm.amdgcn.live.mask intrinsic to complement demote operations. In theory llvm.amdgcn.ps.live can be used to detect helper lanes; however, ps.live can be moved by LICM. The movement of ps.live cannot be remedied without changing its type signature and such a change would require ps.live users to update as well.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D94747
show more ...
|
#
c16f7760 |
| 10-Feb-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Move kill lowering to WQM pass and add live mask tracking
Move implementation of kill intrinsics to WQM pass. Add live lane tracking by updating a stored exec mask when lanes are killed. Us
[AMDGPU] Move kill lowering to WQM pass and add live mask tracking
Move implementation of kill intrinsics to WQM pass. Add live lane tracking by updating a stored exec mask when lanes are killed. Use live lane tracking to enable early termination of shader at any point in control flow.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D94746
show more ...
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init |
|
#
a80ebd01 |
| 24-Jan-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Fix llvm.amdgcn.init.exec and frame materialization
Frame-base materialization may insert vector instructions before EXEC is initialised. Fix this by moving lowering of llvm.amdgcn.init.exe
[AMDGPU] Fix llvm.amdgcn.init.exec and frame materialization
Frame-base materialization may insert vector instructions before EXEC is initialised. Fix this by moving lowering of llvm.amdgcn.init.exec later in backend. Also remove SI_INIT_EXEC_LO pseudo as this is not necessary.
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D94645
show more ...
|
Revision tags: llvmorg-11.1.0-rc2 |
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
Revision tags: llvmorg-11.1.0-rc1 |
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
#
0e219b64 |
| 03-Jan-2021 |
Kazu Hirata <kazu@google.com> |
[Target] Construct SmallVector with iterator ranges (NFC)
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
#
a4f7e426 |
| 28-Oct-2020 |
alex-t <alexander.timofeev@amd.com> |
[AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice
Detailed description: This change addresses the refactoring adviced by foad. It also co
[AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice
Detailed description: This change addresses the refactoring adviced by foad. It also contain the fix for the case when getNextNode is null if the successor block is the last in MachineFunction.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D90314
show more ...
|
#
be2afbd0 |
| 20-Oct-2020 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Remove fix up operand from SI_ELSE
Remove immediate operand from SI_ELSE which indicates if EXEC has been modified. Instead always emit code that handles EXEC and remove unnecessary instru
[AMDGPU] Remove fix up operand from SI_ELSE
Remove immediate operand from SI_ELSE which indicates if EXEC has been modified. Instead always emit code that handles EXEC and remove unnecessary instructions during pre-RA optimisation.
This facilitates passes (i.e. SIWholeQuadMode) adding exec mask manipulation post control flow lowering, and pre control flow lower passes do not need to be aware of SI_ELSE handling.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D89644
show more ...
|
#
42ed3881 |
| 14-Oct-2020 |
alex-t <alexander.timofeev@amd.com> |
[AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough
removeMBBifRedundant normally tries to keep predecessors fallthrough when removing redunda
[AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough
removeMBBifRedundant normally tries to keep predecessors fallthrough when removing redundant MBB. It has to change MBBs layout to keep the new successor to immediately follow the predecessor of removed MBB. It only may be allowed in case the new successor itself has no successors to which it fall through.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D89397
show more ...
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
3105d0f8 |
| 11-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Move split block utility to MachineBasicBlock
AMDGPU needs this in several places, so consolidate them here.
|
#
0576f436 |
| 10-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Don't sometimes allow instructions before lowered si_end_cf
Since 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f, this would sometimes not emit the or to exec at the beginning of the block, where
AMDGPU: Don't sometimes allow instructions before lowered si_end_cf
Since 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f, this would sometimes not emit the or to exec at the beginning of the block, where it really has to be. If there is an instruction that defines one of the source operands, split the block and turn the si_end_cf into a terminator.
This avoids regressions when regalloc fast is switched to inserting reloads at the beginning of the block, instead of spills at the end of the block.
In a future change, this should always split the block.
show more ...
|
#
2480a31e |
| 07-Sep-2020 |
alex-t <alexander.timofeev@amd.com> |
[AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block
optimizeEndCF removes EXEC restoring instruction case this instruction is the only one except the branch to the single succ
[AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block
optimizeEndCF removes EXEC restoring instruction case this instruction is the only one except the branch to the single successor and that successor contains EXEC mask restoring instruction that was lowered from END_CF belonging to IF_ELSE. As a result of such optimization we get the basic block with the only one instruction that is a branch to the single successor. In case the control flow can reach such an empty block from S_CBRANCH_EXEZ/EXECNZ it might happen that spill/reload instructions that were inserted later by register allocator are placed under exec == 0 condition and never execute. Removing empty block solves the problem.
This change require further work to re-implement LIS updates. Recently, LIS is always nullptr in this pass. To enable it we need another patch to fix many places across the codegen.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D86634
show more ...
|
#
3c2a7bd2 |
| 02-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove code to handle tied si_else operands
This has not used tied operands for a long time.
|
#
98de0d22 |
| 21-Aug-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Apply llvm-prefer-register-over-unsigned from clang-tidy
|
#
34978602 |
| 20-Aug-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister
... in favour of the isPhysical/isVirtual methods.
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1 |
|
#
2bd72abe |
| 23-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Skip other terminators before inserting s_cbranch_exec[n]z
PHIElimination/createPHISourceCopy inserts non-branch terminators after the control flow pseudo if a successor phi reads a register
AMDGPU: Skip other terminators before inserting s_cbranch_exec[n]z
PHIElimination/createPHISourceCopy inserts non-branch terminators after the control flow pseudo if a successor phi reads a register defined by the control flow pseudo. If this happens, we need to split the expansion of the control flow pseudo to ensure all the branches are after all of the other mask management instructions.
GlobalISel hit this in testscases that happened to be tail duplicated. The original testcase still does not work, since the same problem appears to be present in a later pass.
show more ...
|
Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3 |
|
#
42ca2070 |
| 03-Jul-2020 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Insert PS early exit at end of control flow
Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to e
[AMDGPU] Insert PS early exit at end of control flow
Exit early if the exec mask is zero at the end of control flow. Mark the ends of control flow during control flow lowering and convert these to exits during the insert skips pass.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D82737
show more ...
|
#
7ec6927b |
| 03-Jul-2020 |
Carl Ritson <carl.ritson@amd.com> |
Revert "[AMDGPU] Insert PS early exit at end of control flow"
This reverts commit 2bfcacf0ad362956277a1c2c9ba00ddc453a42ce.
There appears to be an issue to analysis preservation.
|