SIMachineFunctionInfo.h - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# a496c8be	26-Jul-2023	Vitaly Buka <vitalybuka@google.com>	Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" And dependent commits. Details in D150388. This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048 Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" And dependent commits. Details in D150388. This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c. This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e. This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826. This reverts commit b7836d856206ec39509d42529f958c920368166b. No conflicts in the code, few tests had conflicts in autogenerated CHECKs: llvm/test/CodeGen/Thumb2/mve-float32regloops.ll llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll Reviewed By: alexfh Differential Revision: https://reviews.llvm.org/D156381 show more ...
Revision tags: llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5
# 7a98f084	17-May-2023	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs Currently, the custom SGPR spill lowering pass spills SGPRs into physical VGPR lanes and the remaining VGPRs are used by regalloc for vector [AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs Currently, the custom SGPR spill lowering pass spills SGPRs into physical VGPR lanes and the remaining VGPRs are used by regalloc for vector regclass allocation. This imposes many restrictions that we ended up with unsuccessful SGPR spilling when there won't be enough VGPRs and we are forced to spill the leftover into memory during PEI. The custom spill handling during PEI has many edge cases and often breaks the compiler time to time. This patch implements spilling SGPRs into virtual VGPR lanes. Since we now split the register allocation for SGPRs and VGPRs, the virtual registers introduced for the spill lanes would get allocated automatically in the subsequent regalloc invocation for VGPRs. Spill to virtual registers will always be successful, even in the high-pressure situations, and hence it avoids most of the edge cases during PEI. We are now left with only the custom SGPR spills during PEI for special registers like the frame pointer which is an unproblematic case. Differential Revision: https://reviews.llvm.org/D124196 show more ...
# b4a62b1f	07-Jul-2023	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Enable whole wave register copy So far, we haven't exposed the allocation of whole-wave registers to regalloc. We hand-picked them for various whole wave mode operations. With a future patc [AMDGPU] Enable whole wave register copy So far, we haven't exposed the allocation of whole-wave registers to regalloc. We hand-picked them for various whole wave mode operations. With a future patch, we want the allocator to efficiently allocate them rather than using the custom pre-allocation pass. Any liverange split of virtual registers involved in whole-wave operations require the resulting COPY introduced with the split to be performed for all lanes. It isn't implemented in the compiler yet. This patch would identify all such copies and manipulate the exec mask around them to enable all lanes without affecting the value of exec mask elsewhere. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D143762 show more ...
# b78b36e1	07-Jul-2023	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Implement whole wave register spill To reduce the register pressure during allocation, when the allocator spills a virtual register that corresponds to a whole wave mode operation, the spil [AMDGPU] Implement whole wave register spill To reduce the register pressure during allocation, when the allocator spills a virtual register that corresponds to a whole wave mode operation, the spill loads and restores should be activated for all lanes by temporarily flipping all bits in exec register to one just before the spills. It is not implemented in the compiler as of today and this patch enables the necessary support. This is a pre-patch before the SGPR spill to virtual VGPR lanes that would eventually causes the whole wave register spills during allocation. Reviewed By: arsenm, cdevadas Differential Revision: https://reviews.llvm.org/D143759 show more ...
# 853b2a84	29-Jun-2023	Brendon Cahoon <brendon.cahoon@amd.com>	[AMDGPU] Reserve SGPR pair when long branches are present Branch relaxation requires 2 additional SGPRs for AMDGPU to handle the case when an indirect branch target is too far away. The register sca [AMDGPU] Reserve SGPR pair when long branches are present Branch relaxation requires 2 additional SGPRs for AMDGPU to handle the case when an indirect branch target is too far away. The register scavanger may not find available registers, which causes a “did not find scavenging index” assert to occur in assignRegToScavengingIndex. In this patch, we estimate before register allocation whether an indirect branch is likely to be needed, and reserve 2 SGPRs if the branch distance is found to be above a threshold. The distance threshold is an approximation as the exact code size and branch distance are unknown prior to register allocation. Patch by Corbin Robeck. Thanks! Differential Review: https://reviews.llvm.org/D149775 show more ...
# aa144fbe	19-May-2023	Kazu Hirata <kazu@google.com>	[AMDGPU] Fix warnings This patch fixes warnings like: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:711: warning: enumerated and non-enumerated type in conditional expression
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4
# 7ac3ab34	05-Mar-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix missing MIR serialization for PSInputAddr/PSInputEnable Resuming any mir test for a pixel shader would assert in the AsmPrinter.
# 7ada7bbe	15-Mar-2023	Kazu Hirata <kazu@google.com>	[Target] Use *{Set,Map}::contains (NFC)
# dcb83484	23-Feb-2023	Jay Foad <jay.foad@amd.com>	[AMDGPU] Split SIModeRegisterDefaults out of AMDGPUBaseInfo. NFC. This is only used by CodeGen. Moving it out of AMDGPUBaseInfo simplifies future changes to make some of it depend on the subtarget. [AMDGPU] Split SIModeRegisterDefaults out of AMDGPUBaseInfo. NFC. This is only used by CodeGen. Moving it out of AMDGPUBaseInfo simplifies future changes to make some of it depend on the subtarget. Differential Revision: https://reviews.llvm.org/D144650 show more ...
Revision tags: llvmorg-16.0.0-rc3
# 1c9e6238	10-Feb-2023	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Allow architected SGPRs for workgroup IDs Some subtargets use architected SGPRs for workgroup IDs instead of the regular SGPRs. This patch enables the support for the same and is guarded un [AMDGPU] Allow architected SGPRs for workgroup IDs Some subtargets use architected SGPRs for workgroup IDs instead of the regular SGPRs. This patch enables the support for the same and is guarded under the subtarget feature FeatureArchitectedSGPRs. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D143707 show more ...
Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 38818b60	04-Jan-2023	serge-sans-paille <sguelton@mozilla.com>	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0 Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955 show more ...
# 4463badf	06-Dec-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Use DenormalMode type in FP mode tracking This simplies a future patch. The MIR handling should be fixed. We're still printing these in custom MachineFunctionInfo as bools (plus the inverted AMDGPU: Use DenormalMode type in FP mode tracking This simplies a future patch. The MIR handling should be fixed. We're still printing these in custom MachineFunctionInfo as bools (plus the inverted meaning is hard to follow). show more ...
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 69e75ae6	18-Jun-2020	Matt Arsenault <Matthew.Arsenault@amd.com>	CodeGen: Don't lazily construct MachineFunctionInfo This fixes what I consider to be an API flaw I've tripped over multiple times. The point this is constructed isn't well defined, so depending on w CodeGen: Don't lazily construct MachineFunctionInfo This fixes what I consider to be an API flaw I've tripped over multiple times. The point this is constructed isn't well defined, so depending on where this is first called, you can conclude different information based on the MachineFunction. For example, the AMDGPU implementation inspected the MachineFrameInfo on construction for the stack objects and if the frame has calls. This kind of worked in SelectionDAG which visited all allocas up front, but broke in GlobalISel which hasn't visited any of the IR when arguments are lowered. I've run into similar problems before with the MIR parser and trying to make use of other MachineFunction fields, so I think it's best to just categorically disallow dependency on the MachineFunction state in the constructor and to always construct this at the same time as the MachineFunction itself. A missing feature I still could use is a way to access an custom analysis pass on the IR here. show more ...
# a3028239	21-Dec-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	Revert "[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs" This reverts commit 40ba0942e2ab1107f83aa5a0ee5ae2980bf47b1a.
# 40ba0942	14-Apr-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs Currently, the custom SGPR spill lowering pass spills SGPRs into physical VGPR lanes and the remaining VGPRs are used by regalloc for vector [AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs Currently, the custom SGPR spill lowering pass spills SGPRs into physical VGPR lanes and the remaining VGPRs are used by regalloc for vector regclass allocation. This imposes many restrictions that we ended up with unsuccessful SGPR spilling when there won't be enough VGPRs and we are forced to spill the leftover into memory during PEI. The custom spill handling during PEI has many edge cases and often breaks the compiler time to time. This patch implements spilling SGPRs into virtual VGPR lanes. Since we now split the register allocation for SGPRs and VGPRs, the virtual registers introduced for the spill lanes would get allocated automatically in the subsequent regalloc invocation for VGPRs. Spill to virtual registers will always be successful, even in the high-pressure situations, and hence it avoids most of the edge cases during PEI. We are now left with only the custom SGPR spills during PEI for special registers like the frame pointer which isn an unproblematic case. This patch also implements the whole wave spills which might occur if RA spills any live range of virtual registers involved in the whole wave operations. Earlier, we had been hand-picking registers for such machine operands. But now with SGPR spills into virtual VGPR lanes, we are exposing them to the allocator. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D124196 show more ...
# 29247824	30-Sep-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU][SIFrameLowering] Use the right frame register in CSR spills Unlike the callee-saved VGPR spill instructions emitted by `PEI::spillCalleeSavedRegs`, the CS VGPR spills inserted during emitPr [AMDGPU][SIFrameLowering] Use the right frame register in CSR spills Unlike the callee-saved VGPR spill instructions emitted by `PEI::spillCalleeSavedRegs`, the CS VGPR spills inserted during emitPrologue/emitEpilogue require the exec bits flipping to avoid clobbering the inactive lanes of VGPRs used for SGPR spilling. Currently, these spill instructions are referenced from the SP at function entry and when the callee performs a stack realignment, they ended up getting incorrect stack offsets. Even if we try to adjust the offsets, the FP-SP becomes a runtime entity with dynamic stack realignment and the offsets would still be inaccurate. To fix it, use FP as the frame base in the spill instructions whenever the function has FP. The offsets obtained for the CS objects would always be the right values from FP. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134949 show more ...
# 7a72a935	23-Sep-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Preserve only the inactive lanes of scratch vgprs In general, a callee is free to use a scratch register without preserving its previous state. However, the VGPR used for SGPR spilling can [AMDGPU] Preserve only the inactive lanes of scratch vgprs In general, a callee is free to use a scratch register without preserving its previous state. However, the VGPR used for SGPR spilling can potentially have its inactive lanes overwritten by the writelane instructions. When the function returns, it can cause unexpected behavior if the VGPR value is not preserved appropriately. The current scheme to preserve the inactive lanes of such scratch VGPRs is not done rightly. It preserves all lanes and causes the outgoing values (if any) getting overwritten by the epilog restores. It then corrupts the return value. To avoid such situation with scratch VGPRs, this patch ensures we preserve only their inactive lanes. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134526 show more ...
# 20a940f1	18-Aug-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores There is a lot of customization and eventually code duplication in the frame lowering that handles special SGPR spills like the one [AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores There is a lot of customization and eventually code duplication in the frame lowering that handles special SGPR spills like the one needed for the Frame Pointer. Incorporating any additional SGPR spill currently makes it difficult during PEI. This patch introduces a new spill builder to efficiently handle such spill requirements. Various spill methods are special handled using a separate class. Reviewed By: sebastian-ne, scott.linder Differential Revision: https://reviews.llvm.org/D132436 show more ...
# b25b4c0a	13-Apr-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Separate out SGPR spills to VGPR lanes during PEI SILowerSGPRSpills pass handles the lowering of SGPR spills into VGPR lanes. Some SGPR spills are handled later during PEI. There is a commo [AMDGPU] Separate out SGPR spills to VGPR lanes during PEI SILowerSGPRSpills pass handles the lowering of SGPR spills into VGPR lanes. Some SGPR spills are handled later during PEI. There is a common function used in both places to find the free VGPR lane. This patch eliminates that dependency to find the free VGPR by handling it separately for PEI. It is a prerequisite patch for a future work to allow SGPR spills to virtual VGPR lanes during SILowerSGPRSpills. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D124195 show more ...
# af5e5c40	19-Apr-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Add WWM reserved VGPRs to WWMSpills The custom VGPR spills inserted during frame lowering maintain a separate list for WWM reserved registers. Added them into WWMSpills that already tracks [AMDGPU] Add WWM reserved VGPRs to WWMSpills The custom VGPR spills inserted during frame lowering maintain a separate list for WWM reserved registers. Added them into WWMSpills that already tracks such reserved registers. It unifies the spill insertion. Reviewed By: nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D124193 show more ...
# 5692a7e8	13-Jun-2022	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Callee must always spill writelane VGPRs Since the writelane instruction used for SGPR spills can modify inactive lanes, the callee must preserve the VGPR this instruction modifies even if [AMDGPU] Callee must always spill writelane VGPRs Since the writelane instruction used for SGPR spills can modify inactive lanes, the callee must preserve the VGPR this instruction modifies even if it was marked Caller-saved. Reviewed By: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D124192 show more ...
# 67819a72	13-Dec-2022	Fangrui Song <i@maskray.me>	[CodeGen] llvm::Optional => std::optional
# c589730a	05-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	[YAML] Convert Optional to std::optional
# b7f44f7c	29-Nov-2022	Nicolai Hähnle <nicolai.haehnle@amd.com>	AMDGPU: Remove ImagePSV and move images to addrspace 7 Following up on the removal of BufferPSV in commit 43b86bf992 ("AMDGPU: Remove BufferPseudoSourceValue") It is unclear what exactly the right AMDGPU: Remove ImagePSV and move images to addrspace 7 Following up on the removal of BufferPSV in commit 43b86bf992 ("AMDGPU: Remove BufferPseudoSourceValue") It is unclear what exactly the right address space for images should be. They seem morally closest to buffers, so that's what I went with. In practical terms, address space 7 is better than address space 0 because it can't alias with LDS. Differential Revision: https://reviews.llvm.org/D138949 show more ...
# 43b86bf9	25-Nov-2022	Nicolai Hähnle <nicolai.haehnle@amd.com>	AMDGPU: Remove BufferPseudoSourceValue The use of a PSV for buffer intrinsics is misleading because it may be misinterpreted as all buffer intrinsics accessing the same address in memory, which is c AMDGPU: Remove BufferPseudoSourceValue The use of a PSV for buffer intrinsics is misleading because it may be misinterpreted as all buffer intrinsics accessing the same address in memory, which is clearly not true. Instead, build MachineMemOperands without a pointer value but with an address space, so that address space-based alias analysis can still work. There is a lot of test churn because previously address space 4 (constant address space) was used as an address space for buffer intrinsics. This doesn't make much sense and seems to have been an accident -- see the change in AMDGPUTargetMachine::getAddressSpaceForPseudoSourceKind. Differential Revision: https://reviews.llvm.org/D138711 show more ...
123 4 5 6 7 8