History log of /llvm-project/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h (Results 26 – 50 of 186)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# a496c8be 26-Jul-2023 Vitaly Buka <vitalybuka@google.com>

Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"

And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048

Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"

And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.

No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D156381

show more ...


Revision tags: llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5
# 7a98f084 17-May-2023 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs

Currently, the custom SGPR spill lowering pass spills
SGPRs into physical VGPR lanes and the remaining VGPRs
are used by regalloc for vector

[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs

Currently, the custom SGPR spill lowering pass spills
SGPRs into physical VGPR lanes and the remaining VGPRs
are used by regalloc for vector regclass allocation.
This imposes many restrictions that we ended up with
unsuccessful SGPR spilling when there won't be enough
VGPRs and we are forced to spill the leftover into
memory during PEI. The custom spill handling during PEI
has many edge cases and often breaks the compiler time
to time.

This patch implements spilling SGPRs into virtual VGPR
lanes. Since we now split the register allocation for
SGPRs and VGPRs, the virtual registers introduced for
the spill lanes would get allocated automatically in
the subsequent regalloc invocation for VGPRs.

Spill to virtual registers will always be successful,
even in the high-pressure situations, and hence it avoids
most of the edge cases during PEI. We are now left with
only the custom SGPR spills during PEI for special registers
like the frame pointer which is an unproblematic case.

Differential Revision: https://reviews.llvm.org/D124196

show more ...


# b4a62b1f 07-Jul-2023 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Enable whole wave register copy

So far, we haven't exposed the allocation of whole-wave
registers to regalloc. We hand-picked them for various
whole wave mode operations. With a future patc

[AMDGPU] Enable whole wave register copy

So far, we haven't exposed the allocation of whole-wave
registers to regalloc. We hand-picked them for various
whole wave mode operations. With a future patch, we
want the allocator to efficiently allocate them rather
than using the custom pre-allocation pass.

Any liverange split of virtual registers involved in
whole-wave operations require the resulting COPY
introduced with the split to be performed for all
lanes. It isn't implemented in the compiler yet.

This patch would identify all such copies and
manipulate the exec mask around them to enable all
lanes without affecting the value of exec mask
elsewhere.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D143762

show more ...


# b78b36e1 07-Jul-2023 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Implement whole wave register spill

To reduce the register pressure during allocation,
when the allocator spills a virtual register that
corresponds to a whole wave mode operation, the
spil

[AMDGPU] Implement whole wave register spill

To reduce the register pressure during allocation,
when the allocator spills a virtual register that
corresponds to a whole wave mode operation, the
spill loads and restores should be activated for
all lanes by temporarily flipping all bits in exec
register to one just before the spills. It is not
implemented in the compiler as of today and this
patch enables the necessary support.

This is a pre-patch before the SGPR spill to virtual
VGPR lanes that would eventually causes the whole
wave register spills during allocation.

Reviewed By: arsenm, cdevadas

Differential Revision: https://reviews.llvm.org/D143759

show more ...


# 853b2a84 29-Jun-2023 Brendon Cahoon <brendon.cahoon@amd.com>

[AMDGPU] Reserve SGPR pair when long branches are present

Branch relaxation requires 2 additional SGPRs for AMDGPU to handle the
case when an indirect branch target is too far away. The register
sca

[AMDGPU] Reserve SGPR pair when long branches are present

Branch relaxation requires 2 additional SGPRs for AMDGPU to handle the
case when an indirect branch target is too far away. The register
scavanger may not find available registers, which causes a “did not find
scavenging index” assert to occur in assignRegToScavengingIndex.

In this patch, we estimate before register allocation whether an
indirect branch is likely to be needed, and reserve 2 SGPRs if the
branch distance is found to be above a threshold. The distance threshold
is an approximation as the exact code size and branch distance are
unknown prior to register allocation.

Patch by Corbin Robeck. Thanks!

Differential Review: https://reviews.llvm.org/D149775

show more ...


# aa144fbe 19-May-2023 Kazu Hirata <kazu@google.com>

[AMDGPU] Fix warnings

This patch fixes warnings like:

llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:711: warning:
enumerated and non-enumerated type in conditional expression


Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4
# 7ac3ab34 05-Mar-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix missing MIR serialization for PSInputAddr/PSInputEnable

Resuming any mir test for a pixel shader would assert in the AsmPrinter.


# 7ada7bbe 15-Mar-2023 Kazu Hirata <kazu@google.com>

[Target] Use *{Set,Map}::contains (NFC)


# dcb83484 23-Feb-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Split SIModeRegisterDefaults out of AMDGPUBaseInfo. NFC.

This is only used by CodeGen. Moving it out of AMDGPUBaseInfo simplifies
future changes to make some of it depend on the subtarget.

[AMDGPU] Split SIModeRegisterDefaults out of AMDGPUBaseInfo. NFC.

This is only used by CodeGen. Moving it out of AMDGPUBaseInfo simplifies
future changes to make some of it depend on the subtarget.

Differential Revision: https://reviews.llvm.org/D144650

show more ...


Revision tags: llvmorg-16.0.0-rc3
# 1c9e6238 10-Feb-2023 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Allow architected SGPRs for workgroup IDs

Some subtargets use architected SGPRs for workgroup
IDs instead of the regular SGPRs. This patch enables
the support for the same and is guarded un

[AMDGPU] Allow architected SGPRs for workgroup IDs

Some subtargets use architected SGPRs for workgroup
IDs instead of the regular SGPRs. This patch enables
the support for the same and is guarded under the
subtarget feature FeatureArchitectedSGPRs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D143707

show more ...


Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 38818b60 04-Jan-2023 serge-sans-paille <sguelton@mozilla.com>

Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part

Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0

Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part

Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).

Per reviewers' comment, some useless makeArrayRef have been removed in the process.

This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.

Differential Revision: https://reviews.llvm.org/D140955

show more ...


# 4463badf 06-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use DenormalMode type in FP mode tracking

This simplies a future patch. The MIR handling should be fixed. We're
still printing these in custom MachineFunctionInfo as bools (plus the
inverted

AMDGPU: Use DenormalMode type in FP mode tracking

This simplies a future patch. The MIR handling should be fixed. We're
still printing these in custom MachineFunctionInfo as bools (plus the
inverted meaning is hard to follow).

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 69e75ae6 18-Jun-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Don't lazily construct MachineFunctionInfo

This fixes what I consider to be an API flaw I've tripped over
multiple times. The point this is constructed isn't well defined, so
depending on w

CodeGen: Don't lazily construct MachineFunctionInfo

This fixes what I consider to be an API flaw I've tripped over
multiple times. The point this is constructed isn't well defined, so
depending on where this is first called, you can conclude different
information based on the MachineFunction. For example, the AMDGPU
implementation inspected the MachineFrameInfo on construction for the
stack objects and if the frame has calls. This kind of worked in
SelectionDAG which visited all allocas up front, but broke in
GlobalISel which hasn't visited any of the IR when arguments are
lowered.

I've run into similar problems before with the MIR parser and trying
to make use of other MachineFunction fields, so I think it's best to
just categorically disallow dependency on the MachineFunction state in
the constructor and to always construct this at the same time as the
MachineFunction itself.

A missing feature I still could use is a way to access an custom
analysis pass on the IR here.

show more ...


# a3028239 21-Dec-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

Revert "[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs"

This reverts commit 40ba0942e2ab1107f83aa5a0ee5ae2980bf47b1a.


# 40ba0942 14-Apr-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs

Currently, the custom SGPR spill lowering pass spills
SGPRs into physical VGPR lanes and the remaining VGPRs
are used by regalloc for vector

[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs

Currently, the custom SGPR spill lowering pass spills
SGPRs into physical VGPR lanes and the remaining VGPRs
are used by regalloc for vector regclass allocation.
This imposes many restrictions that we ended up with
unsuccessful SGPR spilling when there won't be enough
VGPRs and we are forced to spill the leftover into
memory during PEI. The custom spill handling during PEI
has many edge cases and often breaks the compiler time
to time.

This patch implements spilling SGPRs into virtual VGPR
lanes. Since we now split the register allocation for
SGPRs and VGPRs, the virtual registers introduced for
the spill lanes would get allocated automatically in
the subsequent regalloc invocation for VGPRs.

Spill to virtual registers will always be successful,
even in the high-pressure situations, and hence it avoids
most of the edge cases during PEI. We are now left with
only the custom SGPR spills during PEI for special registers
like the frame pointer which isn an unproblematic case.

This patch also implements the whole wave spills which
might occur if RA spills any live range of virtual registers
involved in the whole wave operations. Earlier, we had
been hand-picking registers for such machine operands.
But now with SGPR spills into virtual VGPR lanes, we are
exposing them to the allocator.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D124196

show more ...


# 29247824 30-Sep-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU][SIFrameLowering] Use the right frame register in CSR spills

Unlike the callee-saved VGPR spill instructions emitted by
`PEI::spillCalleeSavedRegs`, the CS VGPR spills inserted during
emitPr

[AMDGPU][SIFrameLowering] Use the right frame register in CSR spills

Unlike the callee-saved VGPR spill instructions emitted by
`PEI::spillCalleeSavedRegs`, the CS VGPR spills inserted during
emitPrologue/emitEpilogue require the exec bits flipping to avoid
clobbering the inactive lanes of VGPRs used for SGPR spilling.
Currently, these spill instructions are referenced from the SP at
function entry and when the callee performs a stack realignment,
they ended up getting incorrect stack offsets. Even if we try to
adjust the offsets, the FP-SP becomes a runtime entity with dynamic
stack realignment and the offsets would still be inaccurate.

To fix it, use FP as the frame base in the spill instructions
whenever the function has FP. The offsets obtained for the CS
objects would always be the right values from FP.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134949

show more ...


# 7a72a935 23-Sep-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Preserve only the inactive lanes of scratch vgprs

In general, a callee is free to use a scratch register without
preserving its previous state. However, the VGPR used for SGPR
spilling can

[AMDGPU] Preserve only the inactive lanes of scratch vgprs

In general, a callee is free to use a scratch register without
preserving its previous state. However, the VGPR used for SGPR
spilling can potentially have its inactive lanes overwritten by
the writelane instructions. When the function returns, it can
cause unexpected behavior if the VGPR value is not preserved
appropriately.

The current scheme to preserve the inactive lanes of such
scratch VGPRs is not done rightly. It preserves all lanes
and causes the outgoing values (if any) getting overwritten
by the epilog restores. It then corrupts the return value.

To avoid such situation with scratch VGPRs, this patch ensures
we preserve only their inactive lanes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134526

show more ...


# 20a940f1 18-Aug-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores

There is a lot of customization and eventually code duplication in the
frame lowering that handles special SGPR spills like the one

[AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores

There is a lot of customization and eventually code duplication in the
frame lowering that handles special SGPR spills like the one needed for
the Frame Pointer. Incorporating any additional SGPR spill currently
makes it difficult during PEI. This patch introduces a new spill builder
to efficiently handle such spill requirements. Various spill methods are
special handled using a separate class.

Reviewed By: sebastian-ne, scott.linder

Differential Revision: https://reviews.llvm.org/D132436

show more ...


# b25b4c0a 13-Apr-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Separate out SGPR spills to VGPR lanes during PEI

SILowerSGPRSpills pass handles the lowering of SGPR spills
into VGPR lanes. Some SGPR spills are handled later during
PEI. There is a commo

[AMDGPU] Separate out SGPR spills to VGPR lanes during PEI

SILowerSGPRSpills pass handles the lowering of SGPR spills
into VGPR lanes. Some SGPR spills are handled later during
PEI. There is a common function used in both places to find
the free VGPR lane. This patch eliminates that dependency to
find the free VGPR by handling it separately for PEI. It is a
prerequisite patch for a future work to allow SGPR spills to
virtual VGPR lanes during SILowerSGPRSpills.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D124195

show more ...


# af5e5c40 19-Apr-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Add WWM reserved VGPRs to WWMSpills

The custom VGPR spills inserted during frame lowering
maintain a separate list for WWM reserved registers.
Added them into WWMSpills that already tracks

[AMDGPU] Add WWM reserved VGPRs to WWMSpills

The custom VGPR spills inserted during frame lowering
maintain a separate list for WWM reserved registers.
Added them into WWMSpills that already tracks such
reserved registers. It unifies the spill insertion.

Reviewed By: nhaehnle, arsenm

Differential Revision: https://reviews.llvm.org/D124193

show more ...


# 5692a7e8 13-Jun-2022 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Callee must always spill writelane VGPRs

Since the writelane instruction used for SGPR spills can
modify inactive lanes, the callee must preserve the VGPR
this instruction modifies even if

[AMDGPU] Callee must always spill writelane VGPRs

Since the writelane instruction used for SGPR spills can
modify inactive lanes, the callee must preserve the VGPR
this instruction modifies even if it was marked Caller-saved.

Reviewed By: arsenm, nhaehnle

Differential Revision: https://reviews.llvm.org/D124192

show more ...


# 67819a72 13-Dec-2022 Fangrui Song <i@maskray.me>

[CodeGen] llvm::Optional => std::optional


# c589730a 05-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

[YAML] Convert Optional to std::optional


# b7f44f7c 29-Nov-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

AMDGPU: Remove ImagePSV and move images to addrspace 7

Following up on the removal of BufferPSV in commit 43b86bf992 ("AMDGPU:
Remove BufferPseudoSourceValue")

It is unclear what exactly the right

AMDGPU: Remove ImagePSV and move images to addrspace 7

Following up on the removal of BufferPSV in commit 43b86bf992 ("AMDGPU:
Remove BufferPseudoSourceValue")

It is unclear what exactly the right address space for images should be.
They seem morally closest to buffers, so that's what I went with. In
practical terms, address space 7 is better than address space 0 because
it can't alias with LDS.

Differential Revision: https://reviews.llvm.org/D138949

show more ...


# 43b86bf9 25-Nov-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

AMDGPU: Remove BufferPseudoSourceValue

The use of a PSV for buffer intrinsics is misleading because it may be
misinterpreted as all buffer intrinsics accessing the same address in
memory, which is c

AMDGPU: Remove BufferPseudoSourceValue

The use of a PSV for buffer intrinsics is misleading because it may be
misinterpreted as all buffer intrinsics accessing the same address in
memory, which is clearly not true.

Instead, build MachineMemOperands without a pointer value but with an
address space, so that address space-based alias analysis can still
work.

There is a lot of test churn because previously address space 4
(constant address space) was used as an address space for buffer
intrinsics. This doesn't make much sense and seems to have been an
accident -- see the change in
AMDGPUTargetMachine::getAddressSpaceForPseudoSourceKind.

Differential Revision: https://reviews.llvm.org/D138711

show more ...


12345678