History log of /llvm-project/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h (Results 76 – 100 of 186)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# d8f99bb6 11-Feb-2022 Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
r

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216

show more ...


# aeaf85b9 13-Jan-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Select VGPR versions of MFMA if possible

We can select _vgprcd versions of MAI instructions and have no
AGPRs with the whole budget left for VGPRs if:

1. This is a kernel;
2. It has no cal

[AMDGPU] Select VGPR versions of MFMA if possible

We can select _vgprcd versions of MAI instructions and have no
AGPRs with the whole budget left for VGPRs if:

1. This is a kernel;
2. It has no calls;
3. It runs at least on 2 waves thus having not more that 256 VGPRs.
4. There is no inline asm requesting AGPRs.

Differential Revision: https://reviews.llvm.org/D117253

show more ...


# d6fdbbca 24-Nov-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add second emergency slot for SGPR to vmem for large frames

In a future change, we will sometimes use a VGPR offset for doing
spills to memory, in which case we need 2 free VGPRs to do the S

AMDGPU: Add second emergency slot for SGPR to vmem for large frames

In a future change, we will sometimes use a VGPR offset for doing
spills to memory, in which case we need 2 free VGPRs to do the SGPR
spill. In most cases we could spill the VGPR along with the SGPR being
spilled, but we don't have any free lanes for SGPR_1024 in wave32 so
we could still potentially need a second scavenging slot.

show more ...


# 8470bf2b 12-Jan-2022 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Do not reserve any VGPR for SGPR spills

After the split register allocation changes in eebe841a47cb it is no
longer necessary to reserve a VGPR before RA. This can also create bugs
when IPR

[AMDGPU] Do not reserve any VGPR for SGPR spills

After the split register allocation changes in eebe841a47cb it is no
longer necessary to reserve a VGPR before RA. This can also create bugs
when IPRA is enabled since we cannot predict that a called function may
not reserve any register if it does not have any SGPR spills. If that
happens those functions may override reserved registers that are
normally callee saved. Added a test to show this.

Fixes: SWDEV-309900

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D115551

show more ...


# d45a2479 18-Dec-2021 Brendon Cahoon <brendon.cahoon@amd.com>

[AMDGPU] Don't remove VGPR to AGPR dead spills from frame info

Removing dead frame indices for VGPR to AGPR spills is incorrect
when the frame index is shared by multiple objects, which may
occur du

[AMDGPU] Don't remove VGPR to AGPR dead spills from frame info

Removing dead frame indices for VGPR to AGPR spills is incorrect
when the frame index is shared by multiple objects, which may
occur due to stack slot coloring. The problem is that subsequent
code that processes the other object will assert because the stack
frame index is marked dead.

Removing dead frame indices is needed prior to stack slot
coloring, which is what happens with SGPR to VGPR spills. These
spills are lowered prior to stack slot coloring, but the VGPR
to AGPR spills are processed afterwards during the Prolog/Epilog
Inserter pass. This patch marks the VGPR to AGPR spill slot as
dead if the slot is not used by another object.

Differential Revision: https://reviews.llvm.org/D115996

show more ...


# ca0c92d6 13-Oct-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Allow to use a whole register file on gfx90a for VGPRs

In a kernel which does not have calls or AGPR usage we can allocate
the whole vector register budget for VGPRs and have no AGPRs as
lo

[AMDGPU] Allow to use a whole register file on gfx90a for VGPRs

In a kernel which does not have calls or AGPR usage we can allocate
the whole vector register budget for VGPRs and have no AGPRs as
long as VGPRs stay addressable (i.e. below 256).

Differential Revision: https://reviews.llvm.org/D111764

show more ...


# 48958d02 23-Aug-2021 Daniil Fukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Reduce includes dependencies.

1. Splitted out some parts of R600 target to separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed issue with override `G

[NFC][AMDGPU] Reduce includes dependencies.

1. Splitted out some parts of R600 target to separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()`
and `R600TargetMachine::getSubtargetImpl()` had different return value type
than base class.
4. Minor forward declarations cleanup.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D108596

show more ...


# 98e5ede6 30-Apr-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Serialize MFInfo::ScavengeFI

Serialize ScavengeFI from SIMachineFunctionInfo into yaml.

ScavengeFI is not used outside of the PrologEpilogInserter,
so this shouldn't change anything.

Diff

[AMDGPU] Serialize MFInfo::ScavengeFI

Serialize ScavengeFI from SIMachineFunctionInfo into yaml.

ScavengeFI is not used outside of the PrologEpilogInserter,
so this shouldn't change anything.

Differential Revision: https://reviews.llvm.org/D101367

show more ...


# fcc40d9c 26-Apr-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Use MapVector for WWMReservedRegs

Use MapVector instead of SmallDenseMap because it has a deterministic
iteration order.

Differential Revision: https://reviews.llvm.org/D101299


# 3366d811 23-Apr-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Save WWM registers in functions

The values of registers in inactive lanes needs to be saved during
function calls.

Save all registers used for whole wave mode, similar to how it is done
fo

[AMDGPU] Save WWM registers in functions

The values of registers in inactive lanes needs to be saved during
function calls.

Save all registers used for whole wave mode, similar to how it is done
for VGPRs that are used for SGPR spilling.

Differential Revision: https://reviews.llvm.org/D99429

Reapply with fixed tests on window.

show more ...


# 22d99cb6 23-Apr-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

Revert "[AMDGPU] Save WWM registers in functions"

This reverts commit 91464c30bfcf731ccb7f9d6ef6d26e8c1657a6e6.

Seems to break tests on windows.


# 91464c30 23-Apr-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Save WWM registers in functions

The values of registers in inactive lanes needs to be saved during
function calls.

Save all registers used for whole wave mode, similar to how it is done
fo

[AMDGPU] Save WWM registers in functions

The values of registers in inactive lanes needs to be saved during
function calls.

Save all registers used for whole wave mode, similar to how it is done
for VGPRs that are used for SGPR spilling.

Differential Revision: https://reviews.llvm.org/D99429

show more ...


# f9a8c6a0 12-Apr-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Save VGPR of whole wave when spilling

Spilling SGPRs to scratch uses a temporary VGPR. LLVM currently cannot
determine if a VGPR is used in other lanes or not, so we need to save
all lanes

[AMDGPU] Save VGPR of whole wave when spilling

Spilling SGPRs to scratch uses a temporary VGPR. LLVM currently cannot
determine if a VGPR is used in other lanes or not, so we need to save
all lanes of the VGPR. We even need to save the VGPR if it is marked as
dead.

The generated code depends on two things:
- Can we scavenge an SGPR to save EXEC?
- And can we scavenge a VGPR?

If we can scavenge an SGPR, we
- save EXEC into the SGPR
- set the needed lane mask
- save the temporary VGPR
- write the spilled SGPR into VGPR lanes
- save the VGPR again to the target stack slot
- restore the VGPR
- restore EXEC

If we were not able to scavenge an SGPR, we do the same operations, but
everytime the temporary VGPR is written to memory, we
- write VGPR to memory
- flip exec (s_not exec, exec)
- write VGPR again (previously inactive lanes)

Surprisingly often, we are able to scavenge an SGPR, even though we are
at the brink of running out of SGPRs.
Scavenging a VGPR does not have a great effect (saves three instructions
if no SGPR was scavenged), but we need to know if the VGPR we use is
live before or not, otherwise the machine verifier complains.

Differential Revision: https://reviews.llvm.org/D96336

show more ...


# 2dc6be52 01-Apr-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Update SGPRSpillVGPRCSR name. NFC

The struct is used for both, callee and caller-save registers now.
The frame index is not set for entrypoints, as we do not need to save
the registers then

[AMDGPU] Update SGPRSpillVGPRCSR name. NFC

The struct is used for both, callee and caller-save registers now.
The frame index is not set for entrypoints, as we do not need to save
the registers then.
Update the struct name to reflect that.

Differential Revision: https://reviews.llvm.org/D99722

show more ...


# 8214982b 21-Jan-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Implement mir parseCustomPseudoSourceValue

Allow parsing generated mir with custom pseudo source value tokens.
Also rename pseudo source values to have more meaningful names.

Relands ba7dc

[AMDGPU] Implement mir parseCustomPseudoSourceValue

Allow parsing generated mir with custom pseudo source value tokens.
Also rename pseudo source values to have more meaningful names.

Relands ba7dcd8542ab, which had memory leaks.

Differential Revision: https://reviews.llvm.org/D95215

show more ...


# 4dbdff66 21-Jan-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

Revert "[AMDGPU] Implement mir parseCustomPseudoSourceValue"

This reverts commit ba7dcd8542abfc784255efcb0767701dec42fe83.

(caused memory leaks)


# ba7dcd85 15-Jan-2021 Sebastian Neubauer <sebastian.neubauer@amd.com>

[AMDGPU] Implement mir parseCustomPseudoSourceValue

Allow parsing generated mir with custom pseudo source value tokens.
Also rename pseudo source values to have more meaningful names.

Differential

[AMDGPU] Implement mir parseCustomPseudoSourceValue

Allow parsing generated mir with custom pseudo source value tokens.
Also rename pseudo source values to have more meaningful names.

Differential Revision: https://reviews.llvm.org/D94768

show more ...


# 20566a2e 15-Jan-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add occupancy to serialized MachineFunctionInfo

Not sure about the default value handling, but also not sure
defaulting to a theoretically subtarget dependent value.


# 6a87e9b0 25-Dec-2020 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Reduce include files dependency.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813


# 5257a60e 24-Jun-2020 Michael Liao <michael.hliao@gmail.com>

[amdgpu] Add codegen support for HIP dynamic shared memory.

Summary:
- HIP uses an unsized extern array `extern __shared__ T s[]` to declare
the dynamic shared memory, which size is not known at t

[amdgpu] Add codegen support for HIP dynamic shared memory.

Summary:
- HIP uses an unsized extern array `extern __shared__ T s[]` to declare
the dynamic shared memory, which size is not known at the
compile time.

Reviewers: arsenm, yaxunl, kpyzhov, b-sumner

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82496

show more ...


# 66d60e06 24-Jul-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Serialize MFI spill fields

These should probably be inferred from the function on parse, but the
target specific infrastructure currently does not give you a way to do
this. SILowerSGPRSpill

AMDGPU: Serialize MFI spill fields

These should probably be inferred from the function on parse, but the
target specific infrastructure currently does not give you a way to do
this. SILowerSGPRSpills early exits without this reporting spills,
which makes it difficult to write a MIR test for.

show more ...


# ce76d15a 20-Jul-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use MCRegister for preloaded arguments

Attempt to fix build error with ancient GCC


# f25d020c 05-Jul-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU/GlobalISel: Add types to special inputs

When passing special ABI inputs, we have no existing context for the
type to use.


Revision tags: llvmorg-10.0.1-rc1
# 7c4e711e 21-Apr-2020 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Enable base pointer.

When the callee requires a dynamic stack realignment,
it is not possible to correcty access the incoming
stack arguments using the stack pointer. We reserve a
base poin

[AMDGPU] Enable base pointer.

When the callee requires a dynamic stack realignment,
it is not possible to correcty access the incoming
stack arguments using the stack pointer. We reserve a
base pointer in such cases to access the function arguments
inside the callee. The base pointer will hold the incoming
stack pointer value before any kind of delta added to it.

Reviewed By: arsenm, scott.linder

Differential Revision: https://reviews.llvm.org/D78811

show more ...


# 117e5609 10-Apr-2020 Saiyedul Islam <Saiyedul.Islam@amd.com>

[AMDGPU] Reserving VGPR for future SGPR Spill

Summary: One VGPR register is allocated to handle a future spill of SGPR if "--amdgpu-reserve-vgpr-for-sgpr-spill" option is used

Reviewers: arsenm, ra

[AMDGPU] Reserving VGPR for future SGPR Spill

Summary: One VGPR register is allocated to handle a future spill of SGPR if "--amdgpu-reserve-vgpr-for-sgpr-spill" option is used

Reviewers: arsenm, rampitec, msearles, cdevadas

Reviewed By: arsenm

Subscribers: madhur13490, qcolombet, kerbowa, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #amdgpu, #llvm

Differential Revision: https://reviews.llvm.org/D70379

show more ...


12345678