#
d8f99bb6 |
| 11-Feb-2022 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch all cases where hostcall might be in use by a kernel. This is now r
[AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch all cases where hostcall might be in use by a kernel. This is now replaced by a function attribute that gets propagated to top-level kernel functions via their respective call-graph.
If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the default behaviour is to emit kernel metadata indicating that the kernel uses the hostcall buffer pointer passed as an implicit argument.
The attribute may be placed explicitly by the user, or inferred by the AMDGPU attributor by examining the call-graph. The attribute is inferred only if the function is not being sanitized, and the implictarg_ptr does not result in a load of any byte in the hostcall pointer argument.
Reviewed By: jdoerfert, arsenm, kpyzhov
Differential Revision: https://reviews.llvm.org/D119216
show more ...
|
#
aeaf85b9 |
| 13-Jan-2022 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Select VGPR versions of MFMA if possible
We can select _vgprcd versions of MAI instructions and have no AGPRs with the whole budget left for VGPRs if:
1. This is a kernel; 2. It has no cal
[AMDGPU] Select VGPR versions of MFMA if possible
We can select _vgprcd versions of MAI instructions and have no AGPRs with the whole budget left for VGPRs if:
1. This is a kernel; 2. It has no calls; 3. It runs at least on 2 waves thus having not more that 256 VGPRs. 4. There is no inline asm requesting AGPRs.
Differential Revision: https://reviews.llvm.org/D117253
show more ...
|
#
d6fdbbca |
| 24-Nov-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add second emergency slot for SGPR to vmem for large frames
In a future change, we will sometimes use a VGPR offset for doing spills to memory, in which case we need 2 free VGPRs to do the S
AMDGPU: Add second emergency slot for SGPR to vmem for large frames
In a future change, we will sometimes use a VGPR offset for doing spills to memory, in which case we need 2 free VGPRs to do the SGPR spill. In most cases we could spill the VGPR along with the SGPR being spilled, but we don't have any free lanes for SGPR_1024 in wave32 so we could still potentially need a second scavenging slot.
show more ...
|
#
8470bf2b |
| 12-Jan-2022 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Do not reserve any VGPR for SGPR spills
After the split register allocation changes in eebe841a47cb it is no longer necessary to reserve a VGPR before RA. This can also create bugs when IPR
[AMDGPU] Do not reserve any VGPR for SGPR spills
After the split register allocation changes in eebe841a47cb it is no longer necessary to reserve a VGPR before RA. This can also create bugs when IPRA is enabled since we cannot predict that a called function may not reserve any register if it does not have any SGPR spills. If that happens those functions may override reserved registers that are normally callee saved. Added a test to show this.
Fixes: SWDEV-309900
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D115551
show more ...
|
#
d45a2479 |
| 18-Dec-2021 |
Brendon Cahoon <brendon.cahoon@amd.com> |
[AMDGPU] Don't remove VGPR to AGPR dead spills from frame info
Removing dead frame indices for VGPR to AGPR spills is incorrect when the frame index is shared by multiple objects, which may occur du
[AMDGPU] Don't remove VGPR to AGPR dead spills from frame info
Removing dead frame indices for VGPR to AGPR spills is incorrect when the frame index is shared by multiple objects, which may occur due to stack slot coloring. The problem is that subsequent code that processes the other object will assert because the stack frame index is marked dead.
Removing dead frame indices is needed prior to stack slot coloring, which is what happens with SGPR to VGPR spills. These spills are lowered prior to stack slot coloring, but the VGPR to AGPR spills are processed afterwards during the Prolog/Epilog Inserter pass. This patch marks the VGPR to AGPR spill slot as dead if the slot is not used by another object.
Differential Revision: https://reviews.llvm.org/D115996
show more ...
|
#
ca0c92d6 |
| 13-Oct-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Allow to use a whole register file on gfx90a for VGPRs
In a kernel which does not have calls or AGPR usage we can allocate the whole vector register budget for VGPRs and have no AGPRs as lo
[AMDGPU] Allow to use a whole register file on gfx90a for VGPRs
In a kernel which does not have calls or AGPR usage we can allocate the whole vector register budget for VGPRs and have no AGPRs as long as VGPRs stay addressable (i.e. below 256).
Differential Revision: https://reviews.llvm.org/D111764
show more ...
|
#
48958d02 |
| 23-Aug-2021 |
Daniil Fukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce includes dependencies.
1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `G
[NFC][AMDGPU] Reduce includes dependencies.
1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had different return value type than base class. 4. Minor forward declarations cleanup.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D108596
show more ...
|
#
98e5ede6 |
| 30-Apr-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Serialize MFInfo::ScavengeFI
Serialize ScavengeFI from SIMachineFunctionInfo into yaml.
ScavengeFI is not used outside of the PrologEpilogInserter, so this shouldn't change anything.
Diff
[AMDGPU] Serialize MFInfo::ScavengeFI
Serialize ScavengeFI from SIMachineFunctionInfo into yaml.
ScavengeFI is not used outside of the PrologEpilogInserter, so this shouldn't change anything.
Differential Revision: https://reviews.llvm.org/D101367
show more ...
|
#
fcc40d9c |
| 26-Apr-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Use MapVector for WWMReservedRegs
Use MapVector instead of SmallDenseMap because it has a deterministic iteration order.
Differential Revision: https://reviews.llvm.org/D101299
|
#
3366d811 |
| 23-Apr-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Save WWM registers in functions
The values of registers in inactive lanes needs to be saved during function calls.
Save all registers used for whole wave mode, similar to how it is done fo
[AMDGPU] Save WWM registers in functions
The values of registers in inactive lanes needs to be saved during function calls.
Save all registers used for whole wave mode, similar to how it is done for VGPRs that are used for SGPR spilling.
Differential Revision: https://reviews.llvm.org/D99429
Reapply with fixed tests on window.
show more ...
|
#
22d99cb6 |
| 23-Apr-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
Revert "[AMDGPU] Save WWM registers in functions"
This reverts commit 91464c30bfcf731ccb7f9d6ef6d26e8c1657a6e6.
Seems to break tests on windows.
|
#
91464c30 |
| 23-Apr-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Save WWM registers in functions
The values of registers in inactive lanes needs to be saved during function calls.
Save all registers used for whole wave mode, similar to how it is done fo
[AMDGPU] Save WWM registers in functions
The values of registers in inactive lanes needs to be saved during function calls.
Save all registers used for whole wave mode, similar to how it is done for VGPRs that are used for SGPR spilling.
Differential Revision: https://reviews.llvm.org/D99429
show more ...
|
#
f9a8c6a0 |
| 12-Apr-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Save VGPR of whole wave when spilling
Spilling SGPRs to scratch uses a temporary VGPR. LLVM currently cannot determine if a VGPR is used in other lanes or not, so we need to save all lanes
[AMDGPU] Save VGPR of whole wave when spilling
Spilling SGPRs to scratch uses a temporary VGPR. LLVM currently cannot determine if a VGPR is used in other lanes or not, so we need to save all lanes of the VGPR. We even need to save the VGPR if it is marked as dead.
The generated code depends on two things: - Can we scavenge an SGPR to save EXEC? - And can we scavenge a VGPR?
If we can scavenge an SGPR, we - save EXEC into the SGPR - set the needed lane mask - save the temporary VGPR - write the spilled SGPR into VGPR lanes - save the VGPR again to the target stack slot - restore the VGPR - restore EXEC
If we were not able to scavenge an SGPR, we do the same operations, but everytime the temporary VGPR is written to memory, we - write VGPR to memory - flip exec (s_not exec, exec) - write VGPR again (previously inactive lanes)
Surprisingly often, we are able to scavenge an SGPR, even though we are at the brink of running out of SGPRs. Scavenging a VGPR does not have a great effect (saves three instructions if no SGPR was scavenged), but we need to know if the VGPR we use is live before or not, otherwise the machine verifier complains.
Differential Revision: https://reviews.llvm.org/D96336
show more ...
|
#
2dc6be52 |
| 01-Apr-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Update SGPRSpillVGPRCSR name. NFC
The struct is used for both, callee and caller-save registers now. The frame index is not set for entrypoints, as we do not need to save the registers then
[AMDGPU] Update SGPRSpillVGPRCSR name. NFC
The struct is used for both, callee and caller-save registers now. The frame index is not set for entrypoints, as we do not need to save the registers then. Update the struct name to reflect that.
Differential Revision: https://reviews.llvm.org/D99722
show more ...
|
#
8214982b |
| 21-Jan-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Implement mir parseCustomPseudoSourceValue
Allow parsing generated mir with custom pseudo source value tokens. Also rename pseudo source values to have more meaningful names.
Relands ba7dc
[AMDGPU] Implement mir parseCustomPseudoSourceValue
Allow parsing generated mir with custom pseudo source value tokens. Also rename pseudo source values to have more meaningful names.
Relands ba7dcd8542ab, which had memory leaks.
Differential Revision: https://reviews.llvm.org/D95215
show more ...
|
#
4dbdff66 |
| 21-Jan-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
Revert "[AMDGPU] Implement mir parseCustomPseudoSourceValue"
This reverts commit ba7dcd8542abfc784255efcb0767701dec42fe83.
(caused memory leaks)
|
#
ba7dcd85 |
| 15-Jan-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Implement mir parseCustomPseudoSourceValue
Allow parsing generated mir with custom pseudo source value tokens. Also rename pseudo source values to have more meaningful names.
Differential
[AMDGPU] Implement mir parseCustomPseudoSourceValue
Allow parsing generated mir with custom pseudo source value tokens. Also rename pseudo source values to have more meaningful names.
Differential Revision: https://reviews.llvm.org/D94768
show more ...
|
#
20566a2e |
| 15-Jan-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add occupancy to serialized MachineFunctionInfo
Not sure about the default value handling, but also not sure defaulting to a theoretically subtarget dependent value.
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
#
5257a60e |
| 24-Jun-2020 |
Michael Liao <michael.hliao@gmail.com> |
[amdgpu] Add codegen support for HIP dynamic shared memory.
Summary: - HIP uses an unsized extern array `extern __shared__ T s[]` to declare the dynamic shared memory, which size is not known at t
[amdgpu] Add codegen support for HIP dynamic shared memory.
Summary: - HIP uses an unsized extern array `extern __shared__ T s[]` to declare the dynamic shared memory, which size is not known at the compile time.
Reviewers: arsenm, yaxunl, kpyzhov, b-sumner
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82496
show more ...
|
#
66d60e06 |
| 24-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Serialize MFI spill fields
These should probably be inferred from the function on parse, but the target specific infrastructure currently does not give you a way to do this. SILowerSGPRSpill
AMDGPU: Serialize MFI spill fields
These should probably be inferred from the function on parse, but the target specific infrastructure currently does not give you a way to do this. SILowerSGPRSpills early exits without this reporting spills, which makes it difficult to write a MIR test for.
show more ...
|
#
ce76d15a |
| 20-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use MCRegister for preloaded arguments
Attempt to fix build error with ancient GCC
|
#
f25d020c |
| 05-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU/GlobalISel: Add types to special inputs
When passing special ABI inputs, we have no existing context for the type to use.
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
7c4e711e |
| 21-Apr-2020 |
Christudasan Devadasan <Christudasan.Devadasan@amd.com> |
[AMDGPU] Enable base pointer.
When the callee requires a dynamic stack realignment, it is not possible to correcty access the incoming stack arguments using the stack pointer. We reserve a base poin
[AMDGPU] Enable base pointer.
When the callee requires a dynamic stack realignment, it is not possible to correcty access the incoming stack arguments using the stack pointer. We reserve a base pointer in such cases to access the function arguments inside the callee. The base pointer will hold the incoming stack pointer value before any kind of delta added to it.
Reviewed By: arsenm, scott.linder
Differential Revision: https://reviews.llvm.org/D78811
show more ...
|
#
117e5609 |
| 10-Apr-2020 |
Saiyedul Islam <Saiyedul.Islam@amd.com> |
[AMDGPU] Reserving VGPR for future SGPR Spill
Summary: One VGPR register is allocated to handle a future spill of SGPR if "--amdgpu-reserve-vgpr-for-sgpr-spill" option is used
Reviewers: arsenm, ra
[AMDGPU] Reserving VGPR for future SGPR Spill
Summary: One VGPR register is allocated to handle a future spill of SGPR if "--amdgpu-reserve-vgpr-for-sgpr-spill" option is used
Reviewers: arsenm, rampitec, msearles, cdevadas
Reviewed By: arsenm
Subscribers: madhur13490, qcolombet, kerbowa, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #amdgpu, #llvm
Differential Revision: https://reviews.llvm.org/D70379
show more ...
|