#
6cfb6427 |
| 19-Oct-2023 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Minor updates to program resource registers (#69525)
- Be explicit about which program resource register is supported by
which target
- RSRC1
- FP16_OVFL is GFX9+
- WGP_MODE is GFX10+
- MEM_ORDERED is GFX10+
- FWD_PROGRESS is GFX10+
- RSRC3
- INST_PREF_SIZE is GFX11+
- TRAP_ON_START is GFX11+
- TRAP_ON_END is GFX11+
- IMAGE_OP is GFX11+
- Do not emit GFX11+ fields when disassembling GFX10 code objects
- Tighten enforcement of reserved bits in disassembler
---------
Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>
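The per-target field list above can be read as a small lookup table. The following is a minimal sketch of that idea, assuming a hypothetical `FieldSupport` table and `isFieldSupported` helper (neither is LLVM API); it only encodes the minimum generations stated in this commit message.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical table for illustration; names and layout are not LLVM's.
struct FieldSupport {
  std::string_view Reg;   // program resource register the field lives in
  std::string_view Field; // field name as listed in the commit message
  unsigned MinGfx;        // first major GFX generation that has the field
};

constexpr FieldSupport SupportTable[] = {
    {"RSRC1", "FP16_OVFL", 9},       {"RSRC1", "WGP_MODE", 10},
    {"RSRC1", "MEM_ORDERED", 10},    {"RSRC1", "FWD_PROGRESS", 10},
    {"RSRC3", "INST_PREF_SIZE", 11}, {"RSRC3", "TRAP_ON_START", 11},
    {"RSRC3", "TRAP_ON_END", 11},    {"RSRC3", "IMAGE_OP", 11},
};

// Returns true if the named field exists on the given major GFX generation.
// Fields that are not in the table are reported as unsupported.
inline bool isFieldSupported(std::string_view Field, unsigned Gfx) {
  for (const FieldSupport &FS : SupportTable)
    if (FS.Field == Field)
      return Gfx >= FS.MinGfx;
  return false;
}
```

Under this model a disassembler for a GFX10 code object would skip `INST_PREF_SIZE`, since `isFieldSupported("INST_PREF_SIZE", 10)` is false.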
|
#
868abf09 |
| 18-Oct-2023 |
pvanhout <pierre.vanhoutryve@amd.com> |
Revert "[AMDGPU] Remove Code Object V3 (#67118)"
This reverts commit 544d91280c26fd5f7acd70eac4d667863562f4cc.
|
#
544d9128 |
| 16-Oct-2023 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Remove Code Object V3 (#67118)
V3 has been deprecated for a while as well, so it can safely be removed
like V2 was removed.
- [Clang] Set minimum code object version to 4
- [lld] Fix tests using code object v3
- Remove code object V3 from the AMDGPU backend, and delete or port v3
tests to v4.
- Update docs to make it clear V3 can no longer be emitted.
|
#
ab6c3d50 |
| 12-Oct-2023 |
Stanislav Mekhanoshin <rampitec@users.noreply.github.com> |
[AMDGPU] Change the representation of double literals in operands (#68740)
A 64-bit literal can be used as a 32-bit zero or sign extended operand.
In the case of a double, zeroes are added to the low 32 bits. Currently the asm
parser stores only the high 32 bits of a double in an operand. To support
codegen as requested by
https://github.com/llvm/llvm-project/issues/67781 we need to change the
representation to store a full 64-bit value so that codegen can simply
add immediates to an instruction.
There is some code to support compatibility with existing tests and asm
kernels. We allow short hex strings that represent only the high 32
bits of a double value to be used as valid literals.
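To make the two representations concrete, here is a minimal sketch of the bit manipulation involved: taking the high 32 bits of a double, and expanding such a short literal back to 64 bits by zero-filling the low half. The helper names are hypothetical and are not the asm parser's functions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; these are not LLVM functions.

// Reinterpret a double as its 64-bit IEEE-754 bit pattern.
inline uint64_t doubleToBits(double D) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");
  uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof(Bits));
  return Bits;
}

// Old-style representation: keep only the high 32 bits of the double.
inline uint32_t high32OfDouble(double D) {
  return static_cast<uint32_t>(doubleToBits(D) >> 32);
}

// Expand a short (32-bit) literal to a full 64-bit value by placing it in the
// high half and zero-filling the low 32 bits.
inline uint64_t expandHigh32Literal(uint32_t Hi) {
  return static_cast<uint64_t>(Hi) << 32;
}

// A double survives the 32-bit form only if its low 32 bits are all zero.
inline bool fitsInHigh32(double D) {
  return (doubleToBits(D) & 0xFFFFFFFFull) == 0;
}
```

For example, `1.0` has the bit pattern `0x3FF0000000000000`, so its short form is `0x3FF00000` and expanding that recovers the full value; `0.1` has non-zero low bits and would need the full 64-bit representation.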
|
Revision tags: llvmorg-17.0.2 |
|
#
2cd2445c |
| 29-Sep-2023 |
Mirko Brkušanin <Mirko.Brkusanin@amd.com> |
[AMDGPU] Src1 of VOP3 DPP instructions can be SGPR on supported subtargets (#67461)
In order to avoid duplicating every dpp pseudo opcode that has src1, we
allow it for all opcodes and add manual checks on subtargets that do not
support it.
|
#
2024bfec |
| 26-Sep-2023 |
Stanislav Mekhanoshin <rampitec@users.noreply.github.com> |
[AMDGPU] Remove int types from isSISrcFPOperand (#67401)
This is NFCI, I don't believe there are any instructions using packed
types in the ins dag, only in patterns, and the affected function is
only used in the asm parser. However, int types shall not be reported as
fp types.
This may be useful if we create an asm syntax for packed fp literals,
which we currently don't have. If/when we do, it will affect whether we
accept FP modifiers on these types. For example, a syntax like
v2(-lit1, |lit2|) would make this matter.
|
#
9310baa5 |
| 25-Sep-2023 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU][NFC] Add True16 operand definitions.
Reviewed By: Joe_Nash
Differential Revision: https://reviews.llvm.org/D156103
|
#
45e425e3 |
| 22-Sep-2023 |
Ruiling, Song <ruiling.song@amd.com> |
AMDGPU: Teach isArgPassedInSGPR() about cs_chain* calling convention (#67086)
The cs_chain and cs_chain_preserve calling conventions use the InReg
attribute to indicate arguments passed in SGPRs.
|
#
fe2f67e4 |
| 21-Sep-2023 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Remove Code Object V2 (#65715)
Code Object V2 has been deprecated for more than a year now. We can
safely remove it from LLVM.
- [clang] Remove support for the `-mcode-object-version=2` option.
- [lld] Remove/refactor tests that were still using COV2
- [llvm] Update AMDGPUUsage.rst
- Code Object V2 docs are left for informational purposes because those
code objects may still be supported by the runtime/loaders for a while.
- [AMDGPU] Remove COV2 emission capabilities.
- [AMDGPU] Remove `MetadataStreamerYamlV2` which was only used by COV2
- [AMDGPU] Update all tests that were still using COV2 - They are either
deleted or ported directly to code object v4 (as v3 is also planned to
be removed soon).
|
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2 |
|
#
69447d6a |
| 02-Aug-2023 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Add ASM and MC updates for preloading kernargs
Add assembler directives for preloading kernel arguments that correspond to new fields in the kernel descriptor for the length and offset of arguments that will be placed in SGPRs prior to kernel launch. Alignment of the arguments in SGPRs is equivalent to the kernarg segment when accessed via the kernarg_segment_ptr. Kernarg SGPRs are allocated directly after other user SGPRs.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D159459
|
#
466a8149 |
| 12-Sep-2023 |
Saiyedul Islam <Saiyedul.Islam@amd.com> |
Revert "[AMDGPU] Make default AMDHSA Code Object Version to be 5 (#65410)" (#66060)
This reverts commit 0a8d17e79b02a92814a2a788d79df1f54d70ec3e.
|
#
0a8d17e7 |
| 12-Sep-2023 |
Saiyedul Islam <Saiyedul.Islam@amd.com> |
[AMDGPU] Make default AMDHSA Code Object Version to be 5 (#65410)
Also update LIT tests and docs.
For more details, see
https://llvm.org/docs/AMDGPUUsage.html#code-object-v5-metadata
Reviewed By: arsenm, jhuber6
Github PR: #65410
Differential Revision: https://reviews.llvm.org/D129818
|
#
cfe9a134 |
| 21-Aug-2023 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Rename 64BitDPP feature and fix the checks
Names '64BitDPP' and especially 'DPP64' were found misleading, and DPP64 can easily be confused with DPP16 and DPP8 although these are different concepts. DPP16 and DPP8 refer to lanes, whereas DPP64 refers to the operand size.
In fact the essential part here is that these instructions are executed on the DP ALU, so rename the feature accordingly.
I have also found a bug in a check for these instructions, which is fixed here and a common utility function is now used.
Differential Revision: https://reviews.llvm.org/D158465
|
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
26dc2844 |
| 30-May-2023 |
Diana Picus <Diana-Magda.Picus@amd.com> |
[AMDGPU] ISel for amdgpu_cs_chain[_preserve] functions
Lower formal arguments and returns for functions with the `amdgpu_cs_chain` and `amdgpu_cs_chain_preserve` calling conventions:
* Put `inreg` arguments into SGPRs, starting at s0, and other arguments into VGPRs, starting at v8. No arguments should end up on the stack; if we don't have enough registers we should error out.
* Lower the return (which is always void) as an S_ENDPGM.
* Set the ScratchRSrc register to s48:51, as described in the docs.
* Set the SP to s32, matching amdgpu_gfx. This might be revisited in a future patch.
Differential Revision: https://reviews.llvm.org/D153517
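The argument placement rule in the first bullet can be modelled in a few lines. This is only a sketch under stated assumptions: every argument occupies exactly one register, and `MaxSGPRs` / `MaxVGPRs` are hypothetical budgets chosen by the caller. It is not the LLVM calling-convention lowering code.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of an argument: only the `inreg` marker matters here.
struct Arg {
  bool InReg;
};

// Assigns `inreg` arguments to s0, s1, ... and all others to v8, v9, ...
// Returns std::nullopt when a register budget is exhausted, mirroring the
// rule that nothing may be placed on the stack.
inline std::optional<std::vector<std::string>>
assignChainArgs(const std::vector<Arg> &Args, unsigned MaxSGPRs,
                unsigned MaxVGPRs) {
  const unsigned FirstVGPR = 8; // non-inreg arguments start at v8
  unsigned NextSGPR = 0;        // inreg arguments start at s0
  unsigned UsedVGPRs = 0;
  std::vector<std::string> Regs;
  for (const Arg &A : Args) {
    if (A.InReg) {
      if (NextSGPR >= MaxSGPRs)
        return std::nullopt; // out of SGPRs: report an error
      Regs.push_back("s" + std::to_string(NextSGPR++));
    } else {
      if (UsedVGPRs >= MaxVGPRs)
        return std::nullopt; // out of VGPRs: report an error
      Regs.push_back("v" + std::to_string(FirstVGPR + UsedVGPRs++));
    }
  }
  return Regs;
}
```

With a budget of four registers of each kind, the sequence inreg, plain, inreg, plain maps to s0, v8, s1, v9 in this model.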
|
#
de82fde2 |
| 18-Aug-2023 |
Mirko Brkusanin <Mirko.Brkusanin@amd.com> |
AMDGPU/Uniformity/GlobalISel: G_AMDGPU atomics are always divergent
Patch by: Acim Maravic
Differential Revision: https://reviews.llvm.org/D157091
|
#
e61ca232 |
| 04-Aug-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add and use SIInstrFlags::GWS. NFC.
This reduces the number of places where we have to check for a list of DS_GWS_* opcodes.
Differential Revision: https://reviews.llvm.org/D157099
|
#
f86c81b2 |
| 26-Jul-2023 |
Reid Kleckner <rnk@google.com> |
[AMDGPU] Avoid CodeGen dependencies from AMDGPU/Utils and MCTargetDesc
This required two substantial changes:
1. Moving a `getRegBitWidth(TargetRegisterClass)` overload out of Utils and into CodeGen
2. Passing the string function name to AMDGPUPALMetadata instead of the MachineFunction
Other changes are minor or updates to accommodate the first two.
See issue #64166 for more information on the layering issue.
Differential Revision: https://reviews.llvm.org/D156486
|
#
a2229511 |
| 15-Jul-2023 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu][nfc] Use unsigned for getIntegerPairAttribute to match the only call sites
|
#
8aedad0f |
| 04-Jul-2023 |
Stephen Thomas <Stephen.Thomas@amd.com> |
[AMDGPU] Add functions for composing and decomposing S_WAIT_DEPCTR operands
Add functions AMDGPU::DepCtr::encodeField*() and AMDGPU::DepCtr::decodeField*() for each of vm_vsrc, va_vdst and sa_sdst. These are now used in AMDGPUInsertDelayAlu and GCNHazardRecognizer so as to make working with S_WAITCNT_DEPCTR operands easier and more readable.
Differential Revision: https://reviews.llvm.org/D154424
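Field helpers of this kind are usually thin wrappers around shift-and-mask arithmetic. The sketch below shows that pattern with a generic `Field` descriptor; the struct, the function signatures, and the bit positions are assumptions made for illustration and should not be read as the real `AMDGPU::DepCtr` API or the hardware layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical field descriptor: bit offset and width within the operand.
struct Field {
  unsigned Shift;
  unsigned Width;
};

// Assumed bit positions, for illustration only.
constexpr Field SaSdst{0, 1};
constexpr Field VmVsrc{2, 3};
constexpr Field VaVdst{12, 4};

// Mask covering the field's bits in the encoded operand.
constexpr uint32_t fieldMask(Field F) {
  return ((1u << F.Width) - 1u) << F.Shift;
}

// Replace one field in an encoded operand, leaving all other bits untouched.
constexpr uint32_t encodeField(uint32_t Encoded, Field F, uint32_t Value) {
  return (Encoded & ~fieldMask(F)) | ((Value << F.Shift) & fieldMask(F));
}

// Extract one field from an encoded operand.
constexpr uint32_t decodeField(uint32_t Encoded, Field F) {
  return (Encoded & fieldMask(F)) >> F.Shift;
}
```

Keeping the encode step as a read-modify-write on the full operand lets a caller start from a default encoding and override one counter at a time.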
|
#
e2903abc |
| 10-Jun-2023 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Remove integer division in VOPD checks
There is no way any compiler can simplify this division, while the check is done rather often.
Differential Revision: https://reviews.llvm.org/D152613
|
#
4e312abd |
| 07-Jun-2023 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU][NFC] Add a getRegBitWidth() helper for TargetRegisterClass operands.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D152257
|
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
|
#
dcb83484 |
| 23-Feb-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Split SIModeRegisterDefaults out of AMDGPUBaseInfo. NFC.
This is only used by CodeGen. Moving it out of AMDGPUBaseInfo simplifies future changes to make some of it depend on the subtarget.
Differential Revision: https://reviews.llvm.org/D144650
|
#
b3dc0e69 |
| 23-Feb-2023 |
Mirko Brkusanin <Mirko.Brkusanin@amd.com> |
[AMDGPU][MC][GFX11] Add Partial NSA format for image sample instructions
Image sample instructions that need more than 5 VGPRs for VAddr can use partial NSA for NSA encoding format. VGPRs that cannot fit into the encoding follow sequentially after the last encoded one. This patch adds assembly and disassembly parts.
Differential Revision: https://reviews.llvm.org/D144033
|
#
c9f4df57 |
| 22-Feb-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Move splitMUBUFOffset from AMDGPUBaseInfo to SIInstrInfo
Moving this out of AMDGPUBaseInfo enforces that AMDGPUBaseInfo should not be calling into GCNSubtarget.
Differential Revision: https://reviews.llvm.org/D144564
|
Revision tags: llvmorg-16.0.0-rc3 |
|
#
432caca3 |
| 18-Feb-2023 |
Fangrui Song <i@maskray.me> |
Simplify with hasFeature. NFC
|