#
b3f5c724 |
| 12-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Assume true in getVOPNIsSingle helpers (#98516)
If we have something we don't know what it is, we should conservatively
avoid printing an additional suffix. For isCodeGenOnly
pseudoinstruc
AMDGPU: Assume true in getVOPNIsSingle helpers (#98516)
If we have something we don't know what it is, we should conservatively
avoid printing an additional suffix. For isCodeGenOnly
pseudoinstructions,
no encoded instruction is added to the tables this is queried, and the
null
case would assume true.
This happens to fix the case I ran into, but this isn't a wholistic fix.
These really should be encoded directly in the TSFlags of the
MCInstrDesc,
which would allow encoding pseudos to work correctly.
show more ...
|
#
3aef525a |
| 24-Jun-2024 |
vangthao95 <vang.thao@amd.com> |
[AMDGPU] Fix negative immediate offset for unbuffered smem loads (#89165)
For unbuffered smem loads, it is illegal for the immediate offset to be
negative if the resulting IOFFSET + (SGPR[Offset] o
[AMDGPU] Fix negative immediate offset for unbuffered smem loads (#89165)
For unbuffered smem loads, it is illegal for the immediate offset to be
negative if the resulting IOFFSET + (SGPR[Offset] or M0 or zero) is
negative.
New PR of https://github.com/llvm/llvm-project/pull/79553.
show more ...
|
#
85200612 |
| 18-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Support local atomicrmw fmin/fmax for float/double (#95590)
This has always been supported. Somehow, we ended up with 2 copies of clang builtins for this case, and the newer one erroneously
AMDGPU: Support local atomicrmw fmin/fmax for float/double (#95590)
This has always been supported. Somehow, we ended up with 2 copies of clang builtins for this case, and the newer one erroneously requires gfx8-insts.
show more ...
|
Revision tags: llvmorg-18.1.8 |
|
#
4a305d40 |
| 12-Jun-2024 |
Scott Egerton <9487234+ScottEgerton@users.noreply.github.com> |
[AMDGPU] Exclude certain opcodes from being marked as single use (#91802)
The s_singleuse_vdst instruction is used to mark regions of instructions
that produce values that have only one use.
Certa
[AMDGPU] Exclude certain opcodes from being marked as single use (#91802)
The s_singleuse_vdst instruction is used to mark regions of instructions
that produce values that have only one use.
Certain instructions take more than one cycle to execute, resulting in
regions being incorrectly marked.
This patch excludes these multi-cycle instructions from being marked as
either producing single use values or consuming single use values
or both depending on the instruction.
show more ...
|
Revision tags: llvmorg-18.1.7 |
|
#
a699ccbf |
| 22-May-2024 |
Janek van Oirschot <janek.vanoirschot@amd.com> |
MCExpr-ify amd_kernel_code_t (#91587)
Redefines the amd_kernel_code_t struct with MCExprs for members that would be
derived from SIProgramInfo MCExpr members.
|
Revision tags: llvmorg-18.1.6 |
|
#
d86b68af |
| 09-May-2024 |
Janek van Oirschot <5994977+JanekvO@users.noreply.github.com> |
MCExpr-ify SIProgramInfo (#88257)
Convert members in SIProgramInfo affected by variables provided by AMDGPUResourceUsageAnalysis into MCExprs.
|
#
dcc7ef3c |
| 07-May-2024 |
Emma Pilkington <emma.pilkington95@gmail.com> |
[AMDGPU][MC] Disable sendmsg SYSMSG_OP_HOST_TRAP_ACK on gfx9+ (#90203)
This is no longer supported as of gfx9. Fixes #52903
This commit also includes some refactoring of sendmsg operand parsing:
[AMDGPU][MC] Disable sendmsg SYSMSG_OP_HOST_TRAP_ACK on gfx9+ (#90203)
This is no longer supported as of gfx9. Fixes #52903
This commit also includes some refactoring of sendmsg operand parsing:
- Use CustomOperand for sendmsg operations, this allows them to be
conditionally available based on a STI check (and automatically in
sync with SIDefines.h).
- Move CustomOperand table lookups from AMDGPUBaseInfo to
AMDGPUAsmUtils. This cleans up an awkward interface where
AMDGPUAsmUtils defined a table/size as globals that AMDGPUBaseInfo
had to loop over.
- Clean up a few of the operand lookup functions while moving them.
show more ...
|
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
607b4bc6 |
| 03-Apr-2024 |
Emma Pilkington <emma.pilkington95@gmail.com> |
[AMDGPU] Add a missing COV6 case to getAMDHSACodeObjectVersion() (#87492)
|
#
e29228ef |
| 03-Apr-2024 |
Joe Nash <joseph.nash@amd.com> |
[AMDGPU][MC] Allow VOP3C dpp src1 to be imm or SGPR (#87418)
Allows src1 of VOP3 encoded VOPC to be an SGPR or inline immediate on
GFX1150Plus
The w32 and w64 _e64_dpp assembler only real instru
[AMDGPU][MC] Allow VOP3C dpp src1 to be imm or SGPR (#87418)
Allows src1 of VOP3 encoded VOPC to be an SGPR or inline immediate on
GFX1150Plus
The w32 and w64 _e64_dpp assembler only real instructions were unused,
and erroneously constructed in a way that bugged parsing of the new
instructions. They are removed.
This patch is a follow up to PR
https://github.com/llvm/llvm-project/pull/87382
show more ...
|
Revision tags: llvmorg-18.1.3 |
|
#
1103a2a3 |
| 27-Mar-2024 |
Janek van Oirschot <5994977+JanekvO@users.noreply.github.com> |
Reland [AMDGPU] MCExpr-ify MC layer kernel descriptor (#86494)
Kernel descriptor attributes, with their respective emit and asm parse functionality, converted to MCExpr.
Relands #80855 with fixes
|
#
94a550da |
| 25-Mar-2024 |
Mariusz Sikora <mariusz.sikora@amd.com> |
[AMDGPU][NFC] Rename Feature GFX11FullVGPRs to 1_5xVGPRs (#86468)
|
#
75e528fd |
| 25-Mar-2024 |
David Stuttard <david.stuttard@amd.com> |
[AMDGPU] Extend zero initialization of return values for TFE (#85759)
buffer_load instructions that use TFE also need to zero initialize
return values similar to how the image instructions currentl
[AMDGPU] Extend zero initialization of return values for TFE (#85759)
buffer_load instructions that use TFE also need to zero initialize
return values similar to how the image instructions currently work. Add
support for this with standard zero init of all results + zero init of
just TFE flag when enable-prt-strict-null subtarget feature is disabled.
show more ...
|
#
797336b1 |
| 21-Mar-2024 |
Janek van Oirschot <5994977+JanekvO@users.noreply.github.com> |
Revert "[AMDGPU] MCExpr-ify MC layer kernel descriptor" (#86151)
Reverts llvm/llvm-project#80855
|
#
857161c3 |
| 21-Mar-2024 |
Janek van Oirschot <5994977+JanekvO@users.noreply.github.com> |
[AMDGPU] MCExpr-ify MC layer kernel descriptor (#80855)
Kernel descriptor attributes, with their respective emit and asm parse functionality, converted to MCExpr.
|
Revision tags: llvmorg-18.1.2 |
|
#
c29b265e |
| 14-Mar-2024 |
Carl Ritson <carl.ritson@amd.com> |
Reapply "[AMDGPU] Add pal metadata 3.0 support to callable pal funcs (#67104)"
This reverts commit 7d508eb5d38f4bbbab4230a666d9e742e271af61.
|
#
c4e517f5 |
| 12-Mar-2024 |
Jun Wang <jwang86@yahoo.com> |
[AMDGPU] Adding the amdgpu_num_work_groups function attribute (#79035)
A new function attribute named amdgpu_num_work_groups is added. This
attribute, which consists of three integers, allows progr
[AMDGPU] Adding the amdgpu_num_work_groups function attribute (#79035)
A new function attribute named amdgpu_num_work_groups is added. This
attribute, which consists of three integers, allows programmers to let
the compiler know the number of workgroups to be launched in each of the
three dimensions and do optimizations based on that information.
---------
Co-authored-by: Jun Wang <jun.wang7@amd.com>
show more ...
|
#
e963d074 |
| 08-Mar-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Replace `isInlinableLiteral16` with specific version (#84402)
The current implementation of `isInlinableLiteral16` assumes, a 16-bit
inlinable
literal is either an `i16` or a `fp16`. This
[AMDGPU] Replace `isInlinableLiteral16` with specific version (#84402)
The current implementation of `isInlinableLiteral16` assumes, a 16-bit
inlinable
literal is either an `i16` or a `fp16`. This is not always true because
of
`bf16`. However, we can't tell `fp16` and `bf16` apart by just looking
at the
value. This patch splits `isInlinableLiteral16` into three versions,
`i16`,
`fp16`, `bf16` respectively, and call the corresponding version.
show more ...
|
Revision tags: llvmorg-18.1.1 |
|
#
0086cc95 |
| 07-Mar-2024 |
Diana Picus <Diana-Magda.Picus@amd.com> |
[AMDGPU] Rename getNumVGPRBlocks. NFC (#84161)
Rename getNumVGPRBlocks to getEncodedNumVGPRBlocks, to clarify that it's
using the encoding granule. This is used to program the hardware. In
practic
[AMDGPU] Rename getNumVGPRBlocks. NFC (#84161)
Rename getNumVGPRBlocks to getEncodedNumVGPRBlocks, to clarify that it's
using the encoding granule. This is used to program the hardware. In
practice, the hardware will use the alloc granule instead, so this patch
also adds a new helper, getAllocatedNumVGPRBlocks, which can be useful
when driving heuristics.
show more ...
|
#
4490003a |
| 06-Mar-2024 |
Emma Pilkington <emma.pilkington95@gmail.com> |
[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)
The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling
[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)
The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling also matches
the asm directive I added in bc82cfb.
show more ...
|
#
e9c1dbb4 |
| 06-Mar-2024 |
Shilei Tian <i@tianshilei.me> |
Revert "[AMDGPU] Replace `isInlinableLiteral16` with specific version (#81345)"
This reverts commit 530f0e64ec11327879c44f2fd55c7c28efdbaa2d because it breaks downstream.
|
#
530f0e64 |
| 04-Mar-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Replace `isInlinableLiteral16` with specific version (#81345)
|
#
680c780a |
| 28-Feb-2024 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU][AsmParser] Support structured HWREG operands. (#82805)
Symbolic values are to be supported separately.
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
113052b2 |
| 09-Nov-2023 |
Jeffrey Byrnes <Jeffrey.Byrnes@amd.com> |
[AMDGPU] Prefer lower total register usage in regions with spilling
Change-Id: Ia5c434b0945bdcbc357c5e06c3164118fc91df25
|
#
dfa1d9b0 |
| 23-Feb-2024 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU][NFC] Have helpers to deal with encoding fields. (#82772)
These are hoped to provide more convenient and less error prone
facilities to encode and decode fields than manually defined consta
[AMDGPU][NFC] Have helpers to deal with encoding fields. (#82772)
These are hoped to provide more convenient and less error prone
facilities to encode and decode fields than manually defined constants
and functions.
show more ...
|
#
46734aa1 |
| 16-Feb-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Use `bf16` instead of `i16` for bfloat (#80908)
Currently we generally use `i16` to represent `bf16` in those tablegen
files. This patch is trying to use `bf16` directly.
Fix #79369.
|