Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
b1d42465 |
| 08-Dec-2024 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Fix hidden kernarg preload count inconsistency (#116759)
It is possible that the number of hidden arguments that are selected to
be preloaded in AMDGPULowerKernelArguments and isel can differ. This
isn't an issue with explicit arguments since isel can lower the argument
correctly either way, but with hidden arguments we may have alignment
issues if we try to load these hidden arguments that were added to the
kernel signature.
The reason for the mismatch is that isel reserves an extra synthetic
user SGPR for module LDS.
Instead of teaching lowerFormalArguments how to handle these properly it
makes more sense and is less expensive to fix the mismatch and assert if
we ever run into this issue again. We should never be trying to lower
these in the normal way.
In a future change we probably want to revise how we track "synthetic"
user SGPRs and unify the handling in GCNUserSGPRUsageInfo. Sometimes
synthetic SGPRs are considered user SGPRs and sometimes they are not.
Until then this patch resolves the inconsistency, fixes the bug, and is
otherwise NFC.
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
be187369 |
| 14-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[AMDGPU] Remove unused includes (NFC) (#116154)
Identified with misc-include-cleaner.
|
#
6548b635 |
| 09-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
|
#
ca33649a |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.
|
#
e215a1e2 |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)
|
#
8ee5e19c |
| 30-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Fix @llvm.amdgcn.cs.chain with SGPR args not provably uniform (#114232)
The correct behaviour is to insert a readfirstlane. SelectionDAG was
already doing this in some cases, but not in the general case for chain
calls. GlobalISel was already doing this for return values but not for
arguments.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
428ae0f1 |
| 03-Oct-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Do not tail call if an inreg argument requires waterfalling (#111002)
If we have a divergent value passed to an outgoing inreg argument, the call needs to be executed in a waterfall loop and thus cannot be tail called.
The waterfall handling of arbitrary calls is broken on the selectiondag path, so some of these cases still hit an error later.
I also noticed the argument evaluation code in isEligibleForTailCallOptimization is not correctly accounting for implicit argument assignments. It also seems inreg codegen is generally broken; we are assigning arguments to the reserved private resource descriptor.
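The waterfall loop referred to in this message can be modelled on the host as follows. This is only an illustrative sketch, not backend code: `Wave`, `readFirstLane`, and `waterfall` are hypothetical names, and the wave is reduced to a vector of per-lane values plus an execution mask. It shows why a divergent value may force the operation to be issued more than once, which is incompatible with a tail call that transfers control exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a wave (hypothetical, for illustration only).
struct Wave {
  std::vector<uint32_t> Value; // per-lane, possibly divergent operand
  std::vector<bool> Active;    // execution mask, one bit per lane
};

// Models readfirstlane: the value held by the first active lane.
// Precondition: at least one lane is active.
uint32_t readFirstLane(const Wave &W) {
  for (std::size_t I = 0; I < W.Value.size(); ++I)
    if (W.Active[I])
      return W.Value[I];
  assert(false && "readFirstLane requires an active lane");
  return 0;
}

// Models a waterfall loop: repeatedly pick the first active lane's value,
// issue the operation once with that uniform value, then retire every lane
// that held the same value. Returns the values in issue order.
std::vector<uint32_t> waterfall(Wave W) {
  assert(W.Value.size() == W.Active.size());
  std::vector<uint32_t> Issued;
  auto AnyActive = [&W] {
    for (bool B : W.Active)
      if (B)
        return true;
    return false;
  };
  while (AnyActive()) {
    uint32_t Uniform = readFirstLane(W);
    Issued.push_back(Uniform);
    for (std::size_t I = 0; I < W.Value.size(); ++I)
      if (W.Active[I] && W.Value[I] == Uniform)
        W.Active[I] = false;
  }
  return Issued;
}
```

With lane values {7, 3, 7, 3, 9} and all lanes active, the sketch issues the operation three times (for 7, 3, and 9); a uniform input is issued once.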
|
#
8d13e7b8 |
| 03-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Qualify auto. NFC. (#110878)
Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3 |
|
#
3b9f1839 |
| 13-Aug-2024 |
Kazu Hirata <kazu@google.com> |
[AMDGPU] Use llvm::any_of, llvm::all_of, and llvm::none_of (NFC) (#103007)
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
74b87b02 |
| 16-Jul-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Fix and add namespace closing comments. NFC.
|
#
9df71d76 |
| 28-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
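The shape of this change can be illustrated with a small standalone sketch. The `DataLayout`, `Module`, and `Function` types below are simplified stand-ins written for this example, not the real LLVM classes; only the forwarding pattern is the point.

```cpp
#include <string>
#include <utility>

// Simplified stand-ins for illustration; not the real LLVM classes.
struct DataLayout {
  std::string Desc;
};

class Module {
  DataLayout DL;

public:
  explicit Module(std::string Desc) : DL{std::move(Desc)} {}
  const DataLayout &getDataLayout() const { return DL; }
};

class Function {
  const Module *Parent;

public:
  explicit Function(const Module &M) : Parent(&M) {}
  const Module *getParent() const { return Parent; }

  // Helper that forwards to the parent module, so callers can write
  // F.getDataLayout() instead of F.getParent()->getDataLayout().
  const DataLayout &getDataLayout() const { return Parent->getDataLayout(); }
};
```

Both spellings return the same object; the helper only shortens the call site.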
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
256343a0 |
| 26-Mar-2024 |
Thomas Symalla <5754458+tsymalla@users.noreply.github.com> |
Revert "Update amdgpu_gfx functions to use s0-s3 for inreg SGPR arguments on targets using scratch instructions for stack #78226" (#86273)
Reverts llvm/llvm-project#81394
This reverts commit 3ac243bc0d7922d083af2cf025247b5698556062.
It is not handling RSrc registers s0-s3 correctly. This leads to a
broken test, where it expects s0-s3 as function argument and uses it as
RSrc register as well.
We need to re-visit the patch, but apparently we only want to have s0-s3
as argument registers if we don't need them as RSrc registers.
|
#
3ac243bc |
| 21-Mar-2024 |
SahilPatidar <patidarsahil2001@gmail.com> |
Update amdgpu_gfx functions to use s0-s3 for inreg SGPR arguments on targets using scratch instructions for stack #78226 (#81394)
Resolve #78226
|
Revision tags: llvmorg-18.1.2 |
|
#
ec34699f |
| 18-Mar-2024 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[GlobalISel] convergence control tokens and intrinsics (#67006)
[GlobalISel] Implement convergence control tokens and intrinsics in GMIR
In the IR translator, convert the LLVM token type to LLT::token(), which is an
alias for the s0 type. These show up as implicit uses on convergent operations.
Differential Revision: https://reviews.llvm.org/D158147
|
Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
bc82cfb3 |
| 21-Jan-2024 |
Emma Pilkington <emma.pilkington95@gmail.com> |
[AMDGPU] Add an asm directive to track code_object_version (#76267)
Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the assumed
COV for the rest of the asm file.
This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.
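The new precedence order can be sketched as a small resolver. This is a hedged illustration only: `resolveCodeObjectVersion` and its parameters are hypothetical names, and the fallback `Default` argument is an assumption, since the message does not say what happens when neither source is set.

```cpp
#include <optional>

// Hypothetical helper illustrating the precedence described above:
// a version from the IR flag or asm directive wins over the command-line
// flag. The Default fallback is an assumption made for this sketch.
unsigned resolveCodeObjectVersion(std::optional<unsigned> FromIROrAsm,
                                  std::optional<unsigned> FromCommandLine,
                                  unsigned Default) {
  if (FromIROrAsm)
    return *FromIROrAsm; // IR flag / asm directive takes precedence
  if (FromCommandLine)
    return *FromCommandLine; // CL flag is only consulted as a fallback
  return Default;
}
```

Before this commit the first two checks would have been in the opposite order, with the command-line value winning whenever it was given.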
|
#
af4f1766 |
| 17-Jan-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Allocate special SGPRs before user SGPR arguments (#78234)
|
#
480cc413 |
| 16-Jan-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU/GlobalISel: Handle inreg arguments as SGPRs (#78123)
This is the missing GISel part of
54470176afe20b16e6b026ab989591d1d19ad2b7
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
7f5d59b3 |
| 06-Nov-2023 |
Diana <Diana-Magda.Picus@amd.com> |
[AMDGPU] ISel for @llvm.amdgcn.cs.chain intrinsic (#68186)
The @llvm.amdgcn.cs.chain intrinsic is essentially a call. The call
parameters are bundled up into 2 intrinsic arguments, one for those that
should go in the SGPRs (the 3rd intrinsic argument), and one for those
that should go in the VGPRs (the 4th intrinsic argument). Both will
often be some kind of aggregate.
Both instruction selection frameworks have some internal representation
for intrinsics (G_INTRINSIC[_WITH_SIDE_EFFECTS] for GlobalISel,
ISD::INTRINSIC_[VOID|WITH_CHAIN] for DAGISel), but we can't use those
because aggregates are dissolved very early on during ISel and we'd lose
the inreg information. Therefore, this patch short-circuits both the
IRTranslator and SelectionDAGBuilder to lower this intrinsic as a call
from the very start. It tries to use the existing infrastructure as much
as possible, by calling into the code for lowering tail calls.
This has already gone through a few rounds of review in Phab:
Differential Revision: https://reviews.llvm.org/D153761
|
Revision tags: llvmorg-17.0.4 |
|
#
2f4328e6 |
| 24-Oct-2023 |
Craig Topper <craig.topper@sifive.com> |
[GISel] Make assignValueToReg take CCValAssign by const reference. (#70086)
This was previously passed by value. It used to be passed by non-const
reference, but it was changed to value in D110610. I'm not sure why.
|
#
9f592cbc |
| 24-Oct-2023 |
Craig Topper <craig.topper@sifive.com> |
[GISel] Pass MPO and VA to assignValueToAddress by const reference. NFC (#69810)
Previously they were passed by non-const reference. No in tree target
modifies the values.
This makes it possible to call assignValueToAddress from
assignCustomValue without a const_cast. For example in this patch
https://github.com/llvm/llvm-project/pull/69138.
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
343be513 |
| 19-Aug-2023 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Add utilities to track number of user SGPRs. NFC.
Factor out and unify some common code that calculates and tracks the number of user SGPRs.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D159439
|
#
ef38e6d9 |
| 18-Aug-2023 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[GlobalISel] introduce MIFlag::NoConvergent
Some opcodes in MIR are defined to be convergent by the target by setting isConvergent in the corresponding TD file. For example, in AMDGPU, the opcodes G_SI_CALL and G_INTRINSIC* are marked as convergent. But this is too conservative, since calls to functions that do not execute convergent operations should not be marked convergent. This information is available in LLVM IR.
The new flag MIFlag::NoConvergent now allows the IR translator to mark an instruction as not performing any convergent operations. It is relevant only on occurrences of opcodes that are marked isConvergent in the target.
Differential Revision: https://reviews.llvm.org/D157475
|
Revision tags: llvmorg-17.0.0-rc2 |
|
#
d9847cde |
| 31-Jul-2023 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[GlobalISel] convergent intrinsics
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:
- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS
Out of the targets that currently have some support for GlobalISel, the patch assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154766
|
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
74e928a0 |
| 13-Jul-2023 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu][lds] Remove recalculation of LDS frame from backend
Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.
Prior to this patch: The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol metadata so that the assembler can build lookup tables out of it. There is a fragile association between kernel functions and named structs which is used to recompute the frame layout in the backend, with fatal_errors catching inconsistencies in the second calculation.
After this patch: The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same absolute_symbol metadata that the assembler uses to place objects within that frame size.
Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D155190
|
Revision tags: llvmorg-16.0.6 |
|
#
3d0350b7 |
| 07-Jun-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add MF independent version of getImplicitParameterOffset
|