History log of /llvm-project/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp (Results 1 – 25 of 174)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6
# b1d42465 08-Dec-2024 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Fix hidden kernarg preload count inconsistency (#116759)

It is possible that the number of hidden arguments that are selected to
be preloaded in AMDGPULowerKernel arguments and isel can di

[AMDGPU] Fix hidden kernarg preload count inconsistency (#116759)

It is possible that the number of hidden arguments that are selected to
be preloaded in AMDGPULowerKernel arguments and isel can differ. This
isn't an issue with explicit arguments since isel can lower the argument
correctly either way, but with hidden arguments we may have alignment
issues if we try to load these hidden arguments that were added to the
kernel signature.

The reason for the mismatch is that isel reserves an extra synthetic
user SGPR for module LDS.

Instead of teaching lowerFormalArguments how to handle these properly it
makes more sense and is less expensive to fix the mismatch and assert if
we ever run into this issue again. We should never be trying to lower
these in the normal way.

In a future change we probably want to revise how we track "synthetic"
user SGPRs and unify the handling in GCNUserSGPRUsageInfo. Sometimes
synthetic SGPRSs are considered user SGPRs and sometimes they are not.
Until then this patch resolves the inconsistency, fixes the bug, and is
otherwise a NFC.

show more ...


Revision tags: llvmorg-19.1.5, llvmorg-19.1.4
# be187369 14-Nov-2024 Kazu Hirata <kazu@google.com>

[AMDGPU] Remove unused includes (NFC) (#116154)

Identified with misc-include-cleaner.


# 6548b635 09-Nov-2024 Shilei Tian <i@tianshilei.me>

Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.


# ca33649a 08-Nov-2024 Shilei Tian <i@tianshilei.me>

Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
hip and openmp buildbots.


# e215a1e2 08-Nov-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)


# 8ee5e19c 30-Oct-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Fix @llvm.amdgcn.cs.chain with SGPR args not provably uniform (#114232)

The correct behaviour is to insert a readfirstlane. SelectionDAG was
already doing this in some cases, but not in th

[AMDGPU] Fix @llvm.amdgcn.cs.chain with SGPR args not provably uniform (#114232)

The correct behaviour is to insert a readfirstlane. SelectionDAG was
already doing this in some cases, but not in the general case for chain
calls. GlobalISel was already doing this for return values but not for
arguments.

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 428ae0f1 03-Oct-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Do not tail call if an inreg argument requires waterfalling (#111002)

If we have a divergent value passed to an outgoing inreg argument,
the call needs to be executed in a waterfall loop and

AMDGPU: Do not tail call if an inreg argument requires waterfalling (#111002)

If we have a divergent value passed to an outgoing inreg argument,
the call needs to be executed in a waterfall loop and thus cannot
be tail called.

The waterfall handling of arbitrary calls is broken on the selectiondag
path, so some of these cases still hit an error later.

I also noticed the argument evaluation code in isEligibleForTailCallOptimization
is not correctly accounting for implicit argument assignments. It also seems
inreg codegen is generally broken; we are assigning arguments to the reserved
private resource descriptor.

show more ...


# 8d13e7b8 03-Oct-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Qualify auto. NFC. (#110878)

Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# 3b9f1839 13-Aug-2024 Kazu Hirata <kazu@google.com>

[AMDGPU] Use llvm::any_of, llvm::all_of, and llvm::none_of (NFC) (#103007)


Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 74b87b02 16-Jul-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Fix and add namespace closing comments. NFC.


# 9df71d76 28-Jun-2024 Nikita Popov <npopov@redhat.com>

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, re

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# 256343a0 26-Mar-2024 Thomas Symalla <5754458+tsymalla@users.noreply.github.com>

Revert "Update amdgpu_gfx functions to use s0-s3 for inreg SGPR arguments on targets using scratch instructions for stack #78226" (#86273)

Reverts llvm/llvm-project#81394

This reverts commit 3ac2

Revert "Update amdgpu_gfx functions to use s0-s3 for inreg SGPR arguments on targets using scratch instructions for stack #78226" (#86273)

Reverts llvm/llvm-project#81394

This reverts commit 3ac243bc0d7922d083af2cf025247b5698556062.
It is not handling RSrc registers s0-s3 correctly. This leads to a
broken test, where it expects s0-s3 as function argument and uses it as
RSrc register as well.
We need to re-visit the patch, but apparently we only want to have s0-s3
as
argument registers if we don't need them as RSrc registers.

show more ...


# 3ac243bc 21-Mar-2024 SahilPatidar <patidarsahil2001@gmail.com>

Update amdgpu_gfx functions to use s0-s3 for inreg SGPR arguments on targets using scratch instructions for stack #78226 (#81394)

Resolve #78226


Revision tags: llvmorg-18.1.2
# ec34699f 18-Mar-2024 Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>

[GlobalISel] convergence control tokens and intrinsics (#67006)

[GlobalISel] Implement convergence control tokens and intrinsics in GMIR

In the IR translator, convert the LLVM token type to LLT::

[GlobalISel] convergence control tokens and intrinsics (#67006)

[GlobalISel] Implement convergence control tokens and intrinsics in GMIR

In the IR translator, convert the LLVM token type to LLT::token(), which is an
alias for the s0 type. These show up as implicit uses on convergent operations.

Differential Revision: https://reviews.llvm.org/D158147

show more ...


Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# bc82cfb3 21-Jan-2024 Emma Pilkington <emma.pilkington95@gmail.com>

[AMDGPU] Add an asm directive to track code_object_version (#76267)

Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the ass

[AMDGPU] Add an asm directive to track code_object_version (#76267)

Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the assumed
COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.

show more ...


# af4f1766 17-Jan-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Allocate special SGPRs before user SGPR arguments (#78234)


# 480cc413 16-Jan-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU/GlobalISel: Handle inreg arguments as SGPRs (#78123)

This is the missing GISel part of
54470176afe20b16e6b026ab989591d1d19ad2b7


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5
# 7f5d59b3 06-Nov-2023 Diana <Diana-Magda.Picus@amd.com>

[AMDGPU] ISel for @llvm.amdgcn.cs.chain intrinsic (#68186)

The @llvm.amdgcn.cs.chain intrinsic is essentially a call. The call
parameters are bundled up into 2 intrinsic arguments, one for those th

[AMDGPU] ISel for @llvm.amdgcn.cs.chain intrinsic (#68186)

The @llvm.amdgcn.cs.chain intrinsic is essentially a call. The call
parameters are bundled up into 2 intrinsic arguments, one for those that
should go in the SGPRs (the 3rd intrinsic argument), and one for those
that should go in the VGPRs (the 4th intrinsic argument). Both will
often be some kind of aggregate.

Both instruction selection frameworks have some internal representation
for intrinsics (G_INTRINSIC[_WITH_SIDE_EFFECTS] for GlobalISel,
ISD::INTRINSIC_[VOID|WITH_CHAIN] for DAGISel), but we can't use those
because aggregates are dissolved very early on during ISel and we'd lose
the inreg information. Therefore, this patch shortcircuits both the
IRTranslator and SelectionDAGBuilder to lower this intrinsic as a call
from the very start. It tries to use the existing infrastructure as much
as possible, by calling into the code for lowering tail calls.

This has already gone through a few rounds of review in Phab:

Differential Revision: https://reviews.llvm.org/D153761

show more ...


Revision tags: llvmorg-17.0.4
# 2f4328e6 24-Oct-2023 Craig Topper <craig.topper@sifive.com>

[GISel] Make assignValueToReg take CCValAssign by const reference. (#70086)

This was previously passed by value. It used to be passed by non-const
reference, but it was changed to value in D110610.

[GISel] Make assignValueToReg take CCValAssign by const reference. (#70086)

This was previously passed by value. It used to be passed by non-const
reference, but it was changed to value in D110610. I'm not sure why.

show more ...


# 9f592cbc 24-Oct-2023 Craig Topper <craig.topper@sifive.com>

[GISel] Pass MPO and VA to assignValueToAddress by const reference. NFC (#69810)

Previously they were passed by non-const reference. No in tree target
modifies the values.

This makes it possible

[GISel] Pass MPO and VA to assignValueToAddress by const reference. NFC (#69810)

Previously they were passed by non-const reference. No in tree target
modifies the values.

This makes it possible to call assignValueToAddress from
assignCustomValue without a const_cast. For example in this patch
https://github.com/llvm/llvm-project/pull/69138.

show more ...


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 343be513 19-Aug-2023 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Add utilities to track number of user SGPRs. NFC.

Factor out and unify some common code that calculates and tracks the
number of user SGRPs.

Reviewed By: arsenm

Differential Revision: htt

[AMDGPU] Add utilities to track number of user SGPRs. NFC.

Factor out and unify some common code that calculates and tracks the
number of user SGRPs.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D159439

show more ...


# ef38e6d9 18-Aug-2023 Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>

[GlobalISel] introduce MIFlag::NoConvergent

Some opcodes in MIR are defined to be convergent by the target by setting
IsConvergent in the corresponding TD file. For example, in AMDGPU, the opcodes
G

[GlobalISel] introduce MIFlag::NoConvergent

Some opcodes in MIR are defined to be convergent by the target by setting
IsConvergent in the corresponding TD file. For example, in AMDGPU, the opcodes
G_SI_CALL and G_INTRINSIC* are marked as convergent. But this is too
conservative, since calls to functions that do not execute convergent operations
should not be marked convergent. This information is available in LLVM IR.

The new flag MIFlag::NoConvergent now allows the IR translator to mark an
instruction as not performing any convergent operations. It is relevant only on
occurrences of opcodes that are marked isConvergent in the target.

Differential Revision: https://reviews.llvm.org/D157475

show more ...


Revision tags: llvmorg-17.0.0-rc2
# d9847cde 31-Jul-2023 Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>

[GlobalISel] convergent intrinsics

Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:

- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS

Out of the targets tha

[GlobalISel] convergent intrinsics

Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:

- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS

Out of the targets that currently have some support for GlobalISel, the patch
assumes that the convergent intrinsics only relevant to SPIRV and AMDGPU.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154766

show more ...


Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init
# 74e928a0 13-Jul-2023 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu][lds] Remove recalculation of LDS frame from backend

Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pa

[amdgpu][lds] Remove recalculation of LDS frame from backend

Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.

After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.

Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155190

show more ...


Revision tags: llvmorg-16.0.6
# 3d0350b7 07-Jun-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add MF independent version of getImplicitParameterOffset


1234567