History log of /llvm-project/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll (Results 1 – 25 of 48)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6
# 7dbd6cd2 11-Dec-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU][Attributor] Make `AAAMDFlatWorkGroupSize` honor existing attribute (#114357)

If a function has `amdgpu-flat-work-group-size`, honor it in `initialize` by
taking its value directly; otherwi

[AMDGPU][Attributor] Make `AAAMDFlatWorkGroupSize` honor existing attribute (#114357)

If a function has `amdgpu-flat-work-group-size`, honor it in `initialize` by
taking its value directly; otherwise, it uses the default range as a starting
point. We will no longer manipulate the known range, which can cause issues
because the known range is a "throttle" to the assumed range such that the
assumed range can't get widened properly in `updateImpl` if the known range is
not set properly for whatever reasons. Another benefit of not touching the known
range is, if we indicate pessimistic state, it also invalidates the AA such that
`manifest` will not be called. Since we honor the attribute, we don't want and
will not add any half-baked attribute added to a function.

show more ...


# 41ed16c3 10-Dec-2024 Jun Wang <jwang86@yahoo.com>

Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (#118907)

This reverts commit 1ef9410a96c1d9669a6feaf03fcab8d0a4a13bd5.

This fixes the test file attrib

Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (#118907)

This reverts commit 1ef9410a96c1d9669a6feaf03fcab8d0a4a13bd5.

This fixes the test file attributor-flatscratchinit-globalisel.ll.

show more ...


# 1ef9410a 04-Dec-2024 Philip Reames <preames@rivosinc.com>

Revert "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)"

This reverts commit e6aec2c12095cc7debd1a8004c8535eef41f4c36. Commit breaks "ninja check-llvm" on x86 host.


# e6aec2c1 04-Dec-2024 Jun Wang <jwang86@yahoo.com>

[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)

The AMDGPUAnnotateKernelFeatures pass infers the "amdgpu-calls" and
"amdgpu-stack-objects" attributes, which are us

[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)

The AMDGPUAnnotateKernelFeatures pass infers the "amdgpu-calls" and
"amdgpu-stack-objects" attributes, which are used to infer whether we
need to initialize flat scratch. This is, however, not precise. Instead,
we should use AMDGPUAttributor and infer amdgpu-no-flat-scratch-init on
kernels. Refer to https://github.com/llvm/llvm-project/issues/63586 .

show more ...


Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0
# 109cd11d 06-Sep-2024 Shilei Tian <i@tianshilei.me>

[Attributor] Skip AS specialization for volatile memory instructions (#107250)


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# a0afcbfb 06-Aug-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Enable `AAAddressSpace` in `AMDGPUAttributor` (#101593)


Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# b6b703b2 21-Mar-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Infer no-agpr usage in AMDGPUAttributor (#85948)

SIMachineFunctionInfo has a scan of the function body for inline asm
which may use AGPRs, or callees in SIMachineFunctionInfo. Move this
i

AMDGPU: Infer no-agpr usage in AMDGPUAttributor (#85948)

SIMachineFunctionInfo has a scan of the function body for inline asm
which may use AGPRs, or callees in SIMachineFunctionInfo. Move this
into the attributor, so it actually works interprocedurally.

Could probably avoid most of the test churn if this bothered to avoid
adding this on subtargets without AGPRs. We should also probably
try to delete the MIR scan in usesAGPRs but it seems to be trickier
to eliminate.

show more ...


Revision tags: llvmorg-18.1.2, llvmorg-18.1.1
# 4490003a 06-Mar-2024 Emma Pilkington <emma.pilkington95@gmail.com>

[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)

The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling

[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)

The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling also matches
the asm directive I added in bc82cfb.

show more ...


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 777b6de7 12-Dec-2023 Saiyedul Islam <Saiyedul.Islam@amd.com>

[AMDGPU][NFC] Test autogenerated llc tests for COV5 (#74339)

Regenerate a few llc tests to test for COV5 instead of the default ABI
version.


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5
# d34a10a4 07-Nov-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Port AMDGPUAttributor to new pass manager (#71349)


Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0
# 466a8149 12-Sep-2023 Saiyedul Islam <Saiyedul.Islam@amd.com>

Revert "[AMDGPU] Make default AMDHSA Code Object Version to be 5 (#65410)" (#66060)

This reverts commit 0a8d17e79b02a92814a2a788d79df1f54d70ec3e.


# 0a8d17e7 12-Sep-2023 Saiyedul Islam <Saiyedul.Islam@amd.com>

[AMDGPU] Make default AMDHSA Code Object Version to be 5 (#65410)

Also update LIT tests and docs.
For more details, see
https://llvm.org/docs/AMDGPUUsage.html#code-object-v5-metadata

Reviewed B

[AMDGPU] Make default AMDHSA Code Object Version to be 5 (#65410)

Also update LIT tests and docs.
For more details, see
https://llvm.org/docs/AMDGPUUsage.html#code-object-v5-metadata

Reviewed By: arsenm, jhuber6

Github PR: #65410

Differential Revision: https://reviews.llvm.org/D129818

show more ...


Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# b9c6d9e6 13-Sep-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Propagate amdgpu-waves-per-eu with attributor

This will do a value range merging down the callgraph, unlike the
current pass which can only propagate values to undecorated functions
from a k

AMDGPU: Propagate amdgpu-waves-per-eu with attributor

This will do a value range merging down the callgraph, unlike the
current pass which can only propagate values to undecorated functions
from a kernel.

This one is a bit weird due to the interaction with the implied range
from amdgpu-flat-workgroup-size. At the default group range of 1,1024,
the minimum implied bounds is 4 so this ends up introducing the
attribute on undecorated functions. We could probably simplify this by
ignoring it and propagating the raw values. The subtarget interaction
and the interaction with amdgpu-flat-workgroup-size only really clamp
invalid values (plus the lower bound doesn't seem to do anything as
far as I can tell anyway).

show more ...


# f0415f2a 03-May-2023 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

Re-land "[AMDGPU] Define data layout entries for buffers""

Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.

Di

Re-land "[AMDGPU] Define data layout entries for buffers""

Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.

Differential Revision: https://reviews.llvm.org/D149776

show more ...


# 3f2fbe92 03-May-2023 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

Revert "[AMDGPU] Define data layout entries for buffers"

This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850.

Differential Revision: https://reviews.llvm.org/D149758


# f9c1ede2 07-Feb-2023 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[AMDGPU] Define data layout entries for buffers

Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new addr

[AMDGPU] Define data layout entries for buffers

Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. ptr
addrspace(8). These pointers are 128-bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, [addr_max]] value from applying to them.

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441

show more ...


# 4d4894ab 08-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

Partially reapply "AMDGPU: Invert handling of enqueued block detection"

This mostly reverts commit 270e96f435596449002fc89962595497481c8770.

Keep the attributor related changes around, but function

Partially reapply "AMDGPU: Invert handling of enqueued block detection"

This mostly reverts commit 270e96f435596449002fc89962595497481c8770.

Keep the attributor related changes around, but functionally restore
the old behavior as a workaround. Device enqueue goes back to not
working at -O0 with this version.

show more ...


# 270e96f4 08-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

Revert "AMDGPU: Invert handling of enqueued block detection"

This reverts commit 47288cc977fa31c44cc92b4e65044a5b75c2597e.

The runtime is having trouble with this at -O0 when the inputs are
always

Revert "AMDGPU: Invert handling of enqueued block detection"

This reverts commit 47288cc977fa31c44cc92b4e65044a5b75c2597e.

The runtime is having trouble with this at -O0 when the inputs are
always enabled.

show more ...


# 47288cc9 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Invert handling of enqueued block detection

Invert the sense of the attribute and let the attributor figure this
out like everything else. If needed we can have the not-OpenCL
languages set

AMDGPU: Invert handling of enqueued block detection

Invert the sense of the attribute and let the attributor figure this
out like everything else. If needed we can have the not-OpenCL
languages set amdgpu-no-default-queue and amdgpu-no-completion-action
up front so they never have to pay the cost.

There are also so many of these now, the offset use API should
probably consider all of them at once. Maybe they should merge into
one attribute with used fields. Having separate functions for each
field in AMDGPUBaseInfo is also not the greatest API (might as well
fix this when the patch to get the object version from the module
lands).

show more ...


# 262c2c0f 19-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Update some tests to use opaque pointers

vectorize-buffer-fat-pointer.ll required a manual check line fix.
vector-alloca-addrspacecast.ll required a manual fixup of a check
line. partial-reg

AMDGPU: Update some tests to use opaque pointers

vectorize-buffer-fat-pointer.ll required a manual check line fix.
vector-alloca-addrspacecast.ll required a manual fixup of a check
line. partial-regcopy-and-spill-missed-at-regalloc.ll required
re-running update_mir_test_checks. The HSA metadata tests required
avoiding the script touching the type name in the metadata.

annotate-noclobber.ll ran into one update script bug. It deleted a
check line with a 0 offset GEP, moving the following -NEXT check
logically up one line.

show more ...


# f6e3a89c 04-Oct-2022 Johannes Doerfert <johannes@jdoerfert.de>

[AMDGPU] Annotate the intrinsics to be default and nocallback

Differential Revision: https://reviews.llvm.org/D135155


# ca856fff 29-Nov-2022 Ron Lieberman <ron.lieberman@amd.com>

Revert "enable code-object-version=5"

very sorry wrong repo.

This reverts commit d882ba7aeac4b496dccd1b10cb58bd691786b691.


# d882ba7a 29-Nov-2022 Ron Lieberman <ron.lieberman@amd.com>

enable code-object-version=5


# 304f1d59 02-Nov-2022 Nikita Popov <npopov@redhat.com>

[IR] Switch everything to use memory attribute

This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemo

[IR] Switch everything to use memory attribute

This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.

The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.

High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.

Differential Revision: https://reviews.llvm.org/D135780

show more ...


# 3a205977 19-Jul-2022 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS varia

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments accessed by intrinsic already
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060

show more ...


12