History log of /llvm-project/llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp (Results 1 – 22 of 22)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init
# 697c1883 20-Jan-2025 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

Reapply "[AMDGPU] Handle natively unsupported types in addrspace(7) lowering" (#123660)

(#123657)

This reverts commit 64749fb01538fba2b56d9850497d5f3a626cabc2.

Adds a constructor to VecSlice t

Reapply "[AMDGPU] Handle natively unsupported types in addrspace(7) lowering" (#123660)

(#123657)

This reverts commit 64749fb01538fba2b56d9850497d5f3a626cabc2.

Adds a constructor to VecSlice to address the failure

show more ...


# 64749fb0 20-Jan-2025 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

Revert "[AMDGPU] Handle natively unsupported types in addrspace(7) lowering" (#123657)

Reverts llvm/llvm-project#110572

Seem to have broken a buildbot, not sure why
https://lab.llvm.org/buildbot

Revert "[AMDGPU] Handle natively unsupported types in addrspace(7) lowering" (#123657)

Reverts llvm/llvm-project#110572

Seem to have broken a buildbot, not sure why
https://lab.llvm.org/buildbot/#/builders/108/builds/8346

show more ...


# 3805355e 20-Jan-2025 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[AMDGPU] Handle natively unsupported types in addrspace(7) lowering (#110572)

The current lowering for ptr addrspace(7) assumed that the instruction
selector can handle arbtrary LLVM types, which i

[AMDGPU] Handle natively unsupported types in addrspace(7) lowering (#110572)

The current lowering for ptr addrspace(7) assumed that the instruction
selector can handle arbtrary LLVM types, which is not the case. Code
generation can't deal with
- Values that aren't 8, 16, 32, 64, 96, or 128 bits long
- Aggregates (this commit only handles arrays of scalars, more may come)
- Vectors of more than one byte
- 3-word values that aren't a vector of 3 32-bit values (for axample, a
<6 x half>)

This commit adds a buffer contents type legalizer that adds the needed
bitcasts, zero-extensions, and splits into subcompnents needed to
convert a load or store operation into one that can be successfully
lowered through code generation.

In the long run, some of the involved bitcasts (though potentially not
the buffer operation splitting) ought to be handled by the instruction
legalizer, but SelectionDAG makes this difficult.

It also takes advantage of the new `nuw` flag on `getelementptr` when
lowering GEPs to offset additions.

We don't currently plumb through `nsw` on GEPs since that should likely
be a separate change and would require declaring what we mean by "the
address" in the context of the GEP guarantees.

show more ...


# 4f614a8f 14-Jan-2025 Nikita Popov <npopov@redhat.com>

[AMDGPULowerBufferFatPointers] Use typeIncompatible() (#122902)

Use typeIncompatible() to drop attributes incompatible with the new
argument/return type, instead of keeping a custom list.


# cc3aab58 14-Jan-2025 Acim Maravic <Acim.Maravic@amd.com>

[AMDGPU] Handle nontemporal and amdgpu.last.use metadata in amdgpu-lower-buffer-fat-pointers (#120139)


Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 3b0f506c 06-Nov-2024 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[AMDGPU] Support `nuw` and `nusw` in buffer fat pointer lowering (#115039)

This commit usis the `nuw` flag on `getelemnetptr` to set the `nuw` flag
on buffer offset additions, and also moves from `

[AMDGPU] Support `nuw` and `nusw` in buffer fat pointer lowering (#115039)

This commit usis the `nuw` flag on `getelemnetptr` to set the `nuw` flag
on buffer offset additions, and also moves from `inbounds` to the looser
`nusw` for the existing case.

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0
# e1fdaaaf 06-Sep-2024 Kazu Hirata <kazu@google.com>

[AMDGPU] Work around a warning

This patch works around:

llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:1101:13:
error: enumeration values 'USubCond' and 'USubSat' not handled in
swit

[AMDGPU] Work around a warning

This patch works around:

llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:1101:13:
error: enumeration values 'USubCond' and 'USubSat' not handled in
switch [-Werror,-Wswitch]

I've notified the author in #105568.

show more ...


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# ec7f8e11 22-Jul-2024 Jessica Del <50999226+OutOfCache@users.noreply.github.com>

[AMDGPU] Add intrinsic for raw atomic buffer loads (#97707)

Upstream the intrinsics `llvm.amdgcn.raw.atomic.buffer.load`
and `llvm.amdgcn.raw.atomic.ptr.buffer.load`.

These additional intrinsics

[AMDGPU] Add intrinsic for raw atomic buffer loads (#97707)

Upstream the intrinsics `llvm.amdgcn.raw.atomic.buffer.load`
and `llvm.amdgcn.raw.atomic.ptr.buffer.load`.

These additional intrinsics mark atomic buffer loads
as atomic to LLVM by removing the `IntrReadMem`
attribute. Otherwise, it could hoist these
intrinsics out of loops in cases where LLVM marks
them as invariant. That can cause issues such as
infinite loops.

Continuation of https://reviews.llvm.org/D138786
with the additional use in the fat buffer lowering,
more test cases and the additional ptr versions
of these intrinsics.

---------

Co-authored-by: rtayl <>
Co-authored-by: Jay Foad <jay.foad@amd.com>
Co-authored-by: Mariusz Sikora <mariusz.sikora@amd.com>

show more ...


# 6bba44e8 16-Jul-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Use member initializers. NFC.


# 2d209d96 27-Jun-2024 Nikita Popov <npopov@redhat.com>

[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)

This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it does

[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)

This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.

show more ...


# 5ef768d2 17-Jun-2024 Nikita Popov <npopov@redhat.com>

[AMDGPULowerBufferFatPointers] Expand const exprs using fat pointers (#95558)

Expand all constant expressions that use fat pointers upfront, so that
the rewriting logic only has to deal with instru

[AMDGPULowerBufferFatPointers] Expand const exprs using fat pointers (#95558)

Expand all constant expressions that use fat pointers upfront, so that
the rewriting logic only has to deal with instructions and not the
constant expression variants as well.

My primary motivation is to remove the creation of illegal constant
expressions (mul and shl) from this pass, but this also cuts down quite
a bit on the amount of duplicate logic.

show more ...


Revision tags: llvmorg-18.1.8
# 0774000e 14-Jun-2024 Nikita Popov <npopov@redhat.com>

[AMDGPULowerBufferFatPointers] Fix offset-only ptrtoint (#95543)

For ptrtoint that truncates to the offset only, the expansion generated
a shift by the bit width, which is poison. Instead, we shoul

[AMDGPULowerBufferFatPointers] Fix offset-only ptrtoint (#95543)

For ptrtoint that truncates to the offset only, the expansion generated
a shift by the bit width, which is poison. Instead, we should return the
offset directly.

(The same problem exists for the constant expression case, but I plan to
address that separately, and more comprehensively.)

show more ...


# 1ceede33 14-Jun-2024 Nikita Popov <npopov@redhat.com>

[AMDGPULowerBufferFatPointers] Don't try to preserve flags for constant expressions

We expect all of these ConstantExpr ctors to fold away, don't try
to preserve flags, especially as the flags are n

[AMDGPULowerBufferFatPointers] Don't try to preserve flags for constant expressions

We expect all of these ConstantExpr ctors to fold away, don't try
to preserve flags, especially as the flags are not correct.

show more ...


# cb3a6bde 12-Jun-2024 Nikita Popov <npopov@redhat.com>

[AMDGPULowerBufferFatPointers] Restore zero offset special case

OffAccum will never be nullptr now, instead check for a zero
constant.


# 6fc63ab7 12-Jun-2024 Nikita Popov <npopov@redhat.com>

[AMDGPULowerBufferFatPointers] Simplify and fix GEP offset emission (#95115)

Use emitGEPOffset() to emit the GEP offset, which already has all the
necessary logic.

This also fixes the nuw flag i

[AMDGPULowerBufferFatPointers] Simplify and fix GEP offset emission (#95115)

Use emitGEPOffset() to emit the GEP offset, which already has all the
necessary logic.

This also fixes the nuw flag incorrectly being set on the offset
calculation, while only nsw is implied by inbounds.

show more ...


Revision tags: llvmorg-18.1.7
# 8cdecd4d 27-May-2024 Nikita Popov <npopov@redhat.com>

[IR] Add getelementptr nusw and nuw flags (#90824)

This implements the `nusw` and `nuw` flags for `getelementptr` as
proposed at
https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelem

[IR] Add getelementptr nusw and nuw flags (#90824)

This implements the `nusw` and `nuw` flags for `getelementptr` as
proposed at
https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelementptr/78672.

The three possible flags are encapsulated in the new `GEPNoWrapFlags`
class. Currently this class has a ctor from bool, interpreted as the
InBounds flag. This ctor should be removed in the future, as code gets
migrated to handle all flags.

There are a few places annotated with `TODO(gep_nowrap)`, where I've had
to touch code but opted to not infer or precisely preserve the new
flags, so as to keep this as NFC as possible and make sure any changes
of that kind get test coverage when they are made.

show more ...


Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4
# 6fa2d03b 04-Apr-2024 Simon Pilgrim <llvm-dev@redking.me.uk>

AMDGPULowerBufferFatPointers.cpp - fix Wunused-variable warning. NFC.


# 24c256a6 04-Apr-2024 Simon Pilgrim <llvm-dev@redking.me.uk>

AMDGPULowerBufferFatPointers.cpp - fix Wparentheses warning. NFC.


Revision tags: llvmorg-18.1.3
# 0f46e31c 20-Mar-2024 Nikita Popov <npopov@redhat.com>

[IR] Change representation of getelementptr inrange (#84341)

As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the

[IR] Change representation of getelementptr inrange (#84341)

As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.

Currently, inrange is specified as follows:

```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```

The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:

```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```

This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:

```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```

The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.

This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.

show more ...


Revision tags: llvmorg-18.1.2
# d0117b71 10-Mar-2024 Orlando Cazalet-Hyams <orlando.hyams@sony.com>

[RemoveDIs] Copy debug mode to new functions in amdgpu-lower-buffer-fat-pointers

Fixes failing tests after https://github.com/llvm/llvm-project/pull/84308

LLVM :: CodeGen/AMDGPU/GlobalISel/irtransl

[RemoveDIs] Copy debug mode to new functions in amdgpu-lower-buffer-fat-pointers

Fixes failing tests after https://github.com/llvm/llvm-project/pull/84308

LLVM :: CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces-vectors.ll
LLVM :: CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces.ll
LLVM :: CodeGen/AMDGPU/lower-buffer-fat-pointers-calls.ll
LLVM :: CodeGen/AMDGPU/lower-buffer-fat-pointers-constants.ll
LLVM :: CodeGen/AMDGPU/lower-buffer-fat-pointers-pointer-ops.ll
LLVM :: CodeGen/AMDGPU/pal-metadata-3.0.ll

Buildbots: https://lab.llvm.org/buildbot/#/builders/121/builds/39855

show more ...


# 769eab47 11-Mar-2024 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[NFC][AMDGPU] Fix redundant assignment from #77952 (#84586)

Someone pointed out a typo (Value* RsrcRes = RsrcRes = ...) in PR the
address space 7 lowering, this commit fixes it.


Revision tags: llvmorg-18.1.1
# 6540f163 06-Mar-2024 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[AMDGPU] Add IR-level pass to rewrite away address space 7 (#77952)

This commit adds the -lower-buffer-fat-pointers pass, which is
applicable to all AMDGCN compilations.

The purpose of this pass

[AMDGPU] Add IR-level pass to rewrite away address space 7 (#77952)

This commit adds the -lower-buffer-fat-pointers pass, which is
applicable to all AMDGCN compilations.

The purpose of this pass is to remove the type `ptr addrspace(7)` from
incoming IR. This must be done at the LLVM IR level because `ptr
addrspace(7)`, as a 160-bit primitive type, cannot be correctly handled
by SelectionDAG.

The detailed operation of the pass is described in comments, but, in
summary, the removal proceeds by:
1. Rewriting loads and stores of ptr addrspace(7) to loads and stores of
i160 (including vectors and aggregates). This is needed because the
in-register representation of these pointers will stop matching their
in-memory representation in step 2, and so ptrtoint/inttoptr operations
are used to preserve the expected memory layout

2. Mutating the IR to replace all occurrences of `ptr addrspace(7)` with
the type `{ptr addrspace(8), ptr addrspace(6) }`, which makes the two
parts of a buffer fat pointer (the 128-bit address space 8 resource and
the 32-bit address space 6 offset) visible in the IR. This also impacts
the argument and return types of functions.

3. *Splitting* the resource and offset parts. All instructions that
produce or consume buffer fat pointers (like GEP or load) are rewritten
to produce or consume the resource and offset parts separately. For
example, GEP updates the offset part of the result and a load uses the
resource and offset parts to populate the relevant
llvm.amdgcn.raw.ptr.buffer.load intrinsic call.

At the end of this process, the original mutated instructions are
replaced by their new split counterparts, ensuring no invalidly-typed IR
escapes this pass. (For operations like call, where the struct form is
needed, insertelement operations are inserted).

Compared to LGC's PatchBufferOp (

https://github.com/GPUOpen-Drivers/llpc/blob/32cda89776980202597d5bf4ed4447a1bae64047/lgc/patch/PatchBufferOp.cpp
): this pass
- Also handles vectors of ptr addrspace(7)s
- Also handles function boundaries
- Includes the same uniform buffer optimization for loops and
conditionals
- Does *not* handle memcpy() and friends (this is future work)
- Does *not* break up large loads and stores into smaller parts. This
should be handled by extending the legalization
of *.buffer.{load,store} to handle larger types by producing multiple
instructions (the same way ordinary LOAD and STORE are legalized). That
work is planned for a followup commit.
- Does *not* have special logic for handling divergent buffer
descriptors. The logic in LGC is, as far as I can tell, incorrect in
general, and, per discussions with @nhaehnle, isn't widely used.
Therefore, divergent descriptors are handled with waterfall loops later
in legalization.

As a final matter, this commit updates atomic expansion to treat buffer
operations analogously to global ones.

(One question for reviewers: is the new pass is the right place? Should
it be later in the pipeline?)

Differential Revision: https://reviews.llvm.org/D158463

show more ...