History log of /llvm-project/llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll (Results 1 – 22 of 22)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 229e1185 23-Jul-2024 Christudasan Devadasan <christudasan.devadasan@amd.com>

[AMDGPU] Codegen support for constrained multi-dword sloads (#96163)

For targets that support the xnack replay feature (gfx8+), the
multi-dword scalar loads shouldn't clobber any register that
holds the src address. The constrained versions of the scalar
loads have the early-clobber flag attached to the dst operand
to restrict RA from re-allocating any of the src regs for the
dst operand.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 9e9907f1 17-Jan-2024 Fangrui Song <i@maskray.me>

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outright.

This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:

```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```


Revision tags: llvmorg-17.0.6
# 01c1c7a1 14-Nov-2023 Acim Maravic <119684637+Acim-Maravic@users.noreply.github.com>

[AMDGPU][CodeGen] Update support (soffset + offset) s_buffer_load's (#68302)

getBaseWithConstantOffset() is used for scalar and non-scalar buffer
loads. The difference between the s_load and load instructions is that
the s_load instruction extends the 32-bit offset to 64 bits, so a 32-bit
(address + offset) should not cause unsigned 32-bit integer wraparound,
because s_load performs the addition in 64 bits.
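
The wraparound concern above can be illustrated with plain integer arithmetic. This is a minimal sketch, not code from the patch: the helper names and the sample values are hypothetical, and it only models unsigned addition at two widths.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add_u32(a, b):
    """Unsigned add that wraps at 32 bits."""
    return (a + b) & MASK32


def add_u64_zext(a, b):
    """Zero-extend both 32-bit inputs to 64 bits, then add."""
    return ((a & MASK32) + (b & MASK32)) & MASK64


def wraps_in_32_bits(a, b):
    """True when the 32-bit sum differs from the 64-bit sum."""
    return add_u32(a, b) != add_u64_zext(a, b)


# Hypothetical values chosen so that the 32-bit sum overflows.
soffset = 0xFFFFFFF0
imm = 0x20
print(hex(add_u32(soffset, imm)))       # 0x10
print(hex(add_u64_zext(soffset, imm)))  # 0x100000010
```

When `wraps_in_32_bits` returns True, a sum computed in 32 bits and a sum computed after extension to 64 bits name different addresses, which is the case the commit message says must be avoided.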


Revision tags: llvmorg-17.0.5
# 86f2e092 01-Nov-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Tweak handling of GlobalAddress operands in SI_PC_ADD_REL_OFFSET (#70960)

When SI_PC_ADD_REL_OFFSET is expanded to S_GETPC/S_ADD/S_ADDC, the
GlobalAddress operands have to be adjusted by 4 or 12 bytes to account
for the offset from the end of the S_GETPC instruction to the literal
operands. Do this all in SIInstrInfo::expandPostRAPseudo instead of
duplicating the adjustment code in both AMDGPULegalizerInfo and
SITargetLowering. NFCI.


Revision tags: llvmorg-17.0.4
# d96529af 29-Oct-2023 Simon Pilgrim <llvm-dev@redking.me.uk>

[DAG] Attempt shl narrowing in SimplifyDemandedBits (REAPPLIED)

If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext.

Followup to D146121

Reapplied - moved after the ShrinkDemandedOp call; reuse the existing KnownBits result; ensure that we only attempt this if all the upper bits are demanded; 547dc461225ba should address the remaining regressions that were noticed in the previous commit.

Differential Revision: https://reviews.llvm.org/D155472
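
The narrowing idea in this commit message can be checked with ordinary integers. The following is a rough sketch under stated assumptions: a 64-bit value narrowed to 32 bits, shift amounts below 32, and hypothetical helper names. It does not reproduce the actual SimplifyDemandedBits logic or its profitability checks.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def shl64(x, s):
    """Full-width 64-bit shift left."""
    return (x << s) & MASK64


def narrow_shl(x, s):
    """Truncate to 32 bits, shift in 32 bits, zero-extend the result."""
    return ((x & MASK32) << s) & MASK32


def low_half_matches(x, s):
    """If only the low 32 bits are demanded, both forms agree on them."""
    return (shl64(x, s) & MASK32) == narrow_shl(x, s)


# Upper half undemanded: the low 32 bits agree for any input.
print(low_half_matches(0xDEADBEEFCAFEF00D, 12))  # True

# Upper half known zero: the full 64-bit results are identical.
print(shl64(0x1234, 8) == narrow_shl(0x1234, 8))  # True
```

The low bits of a left shift depend only on the low bits of the input, which is why the truncation before the shift is safe in both cases.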


Revision tags: llvmorg-17.0.3
# 0a776996 04-Oct-2023 Kirill Stoimenov <kstoimenov@google.com>

Revert "[DAG] Attempt shl narrowing in SimplifyDemandedBits"

This reverts commit 7a8c04ef84ecdab4390b451d4c2fe17bc45a7b63.


# 7a8c04ef 04-Oct-2023 Simon Pilgrim <llvm-dev@redking.me.uk>

[DAG] Attempt shl narrowing in SimplifyDemandedBits

If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext.

Followup to D146121

Differential Revision: https://reviews.llvm.org/D155472


Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# faa2c678 04-Apr-2023 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[AMDGPU] Add buffer intrinsics that take resources as pointers

In order to enable the LLVM frontend to better analyze buffer
operations (and to potentially enable more precise analyses on the
backend), define versions of the raw and structured buffer intrinsics
that use `ptr addrspace(8)` instead of `<4 x i32>` to represent their
rsrc arguments.

The new intrinsics are named by replacing `buffer.` with `buffer.ptr`.

One advantage to these intrinsic definitions is that, instead of
specifying that a buffer load/store will read/write some memory, we
can indicate that the memory read or written will be based on the
pointer argument. This means that, for example, a read from a
`noalias` buffer can be pulled out of a loop that is modifying a
distinct buffer.

In the future, we will define custom PseudoSourceValues that will
allow us to package up the (buffer, index, offset) triples that buffer
intrinsics contain and allow for more precise backend analysis.

This work also enables creating address space 7, which represents
manipulation of raw buffers using native LLVM load and store
instructions.

Where tests simply used a buffer intrinsic while testing some other
code path (such as the tests for VGPR spills), they have been updated
to use the new intrinsic form. Tests that are "about" buffer
intrinsics (for instance, those that ensure that they codegen as
expected) have been duplicated, either within existing files or into
new ones.

Depends on D145441

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D147547


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4
# 2be31896 10-Mar-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Don't select _SGPR forms of SMEM instructions on GFX9+

On GFX9+, SMEM instructions have an _SGPR_IMM form which is strictly
more powerful than the _SGPR form. It simplifies codegen if we always
select the _SGPR_IMM form with an immediate offset of 0 instead of the
_SGPR form.

Note that this patch just makes minimal changes to the selection
patterns to prove the concept. Further simplifications are possible to
reduce the number of selection patterns.

On GFX9 the _SGPR form of the Real instruction is still required for
assembly/disassembly but on GFX10+ it can be removed completely.

Differential Revision: https://reviews.llvm.org/D147334


Revision tags: llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# d85e849f 02-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some assorted tests to opaque pointers


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# bb70b5d4 18-Jul-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer

Previously this was assuming pointsToConstantMemory implies
dereferenceable.


# 5db8d6fd 05-Sep-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Support (base | offset) SMEM loads.

Prevents generation of unnecessary s_or_b32 instructions.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D132552
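
The commit message for 5db8d6fd does not spell out why a (base | offset) address can be matched like (base + offset). One common justification, assumed here rather than taken from the patch, is that OR and ADD give the same result whenever the two operands share no set bits. A small sketch with hypothetical values:

```python
def or_equals_add(base, offset):
    """OR and ADD coincide when no bit is set in both operands."""
    return (base & offset) == 0


# Hypothetical 4 KiB-aligned base and a small offset: no common bits.
base = 0x1000
offset = 0x24
print(or_equals_add(base, offset))        # True
print((base | offset) == (base + offset)) # True

# Overlapping bits: OR and ADD diverge, so the rewrite would be invalid.
print((0x1004 | 0x4) == (0x1004 + 0x4))   # False
```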


# 1f550d86 05-Sep-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Pre-commit a test on (base | offset) SMEM loads for D132552.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D133021


# f3364530 05-Sep-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Support (soffset + offset) s_buffer_load's.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D130263


# 8d0383eb 24-Jun-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Remove AliasAnalysis from regalloc

This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.


# 432cbd78 18-Jul-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Support (register + immediate) SMRD offsets.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129381


# 9c66c02e 18-Jul-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Match SMRDs with constant bases and register offsets.

Saves some add instructions on a couple Rage 2 shaders and is also a
prerequisite for a coming-soon change matching (register + immediate)
offsets.

Reviewed By: foad, arsenm

Differential Revision: https://reviews.llvm.org/D129095


# 8cd79bc1 05-Jul-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][GlobalISel] Support register offsets for SMRDs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D128836


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# fae05692 20-May-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Print/parse LLTs in MachineMemOperands

This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# 3bffb1cd 09-Feb-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Use single cache policy operand

Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

An additional advantage is that the parser will accept these flags in
any order, unlike now.

Differential Revision: https://reviews.llvm.org/D96469
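
Folding GLC, SLC, and DLC into one bitmask operand can be pictured as follows. This is only an illustrative sketch: the class name, the helper functions, and the bit values assigned to each flag are assumptions made for the example, not values quoted from the patch.

```python
from enum import IntFlag


class CachePolicy(IntFlag):
    # Bit assignments are assumed for illustration only.
    GLC = 1
    SLC = 2
    DLC = 4


def pack(glc=False, slc=False, dlc=False):
    """Combine three separate boolean operands into one bitmask."""
    policy = CachePolicy(0)
    if glc:
        policy |= CachePolicy.GLC
    if slc:
        policy |= CachePolicy.SLC
    if dlc:
        policy |= CachePolicy.DLC
    return int(policy)


def parse_flags(text):
    """Parse space-separated flag names; the order does not matter."""
    policy = CachePolicy(0)
    for token in text.split():
        policy |= CachePolicy[token.upper()]
    return int(policy)


print(pack())                    # 0, the common case
print(parse_flags("glc slc"))    # 3
print(parse_flags("slc glc"))    # 3
```

Because each flag only sets its own bit, parsing is naturally order-independent, which matches the parser advantage the message mentions.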


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2
# b8c8d1b3 30-Jul-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some tests to use new buffer intrinsics

The legacy buffer intrinsics (neither struct nor raw) should now all be
consolidated into the tests specifically for those intrinsics.


Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 77ce2e21 30-Mar-2020 Jakub Kuderski <kubak@google.com>

[AMDGPU] Add Relocation Constant Support

Summary:
This change adds the amdgcn.reloc.constant intrinsic to the AMDGPU backend, which will compile into a relocation entry in the resulting ELF.

The intrinsic takes a MetadataNode (String) as its only argument, which specifies the symbol name of the relocation entry.

`SelectionDAGBuilder::getValueImpl` is changed to allow metadata operands passed through to ISel.

Author: csyonghe <yonghe@google.com>

Reviewers: tpr, nhaehnle

Reviewed By: nhaehnle

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76440
