Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 6548b635 09-Nov-2024 Shilei Tian <i@tianshilei.me>

Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.


# ca33649a 08-Nov-2024 Shilei Tian <i@tianshilei.me>

Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
hip and openmp buildbots.


# e215a1e2 08-Nov-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# f2eeb3dc 02-Oct-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Handle v_add* in eliminateFrameIndex (#102346)


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# b1bcb7ca 15-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.

show more ...


# adaff46d 15-Jul-2024 dyung <douglas.yung@sony.com>

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614

These bots have been broken for a day, so reverting to get everything
back to green.

show more ...


# 78bc1b64 14-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 9e9907f1 17-Jan-2024 Fangrui Song <i@maskray.me>

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.

This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:

```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 4b6d41cd 19-Jun-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Do not release VGPRs if there may be pending scratch stores

Differential Revision: https://reviews.llvm.org/D153295


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4
# 5aea839a 16-May-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Switch to backwards scavenging in eliminateFrameIndex

Frame index elimination runs backwards so we must use backwards
scavenging. Otherwise, when a scavenged register is spilled, the
scaven

[AMDGPU] Switch to backwards scavenging in eliminateFrameIndex

Frame index elimination runs backwards so we must use backwards
scavenging. Otherwise, when a scavenged register is spilled, the
scavenger will remember that the register is in use until the restore
point, but it will never reach that restore point. The result is that in
some cases it will keep scavenging different registers instead of
reusing the same one.

Differential Revision: https://reviews.llvm.org/D152394

show more ...


# 8fcb4fa8 17-May-2023 Jay Foad <jay.foad@amd.com>

[RegScavenger] Change scavengeRegister to pick registers in allocation order

This matches what scavengeRegisterBackwards does.

This is in preparation for converting most uses of scavengeRegister to

[RegScavenger] Change scavengeRegister to pick registers in allocation order

This matches what scavengeRegisterBackwards does.

This is in preparation for converting most uses of scavengeRegister to
scavengeRegisterBackwards, to reduce test case churn when that lands and
to help with bisection if anything goes wrong.

Differential Revision: https://reviews.llvm.org/D150792

show more ...


Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# d85e849f 02-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some assorted tests to opaque pointers


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 5cae8816 06-Jul-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add GFX11 test coverage

Add GFX11 test coverage to a bunch of tests where it was easy to do so,
mostly because the checks are autogenerated and/or GFX11 can share the
same checks as GFX10.

[AMDGPU] Add GFX11 test coverage

Add GFX11 test coverage to a bunch of tests where it was easy to do so,
mostly because the checks are autogenerated and/or GFX11 can share the
same checks as GFX10.

Differential Revision: https://reviews.llvm.org/D129295

show more ...


Revision tags: llvmorg-14.0.6
# 77851cc1 15-Jun-2022 David Stuttard <david.stuttard@amd.com>

[AMDGPU] Change use null for dead sdst to be gfx1030+

Pre gfx1030 null for sdst is different.
c97436f8b6e2 [AMDGPU] Use null for dead sdst operand - requires a change to make
it not apply to pre gfx

[AMDGPU] Change use null for dead sdst to be gfx1030+

Pre gfx1030 null for sdst is different.
c97436f8b6e2 [AMDGPU] Use null for dead sdst operand - requires a change to make
it not apply to pre gfx1030

Differential Revision: https://reviews.llvm.org/D127869

show more ...


# c97436f8 10-Jun-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Use null for dead sdst operand

Differential Revision: https://reviews.llvm.org/D127542


Revision tags: llvmorg-14.0.5
# 23db8e4b 06-Jun-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Use v_mad_u64_u32 for IMAD32

Nic Curtis done the experiments to prove it is faster than a
separate mul and add.

Fixes: SWDEV-332806

Differential Revision: https://reviews.llvm.org/D127253


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# a5d4f82b 11-Feb-2022 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU] Make enable-flat-scratch a subtarget feature

Use a subtarget feature instead of a command line argument to reduce
global state.
We want to enable flat scratch for graphics in some cases and

[AMDGPU] Make enable-flat-scratch a subtarget feature

Use a subtarget feature instead of a command line argument to reduce
global state.
We want to enable flat scratch for graphics in some cases and this
doesn't work well with command line options.

Differential Revision: https://reviews.llvm.org/D119425

show more ...


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3
# 10ed1eca 18-Jan-2022 Vang Thao <vang.thao@amd.com>

[MachineSink] Allow sinking of constant or ignorable physreg uses

For AMDGPU, any use of the physical register EXEC prevents sinking even if it is not a real physical register read. Add check to see

[MachineSink] Allow sinking of constant or ignorable physreg uses

For AMDGPU, any use of the physical register EXEC prevents sinking even if it is not a real physical register read. Add check to see if a physical
register use can be ignored for sinking.

Also perform same constant and ignorable physical register check when considering sinking in loops.

https://reviews.llvm.org/D116053

show more ...


Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 729bf9b2 14-Aug-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Enable fixed function ABI by default

Code using indirect calls is broken without this, and there isn't
really much value in supporting the old attempt to vary the argument
placement based on

AMDGPU: Enable fixed function ABI by default

Code using indirect calls is broken without this, and there isn't
really much value in supporting the old attempt to vary the argument
placement based on uses. This resulted in more argument shuffling code
anyway.

Also have the option stop implying all inputs need to be passed. This
will no rely on the amdgpu-no-* attributes to avoid passing
unnecessary values.

show more ...


# 18f93512 19-Nov-2021 RamNalamothu <VenkataRamanaiah.Nalamothu@amd.com>

[AMDGPU] Do not generate ELF symbols for the local branch target labels

The compiler was generating symbols in the final code object for local
branch target labels. This bloats the code object, slow

[AMDGPU] Do not generate ELF symbols for the local branch target labels

The compiler was generating symbols in the final code object for local
branch target labels. This bloats the code object, slows down the loader,
and is only used to simplify disassembly.

Use '--symbolize-operands' with llvm-objdump to improve readability of the
branch target operands in disassembly.

Fixes: SWDEV-312223

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D114273

show more ...


# 722b8e0e 14-Aug-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Invert ABI attribute handling

Previously we assumed all callable functions did not need any
implicitly passed inputs, and added attributes to functions to
indicate when they were necessary.

AMDGPU: Invert ABI attribute handling

Previously we assumed all callable functions did not need any
implicitly passed inputs, and added attributes to functions to
indicate when they were necessary. Requiring attributes for
correctness is pretty ugly, and it makes supporting indirect and
external calls more complicated.

This inverts the direction of the attributes, so an undecorated
function is assumed to need all implicit imputs. This enables
AMDGPUAttributor by default to mark when functions are proven to not
need a given input. This strips the equivalent functionality from the
legacy AMDGPUAnnotateKernelFeatures pass.

However, AMDGPUAnnotateKernelFeatures is not fully removed at this
point although it should be in the future. It is still necessary for
the two hacky amdgpu-calls and amdgpu-stack-objects attributes, which
would be better served by a trivial analysis on the IR during
selection. Additionally, AMDGPUAnnotateKernelFeatures still
redundantly handles the uniform-work-group-size attribute to be
removed in a future commit.

At this point when not using -amdgpu-fixed-function-abi, we are still
modifying the ABI based on these newly negated attributes. In the
future, this option will be removed and the locations for implicit
inputs will always be fixed. We will then use the new attributes to
avoid passing the values when unnecessary.

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3
# ed745839 03-Mar-2021 Jay Foad <jay.foad@amd.com>

[AMDGPU] Don't check for VMEM hazards on GFX10

The hazard where a VMEM reads an SGPR written by a VALU counts as a data
dependency hazard, so no nops are required on GFX10. Tested with Vulkan
CTS on

[AMDGPU] Don't check for VMEM hazards on GFX10

The hazard where a VMEM reads an SGPR written by a VALU counts as a data
dependency hazard, so no nops are required on GFX10. Tested with Vulkan
CTS on GFX10.1 and GFX10.3.

Differential Revision: https://reviews.llvm.org/D97926

show more ...


Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# ff8a1cae 15-Jan-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Fix the inconsistency in soffset for MUBUF stack accesses.

During instruction selection, there is an inconsistency in choosing
the initial soffset value. With certain early passes, this val

[AMDGPU] Fix the inconsistency in soffset for MUBUF stack accesses.

During instruction selection, there is an inconsistency in choosing
the initial soffset value. With certain early passes, this value is
getting modified and that brought additional fixup during
eliminateFrameIndex to work for all cases. This whole transformation
looks trivial and can be handled better.

This patch clearly defines the initial value for soffset and keeps it
unchanged before eliminateFrameIndex. The initial value must be zero
for MUBUF with a frame index. The non-frame index MUBUF forms that
use a raw offset from SP will have the stack register for soffset.
During frame elimination, the soffset remains zero for entry functions
with zero dynamic allocas and no callsites, or else is updated to the
appropriate frame/stack register.

Also, did some code clean up and made all asserts around soffset
stricter to match.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D95071

show more ...


Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# d28624a2 01-Dec-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Stop adding an implicit def of vcc_hi for wave32

This doesn't seem to be needed for anything.

Differential Revision: https://reviews.llvm.org/D92400


Revision tags: llvmorg-11.0.1-rc1
# cf6565f6 12-Nov-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Enable multi-dword flat scratch load/stores

Differential Revision: https://reviews.llvm.org/D91384


12