History log of /llvm-project/llvm/test/CodeGen/AMDGPU/clamp.ll (Results 1 – 25 of 48)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 6548b635 09-Nov-2024 Shilei Tian <i@tianshilei.me>

Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.


# ca33649a 08-Nov-2024 Shilei Tian <i@tianshilei.me>

Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
hip and openmp buildbots.


# e215a1e2 08-Nov-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)


Revision tags: llvmorg-19.1.3
# 3277c7cd 21-Oct-2024 Stanislav Mekhanoshin <rampitec@users.noreply.github.com>

[AMDGPU] Skip VGPR deallocation for waveslot limited kernels (#112765)

MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's
been identified this message is only really needed for

[AMDGPU] Skip VGPR deallocation for waveslot limited kernels (#112765)

MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's
been identified this message is only really needed for VGPR limited
kernels. A kernel becomes VGPR limited if a total number of VGPRs per
SIMD / number of used VGPRs is more than a number of wave slots.

show more ...


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 229e1185 23-Jul-2024 Christudasan Devadasan <christudasan.devadasan@amd.com>

[AMDGPU] Codegen support for constrained multi-dword sloads (#96163)

For targets that support xnack replay feature (gfx8+), the
multi-dword scalar loads shouldn't clobber any register that
holds the

[AMDGPU] Codegen support for constrained multi-dword sloads (#96163)

For targets that support xnack replay feature (gfx8+), the
multi-dword scalar loads shouldn't clobber any register that
holds the src address. The constrained version of the scalar
loads have the early clobber flag attached to the dst operand
to restrict RA from re-allocating any of the src regs for its
dst operand.

show more ...


# b1bcb7ca 15-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.

show more ...


# adaff46d 15-Jul-2024 dyung <douglas.yung@sony.com>

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614

These bots have been broken for a day, so reverting to get everything
back to green.

show more ...


# 78bc1b64 14-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2
# ceb744eb 13-Mar-2024 Harald van Dijk <harald@gigawatt.nl>

[AMDGPU] Fix canonicalization of truncated values. (#83054)

We were relying on roundings to implicitly canonicalize, which is
generally safe, except with roundings that may be optimized away.

Fi

[AMDGPU] Fix canonicalization of truncated values. (#83054)

We were relying on roundings to implicitly canonicalize, which is
generally safe, except with roundings that may be optimized away.

Fixes #82937.

show more ...


Revision tags: llvmorg-18.1.1
# 3e35ba53 28-Feb-2024 Petar Avramovic <Petar.Avramovic@amd.com>

AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (#82996)

Insert waitcnts for loads and atomics before stores with system scope.
Scope is field in instruction encoding and corresponds to

AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (#82996)

Insert waitcnts for loads and atomics before stores with system scope.
Scope is field in instruction encoding and corresponds to desired
coherence level in cache hierarchy.
Intrinsic stores can set scope in cache policy operand.
If volatile keyword is used on generic stores memory legalizer will set
scope to system. Generic stores, by default, get lowest scope level.
Waitcnts are not required if it is guaranteed that memory is cached.
For example vulkan shaders can guarantee this.
TODO: implement flag for frontends to give us a hint not to insert
waits.
Expecting vulkan flag to be implemented as vulkan:private MMRA.

show more ...


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 0a3a0ea5 18-Jan-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Update uses of new VOP2 pseudos for GFX12 (#78155)

New pseudos were added for instructions that were natively VOP3 on
GFX11: V_ADD_F64_pseudo, V_MUL_F64_pseudo, V_MIN_NUM_F64, V_MAX_NUM_F6

[AMDGPU] Update uses of new VOP2 pseudos for GFX12 (#78155)

New pseudos were added for instructions that were natively VOP3 on
GFX11: V_ADD_F64_pseudo, V_MUL_F64_pseudo, V_MIN_NUM_F64, V_MAX_NUM_F64,
V_LSHLREV_B64_pseudo

---------

Co-authored-by: Mirko Brkusanin <Mirko.Brkusanin@amd.com>

show more ...


# ba52f06f 18-Jan-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)

Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
sep

[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)

Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
separate LOADCNT, SAMPLECNT and BVHCNT counters.

show more ...


# 9e9907f1 17-Jan-2024 Fangrui Song <i@maskray.me>

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.

This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:

```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```

show more ...


# a278ac57 15-Dec-2023 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] CodeGen for SMEM instructions (#75579)


# fac093dd 13-Dec-2023 Piotr Sobczak <piotr.sobczak@amd.com>

[AMDGPU] Update IEEE and DX10_CLAMP for GFX12 (#75030)

Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5
# a4196666 13-Nov-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Revert "Preliminary patch for divergence driven instruction selection. Operands Folding 1." (#71710)

This reverts commit 201f892b3b597f24287ab6a712a286e25a45a7d9.


Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 7fa7a08f 19-Jul-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Insert s_nop before s_sendmsg sendmsg(MSG_DEALLOC_VGPRS)

Differential Revision: https://reviews.llvm.org/D155681


# 2449931b 20-Jun-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Don't use old form of fneg in some tests


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 177ff42d 29-Nov-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some fp op tests to opaque issues

fmax_legacy.ll had one test that produced "ptraddrspace(1)", since
somehow "i1addrspace(1)*" used to parse.


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 12fd00ee 23-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add patterns for GFX11 v_minmax and v_maxmin instructions

Differential Revision: https://reviews.llvm.org/D128445


# d1af09ad 23-Jun-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] gfx11 Generate VOPD Instructions

We form VOPD instructions in the GCNCreateVOPD pass by combining
back-to-back component instructions. There are strict register
constraints for creating a

[AMDGPU] gfx11 Generate VOPD Instructions

We form VOPD instructions in the GCNCreateVOPD pass by combining
back-to-back component instructions. There are strict register
constraints for creating a legal VOPD, namely that the matching operands
(e.g. src0x and src0y, src1x and src1y) must be in different register
banks. We add a PostRA scheduler
mutation to put possible VOPD components back-to-back.

Depends on D128442, D128270

Reviewed By: #amdgpu, rampitec

Differential Revision: https://reviews.llvm.org/D128656

show more ...


# 0f94d2b3 30-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] GFX11: automatically release VGPRs at the end of the shader

GFX11 has a new message type MSG_DEALLOC_VGPRS which can be used to
release a shader's VGPRs. Sending this at the end of a shader

[AMDGPU] GFX11: automatically release VGPRs at the end of the shader

GFX11 has a new message type MSG_DEALLOC_VGPRS which can be used to
release a shader's VGPRs. Sending this at the end of a shader (just
before the s_endpgm) can help overall system performance in cases where
the s_endpgm would have to wait for outstanding VMEM stores to complete
before releasing the VGPRs.

Differential Revision: https://reviews.llvm.org/D128442

show more ...


Revision tags: llvmorg-14.0.6
# cfb7ffde 21-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] New AMDGPUInsertDelayAlu pass

Differential Revision: https://reviews.llvm.org/D128270


# 6540d032 28-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Generate checks for clamp.ll and add GFX11

Differential Revision: https://reviews.llvm.org/D128802


# 7758f3aa 29-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove FIXMEs that were resolved by D30134


12