History log of /llvm-project/llvm/test/CodeGen/AMDGPU/simplify-libcalls.ll (Results 1 – 25 of 47)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6
# 7dbd6cd2 11-Dec-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU][Attributor] Make `AAAMDFlatWorkGroupSize` honor existing attribute (#114357)

If a function has `amdgpu-flat-work-group-size`, honor it in `initialize` by
taking its value directly; otherwise, it uses the default range as a starting
point. We will no longer manipulate the known range, which can cause issues
because the known range is a "throttle" to the assumed range such that the
assumed range can't get widened properly in `updateImpl` if the known range is
not set properly for whatever reasons. Another benefit of not touching the known
range is, if we indicate pessimistic state, it also invalidates the AA such that
`manifest` will not be called. Since we honor the attribute, we don't want and
will not add any half-baked attribute added to a function.


Revision tags: llvmorg-19.1.5, llvmorg-19.1.4
# 38fffa63 06-Nov-2024 Paul Walker <paul.walker@arm.com>

[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# 492484e6 09-Aug-2024 Shilei Tian <i@tianshilei.me>

Revert "[AMDGPU] Move `AMDGPUAttributorPass` to full LTO post link stage (#102086)"

This reverts commit 2fe61a5acf272d6826352ef72f47196b01003fc5.


# 2fe61a5a 09-Aug-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Move `AMDGPUAttributorPass` to full LTO post link stage (#102086)

Currently `AMDGPUAttributorPass` is registered in default optimizer pipeline.
This will allow the pass to run in default pipeline as well as at thinLTO post
link stage. However, it will not run in full LTO post link stage. This patch
moves it to full LTO.


Revision tags: llvmorg-19.1.0-rc2
# b455edbc 31-Jul-2024 Yingwei Zheng <dtcxzyw2333@gmail.com>

[InstCombine] Recognize copysign idioms (#101324)

This patch folds `(bitcast (or (and (bitcast X to int), signmask), nneg
Y) to fp)` into `copysign((bitcast Y to fp), X)`. I found this pattern
exists in some graphics applications/math libraries.

Alive2: https://alive2.llvm.org/ce/z/ggQZV2
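
The fold described above can be checked numerically outside of LLVM. The following is a minimal Python sketch, not the InstCombine implementation: it models `float` (f32) values with `struct`, and the helper names `f32_to_bits`, `bits_to_f32`, `copysign_idiom`, and `copysign_folded` are hypothetical, introduced only for illustration. `nneg Y` is modelled as an integer whose sign bit is clear.

```python
import math
import struct

SIGN_MASK = 0x80000000  # sign bit of a 32-bit float


def f32_to_bits(x):
    """Bitcast an f32 value to its 32-bit integer pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b):
    """Bitcast a 32-bit integer pattern to an f32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def copysign_idiom(x, y_bits):
    """bitcast (or (and (bitcast X to int), signmask), nneg Y) to fp."""
    assert 0 <= y_bits < SIGN_MASK, "Y must be non-negative (sign bit clear)"
    return bits_to_f32((f32_to_bits(x) & SIGN_MASK) | y_bits)


def copysign_folded(x, y_bits):
    """copysign((bitcast Y to fp), X)."""
    return math.copysign(bits_to_f32(y_bits), x)


if __name__ == "__main__":
    for x in (-1.5, 2.0, -0.0, 0.0):
        for y_bits in (0x00000000, 0x3F800000, 0x40490FDB):
            a = f32_to_bits(copysign_idiom(x, y_bits))
            b = f32_to_bits(copysign_folded(x, y_bits))
            print(f"x={x!r:5} y_bits={y_bits:#010x} idiom={a:#010x} folded={b:#010x}")
```

Because `Y` has a clear sign bit, the `or` can only set the sign taken from `X`, which is what `copysign` does. The Alive2 link above is the authoritative proof; this sketch only spot-checks a few values.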


Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init
# b1bcb7ca 15-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.


# adaff46d 15-Jul-2024 dyung <douglas.yung@sony.com>

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614

These bots have been broken for a day, so reverting to get everything
back to green.


# 78bc1b64 14-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.


# bff619f9 01-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

Revert "AMDGPU: Use real copysign in fast pow (#97152)"

This reverts commit d3e7c4ce7a3d7f08cea02cba8f34c590a349688b.


# d3e7c4ce 01-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use real copysign in fast pow (#97152)

Previously this would introduce some codegen regressions, but
those have been avoided by simplifying demanded bits on copysign
operations.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7
# dab1f7c8 21-May-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Emit 1/llvm.sqrt(x) instead of rsqrt calls in libcall handling (#92863)

With the contract flag we should end up codegening to the rsqrt
instruction, or denormal corrected rsqrt sequence present in the
library.


# 66b76faf 21-May-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Directly emit sqrt intrinsic when folding rootn(x, 2) (#92598)

This avoids depending on pre/post link runs.

Depends #92595


Revision tags: llvmorg-18.1.6
# 3aae916f 14-May-2024 Yingwei Zheng <dtcxzyw2333@gmail.com>

Reland "[ValueTracking] Compute knownbits from known fp classes" (#92084)

This patch relands https://github.com/llvm/llvm-project/pull/86409.

I mistakenly thought that `Known.makeNegative()` clears the sign bit of
`Known.Zero`. This patch fixes the assertion failure by explicitly
clearing the sign bit.


# 2e165a2c 14-May-2024 Martin Storsjö <martin@martin.st>

Revert "[ValueTracking] Compute knownbits from known fp classes (#86409)"

This reverts commit d03a1a6e5838c7c2c0836d71507dfdf7840ade49.

This change caused failed assertions, see
https://github.com/llvm/llvm-project/pull/86409#issuecomment-2109469845
for details.


# d03a1a6e 13-May-2024 Yingwei Zheng <dtcxzyw2333@gmail.com>

[ValueTracking] Compute knownbits from known fp classes (#86409)

This patch calculates knownbits from fp instructions/dominating fcmp
conditions. It will enable more optimizations with signbit idioms.


Revision tags: llvmorg-18.1.5
# bc3620d3 17-Apr-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move libcall simplify into PeepholeEP (#88853)

We were running this immediately on the incoming IR, which
is still littered with temporary allocas obscuring trivial values.
This needs to run after initial SROA to handle sincos insertion.


Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# daecc303 09-Jan-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Replace sqrt OpenCL libcalls with llvm.sqrt (#74197)

The library implementation is just a wrapper around a call to the
intrinsic, but loses metadata. Swap out the call site to the intrinsic
so that the lowering can see the !fpmath metadata and fast math flags.

Since d56e0d07cc5ee8e334fd1ad403eef0b1a771384f, clang started placing
!fpmath on OpenCL library sqrt calls. Also don't bother emitting
native_sqrt anymore, it's just another wrapper around llvm.sqrt.


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4
# deefda70 26-Aug-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use exp2 and log2 intrinsics directly for f16/f32

These codegen correctly but f64 doesn't. This prevents losing fast
math flags on the way to the underlying intrinsic.

https://reviews.llvm.org/D158997


# 80e5b46e 26-Aug-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix assertion on half typed pow with constant exponents

https://reviews.llvm.org/D158993


# 35c2a754 25-Aug-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix asserting on fast f16 pown

https://reviews.llvm.org/D158903


Revision tags: llvmorg-17.0.0-rc3
# d2517616 14-Aug-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Replace log libcalls with log intrinsics


# d45022b0 12-Aug-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove special case constant folding of divide

We should probably just swap this out for the fdiv, but that's what
the implementation is anyway.


# a70006c4 12-Aug-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Replace some libcalls with intrinsics

OpenCL loses fast math information by going through libcall wrappers
around intrinsics.

Do this to preserve call site flags which are lost when inlining. It's
not safe in general to propagate flags during inline, so avoid dealing
with this by just special casing some of the useful calls.


Revision tags: llvmorg-17.0.0-rc2
# f44beecb 31-Jul-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Try to use private version of sincos if available

The comment was out of date, the device libs build does provide all
the pointer overloads. An extremely pedantic interpretation of the
spec would suggest only the flat version exists, but the overloads do
exist in the implementation.

https://reviews.llvm.org/D156720


# 6dbd4581 30-Jul-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove pointless libcall optimization of fma/mad

After the library is linked and trivially inlined, the generic fma and
fmuladd intrinsics already handle these cases, and with precise flag
handling. This was requiring all fast math flags when we really just
need nsz for the fma(a, b, 0) case.

https://reviews.llvm.org/D156677
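
The remark above that the `fma(a, b, 0)` case only needs `nsz` can be illustrated with signed zeros. This is a rough Python sketch under stated assumptions, not LLVM code: `a * b + 0.0` stands in for `fma(a, b, 0.0)` (the two agree whenever the product is exact, as in the inputs used here), and the function names are hypothetical.

```python
import math


def fma_with_zero_addend(a, b):
    """Stand-in for fma(a, b, 0.0); exact for the small inputs used below."""
    return a * b + 0.0


def folded_to_multiply(a, b):
    """The simplified form a * b that the fold would produce."""
    return a * b


def is_negative_signed(x):
    """True if the sign bit of x is set (distinguishes -0.0 from 0.0)."""
    return math.copysign(1.0, x) < 0


if __name__ == "__main__":
    for a, b in ((2.0, 3.0), (-0.0, 1.0)):
        before = fma_with_zero_addend(a, b)
        after = folded_to_multiply(a, b)
        print(f"a={a!r} b={b!r} before={before!r} after={after!r}")
```

For ordinary values the two forms agree. When the product is `-0.0`, adding `+0.0` yields `+0.0` under round-to-nearest, so the results differ only in the sign of zero, which is the difference `nsz` permits ignoring.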

