History log of /llvm-project/llvm/test/CodeGen/AMDGPU/fneg-modifier-casting.ll (Results 1 – 21 of 21)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 6548b635 09-Nov-2024 Shilei Tian <i@tianshilei.me>

Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.


# ca33649a 08-Nov-2024 Shilei Tian <i@tianshilei.me>

Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
hip and openmp buildbots.


# e215a1e2 08-Nov-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)


Revision tags: llvmorg-19.1.3
# 3277c7cd 21-Oct-2024 Stanislav Mekhanoshin <rampitec@users.noreply.github.com>

[AMDGPU] Skip VGPR deallocation for waveslot limited kernels (#112765)

MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's
been identified this message is only really needed for

[AMDGPU] Skip VGPR deallocation for waveslot limited kernels (#112765)

MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's
been identified this message is only really needed for VGPR limited
kernels. A kernel becomes VGPR limited if a total number of VGPRs per
SIMD / number of used VGPRs is more than a number of wave slots.

show more ...


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4
# 4c4908cd 27-Aug-2024 Alex MacLean <amaclean@nvidia.com>

[AMDGPU] adjust tests to prevent fpclass bitcast folding (#106268)

Make some minor tweaks to AMDGPU tests to ensure they still work as
intended after https://github.com/llvm/llvm-project/pull/97762

[AMDGPU] adjust tests to prevent fpclass bitcast folding (#106268)

Make some minor tweaks to AMDGPU tests to ensure they still work as
intended after https://github.com/llvm/llvm-project/pull/97762. These
tests can be radically simplified after bitcast aware fpclass deduction.

show more ...


Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# b1bcb7ca 15-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.

show more ...


# adaff46d 15-Jul-2024 dyung <douglas.yung@sony.com>

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614

These bots have been broken for a day, so reverting to get everything
back to green.

show more ...


# 78bc1b64 14-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4
# cc13f3ba 21-Feb-2024 David Majnemer <david.majnemer@gmail.com>

Correctly round FP -> BF16 when SDAG expands such nodes (#82399)

We did something pretty naive:
- round FP64 -> BF16 by first rounding to FP32
- skip FP32 -> BF16 rounding entirely
- taking the t

Correctly round FP -> BF16 when SDAG expands such nodes (#82399)

We did something pretty naive:
- round FP64 -> BF16 by first rounding to FP32
- skip FP32 -> BF16 rounding entirely
- taking the top 16 bits of a FP32 which will turn some NaNs into
infinities

Let's do this in a more principled way by rounding types with more
precision than FP32 to FP32 using round-inexact-to-odd which will negate
double rounding issues.

show more ...


Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 460ffcdd 04-Jan-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Make bf16/v2bf16 legal types (#76215)

There are some intrinsics are using i16 vectors in place of bfloat
vectors.
Move towards making bf16 vectors legal so these can migrate. Leave the
la

AMDGPU: Make bf16/v2bf16 legal types (#76215)

There are some intrinsics are using i16 vectors in place of bfloat
vectors.
Move towards making bf16 vectors legal so these can migrate. Leave the
larger vectors for a later change.

Depends #76213 #76214

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 7fa7a08f 19-Jul-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Insert s_nop before s_sendmsg sendmsg(MSG_DEALLOC_VGPRS)

Differential Revision: https://reviews.llvm.org/D155681


# f7684d85 11-Jul-2023 Jay Foad <jay.foad@amd.com>

[DAG] Use legal shift amount type in DAGTypeLegalizer::JoinIntegers

Documentation for TargetLowering::getShiftAmountTy says that LegalTypes
should generally be true during type legalization, so this

[DAG] Use legal shift amount type in DAGTypeLegalizer::JoinIntegers

Documentation for TargetLowering::getShiftAmountTy says that LegalTypes
should generally be true during type legalization, so this patch does
that.

On AMDGPU the effect is that we use i32 (a sane type) instead of i64
(pointer sized type) for more shift amounts, which in turn allows more
formation of rotates and funnel shifts pre-legalization.

Differential Revision: https://reviews.llvm.org/D154960

show more ...


# f2c164c8 21-Jun-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Do not wait for vscnt on function entry and return

SIInsertWaitcnts inserts waitcnt instructions to resolve data
dependencies. The GFX10+ vscnt (VMEM store count) counter is never used
in t

[AMDGPU] Do not wait for vscnt on function entry and return

SIInsertWaitcnts inserts waitcnt instructions to resolve data
dependencies. The GFX10+ vscnt (VMEM store count) counter is never used
in this way. It is only used to resolve memory dependencies, and that is
handled by SIMemoryLegalizer. Hence there is no need to conservatively
wait for vscnt to be 0 on function entry and before returns.

Differential Revision: https://reviews.llvm.org/D153537

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4
# ac2d6df2 08-May-2023 Jeffrey Byrnes <Jeffrey.Byrnes@amd.com>

[AMDGPU] Add basic support for extended i8 perm matching

Differential Revision: https://reviews.llvm.org/D142782

Change-Id: Ibb95224f7885839e8b77a705f487f10b47a258a6


Revision tags: llvmorg-16.0.3
# 2fce50e8 20-Apr-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix assertion with multiple uses of f64 fneg of select

A bitcast needs to be inserted back to the original type. Just
skip the multiple use case for a safer quick fix. Handling
the multiple

AMDGPU: Fix assertion with multiple uses of f64 fneg of select

A bitcast needs to be inserted back to the original type. Just
skip the multiple use case for a safer quick fix. Handling
the multiple use case seems to be beneficial in some but not
all cases.

show more ...


Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1
# f608ac62 26-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Push fneg into bitcast of integer select

Avoids some regressions in the math libraries in a future
patch.


# 0f59720e 26-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fold fneg into bitcast of build_vector

The math libraries have a lot of code that performs
manual sign bit operations by bitcasting doubles to int2
and doing bithacking on them. This is a ba

AMDGPU: Fold fneg into bitcast of build_vector

The math libraries have a lot of code that performs
manual sign bit operations by bitcasting doubles to int2
and doing bithacking on them. This is a bad canonical form
we should rewrite to use high level sign operations directly
on double. To avoid codegen regressions, we need to do a better
job moving fnegs to operate only on the high 32-bits.

This is only halfway to fixing the real case.

show more ...


Revision tags: llvmorg-17-init
# bd1f7c41 23-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Try to push fneg as integer into select

I initially attempted to select the source modifier from xor of
a sign mask. This proved to be more difficult since
foldBinOpIntoSelect does not consi

AMDGPU: Try to push fneg as integer into select

I initially attempted to select the source modifier from xor of
a sign mask. This proved to be more difficult since
foldBinOpIntoSelect does not consider free fneg of integers
and undoes the combine.

show more ...


# f3c008ca 23-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

DAG: Relax foldBitcastedFPLogic conditions

Requiring a bitcast to exist was unhelpful. The most basic cases
are always going to be a CopyFromReg or load, so they would need
a new cast inserted. Don'

DAG: Relax foldBitcastedFPLogic conditions

Requiring a bitcast to exist was unhelpful. The most basic cases
are always going to be a CopyFromReg or load, so they would need
a new cast inserted. Don't require a bitcast if it's a free
operation. I don't think this logic makes particularly much sense
(it seems to be imparting special interpretation of bitcast), but
this needs to be in sync with foldSignChangeInBitcast.

We should also get rid of this hasBitPreservingFPLogic hook. fabs/fneg
are bitpreserving or incorrectly implemented, so this should just be a
regular legality check.

show more ...


# a7fad92b 27-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add more tests to fneg modifier with casting tests


Revision tags: llvmorg-15.0.7
# 8e6406c2 13-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add fneg and select test