History log of /llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp (Results 51 – 75 of 88)
Revision Date Author Comments
# 83ba349a 14-Feb-2023 Sanjay Patel <spatel@rotateright.com>

[InstSimplify] fix/improve folding with an SNaN operand

There are 2 issues here:

1. In the default LLVM FP environment (regular FP math instructions),
SNaN is some flavor of "don't care" which we will nail down in
D143074, so this is just a quality-of-implementation improvement
for default FP.
2. In the constrained FP environment (constrained intrinsics), SNaN
must not propagate through a math operation; it has to be quieted
according to IEEE-754 spec. That is independent of exception
handling mode, so the current behavior is a miscompile.

Differential Revision: https://reviews.llvm.org/D143505
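
To illustrate what "quieting" means in point 2 of the message above, here is a minimal sketch that works on raw IEEE-754 binary32 bit patterns. It is not the InstSimplify code from the patch. The helper names are hypothetical, and it assumes the common encoding in which a set most-significant fraction bit marks a quiet NaN.

```cpp
#include <cassert>
#include <cstdint>

// IEEE-754 binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
constexpr std::uint32_t kExponentMask = 0x7F800000u;
constexpr std::uint32_t kFractionMask = 0x007FFFFFu;
// Assumption: the top fraction bit is the "quiet" bit (true on most targets).
constexpr std::uint32_t kQuietBit = 0x00400000u;

// A NaN has an all-ones exponent and a non-zero fraction.
constexpr bool isNaNBits(std::uint32_t Bits) {
  return (Bits & kExponentMask) == kExponentMask &&
         (Bits & kFractionMask) != 0;
}

// A signaling NaN is a NaN whose quiet bit is clear.
constexpr bool isSignalingNaNBits(std::uint32_t Bits) {
  return isNaNBits(Bits) && (Bits & kQuietBit) == 0;
}

// Hypothetical helper: returns the quiet form of a NaN, keeping its payload.
inline std::uint32_t quietNaNBits(std::uint32_t Bits) {
  assert(isNaNBits(Bits) && "expected a NaN encoding");
  return Bits | kQuietBit;
}
```

Under these assumptions, the signaling pattern `0x7F800001` becomes `0x7FC00001`; the sign bit and payload are unchanged, and only the quiet bit is set.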


# caa99a01 22-Jan-2023 Kazu Hirata <kazu@google.com>

Use llvm::popcount instead of llvm::countPopulation(NFC)


# 821c7be8 22-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Simplify simplifyAMDGCNMemoryIntrinsicDemanded. NFC.


# 20cde154 03-Dec-2022 Kazu Hirata <kazu@google.com>

[Target] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
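
As a rough illustration of the kind of mechanical change this commit describes, the sketch below returns `std::nullopt` where older LLVM code would have returned `None`. The function is hypothetical and is not taken from the patch; it uses only the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: index of the first negative element, if there is one.
inline std::optional<std::size_t>
firstNegativeIndex(const std::vector<int> &Values) {
  for (std::size_t I = 0; I < Values.size(); ++I)
    if (Values[I] < 0)
      return I;
  // With llvm::Optional this line would have read `return None;`.
  return std::nullopt;
}

// Small usage check for the helper above.
inline void checkFirstNegativeIndex() {
  assert(!firstNegativeIndex({1, 2, 3}).has_value());
  assert(firstNegativeIndex({1, -2, 3}).value() == 1);
}
```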


# 86fe4dfd 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

TargetTransformInfo: convert Optional to std::optional

Recommit: added missing "#include <cstdint>".


# 4e12d183 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

Revert "TargetTransformInfo: convert Optional to std::optional"

This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.

Some buildbots are failing.


# b8371124 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

TargetTransformInfo: convert Optional to std::optional


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# a58541f1 26-Jun-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fold llvm.amdgcn.sqrt(undef)


# 3e4280c0 11-Nov-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Disable some class simplifications for strictfp


# 8ea3cf4b 10-Nov-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use generic is.fpclass enum instead of locally defined copy

The generic intrinsic uses the same bitlayout as the amdgcn intrinsic,
so re-use the enum.


# 445a483b 13-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add new GFX11 intrinsic llvm.amdgcn.exp.row

Differential Revision: https://reviews.llvm.org/D127671


# bfcfd53b 13-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add GFX11 llvm.amdgcn.permlane64 intrinsic

Compared to permlane16, permlane64 has no BC input because it has no
boundary conditions, no fi input because the instruction acts as if FI
were always enabled, and no OLD input because it always writes to every
active lane.

Also use the new intrinsic in the atomic optimizer pass.

Differential Revision: https://reviews.llvm.org/D127662


# 2417de27 25-Apr-2022 Mariusz Sikora <mariusz.sikora@amd.com>

[AMDGPU] Use d16 flag for image.sample instructions

The image.sample instruction can be forced to return a half type instead of
float when the d16 flag is enabled.

This patch adds a new pattern in InstCombine to detect when the output of
image.sample is used only by an fptrunc that converts the type
from float to half. If the pattern is detected, the fptrunc and image.sample
are combined into a single image.sample that returns the half type.
Later, during lowering, the d16 flag is added to the image sample intrinsic.

Differential Revision: https://reviews.llvm.org/D124232


# c6afbdb5 25-Apr-2022 Piotr Sobczak <Piotr.Sobczak@amd.com>

Revert "[AMDGPU] Use d16 flag for image.sample instructions"

This reverts commit d1762fc454c0d7ee0bcffe87e798f67b6c43c1d2.

Reverting D124232 as the buildbot reported some errors in sanitizers.


# d1762fc4 25-Apr-2022 Mariusz Sikora <mariusz.sikora@amd.com>

[AMDGPU] Use d16 flag for image.sample instructions

The image.sample instruction can be forced to return a half type instead of
float when the d16 flag is enabled.

This patch adds a new pattern in InstCombine to detect when the output of
image.sample is used only by an fptrunc that converts the type
from float to half. If the pattern is detected, the fptrunc and image.sample
are combined into a single image.sample that returns the half type.
Later, during lowering, the d16 flag is added to the image sample intrinsic.

Differential Revision: https://reviews.llvm.org/D124232


# 4ed7c6ee 24-Jan-2022 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU] Only match correct type for a16

Addresses are floats when a sampler is present and unsigned integers
when no sampler is present.

Therefore, only zext instructions, not sext instructions should match.

Also match integer constants that can be truncated.

Differential Revision: https://reviews.llvm.org/D118043
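
The reasoning about zext, sext and truncatable constants can be sketched with plain integers. This is only a model of the arithmetic, not the combine itself, and all helper names are hypothetical. It treats a sampler-less address as an unsigned 32-bit value and asks whether it survives narrowing to 16 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if V can be truncated to 16 bits and
// zero-extended back without changing its unsigned value.
constexpr bool fitsInUnsigned16(std::uint32_t V) {
  return static_cast<std::uint32_t>(static_cast<std::uint16_t>(V)) == V;
}

// Models `zext i16 -> i32`.
constexpr std::uint32_t zext16(std::uint16_t V) { return V; }

// Models `sext i16 -> i32`, with the result reinterpreted as unsigned.
constexpr std::uint32_t sext16(std::int16_t V) {
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(V));
}

// Small usage check for the helpers above.
inline void checkA16Narrowing() {
  // Every zero-extended 16-bit value can be narrowed again.
  assert(fitsInUnsigned16(zext16(0xFFFFu)));
  // A sign-extended negative value becomes a large unsigned address.
  assert(!fitsInUnsigned16(sext16(-1)));
}
```

In this model a constant such as 65535 passes the check and 65536 does not, which corresponds to the "integer constants that can be truncated" case in the message.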


# 80532ebb 24-Jan-2022 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU][InstCombine] Remove zero image offset

Remove the offset parameter if it is zero.

Differential Revision: https://reviews.llvm.org/D117876


# 603d1803 21-Dec-2021 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU][InstCombine] Remove zero LOD bias

If the bias is zero, we can remove it from the image instruction.
Also copy other image optimizations (l->lz, mip->nomip) to IR combines.

Differential Revision: https://reviews.llvm.org/D116042


# 0530fdbb 20-Dec-2021 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU] Fix LOD bias in A16 combine

As in the codegen fix in D111754, the LOD bias needs to be converted to 16
bits. Fix this in the combine.

Differential Revision: https://reviews.llvm.org/D116038


# 45f16eab 13-Dec-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Combine is.shared/is.private of null/undef


# f631173d 30-Sep-2021 Kazu Hirata <kazu@google.com>

[llvm] Migrate from arg_operands to args (NFC)

Note that arg_operands is considered a legacy name. See
llvm/include/llvm/IR/InstrTypes.h for details.


# dc6e8dfd 20-Sep-2021 Jacob Lambert <jacob.lambert@amd.com>

[AMDGPU][NFC] Correct typos in lib/Target/AMDGPU/AMDGPU*.cpp files. Test commit for new contributor.


# 48958d02 23-Aug-2021 Daniil Fukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Reduce includes dependencies.

1. Split out some parts of the R600 target into separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed an issue where the overrides `GCNTargetMachine::getSubtargetImpl()`
and `R600TargetMachine::getSubtargetImpl()` had a different return value type
than the base class.
4. Minor forward declarations cleanup.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D108596


# 3f4d00bc 18-Aug-2021 Arthur Eubanks <aeubanks@google.com>

[NFC] More get/removeAttribute() cleanup


# 560d7e04 20-Jan-2021 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets

... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036

