#
83ba349a |
| 14-Feb-2023 |
Sanjay Patel <spatel@rotateright.com> |
[InstSimplify] fix/improve folding with an SNaN operand
There are 2 issues here:
1. In the default LLVM FP environment (regular FP math instructions), SNaN is some flavor of "don't care" which we will nail down in D143074, so this is just a quality-of-implementation improvement for default FP.
2. In the constrained FP environment (constrained intrinsics), SNaN must not propagate through a math operation; it has to be quieted according to IEEE-754 spec. That is independent of exception handling mode, so the current behavior is a miscompile.
Differential Revision: https://reviews.llvm.org/D143505
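To make "quieted" concrete, here is a minimal sketch on raw binary32 bit patterns. It is not the InstSimplify code from this commit. It assumes the common encoding in which the most significant fraction bit is the quiet bit, and every name in it (`kQuietBit`, `isNaN`, `isSignaling`, `foldNaNOperand`) is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// IEEE-754 binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
constexpr std::uint32_t kExpMask  = 0x7F800000u;
constexpr std::uint32_t kFracMask = 0x007FFFFFu;
// Assumption: the top fraction bit is the "quiet" bit (the common convention).
constexpr std::uint32_t kQuietBit = 0x00400000u;

// A NaN has an all-ones exponent and a non-zero fraction.
constexpr bool isNaN(std::uint32_t bits) {
  return (bits & kExpMask) == kExpMask && (bits & kFracMask) != 0;
}

// A signaling NaN is a NaN whose quiet bit is clear.
constexpr bool isSignaling(std::uint32_t bits) {
  return isNaN(bits) && (bits & kQuietBit) == 0;
}

// Hypothetical fold result for a math operation with a NaN operand:
// keep the sign and payload, but always return a quiet NaN.
inline std::uint32_t foldNaNOperand(std::uint32_t nanBits) {
  assert(isNaN(nanBits) && "caller must pass a NaN bit pattern");
  return nanBits | kQuietBit;
}
```

Under these assumptions, `foldNaNOperand(0x7FA00000u)` takes a signaling NaN to the quiet NaN `0x7FE00000u`, whereas returning the operand unchanged would let the SNaN propagate.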
|
#
caa99a01 |
| 22-Jan-2023 |
Kazu Hirata <kazu@google.com> |
Use llvm::popcount instead of llvm::countPopulation(NFC)
|
#
821c7be8 |
| 22-Dec-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Simplify simplifyAMDGCNMemoryIntrinsicDemanded. NFC.
|
#
20cde154 |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
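As an illustration of the kind of change this migration involves, the sketch below uses only the standard `std::optional` API. The helper functions (`exactLog2`, `log2OrDefault`, `mustLog2`) are hypothetical and are not taken from the patch. The comments note the `llvm::Optional` spellings named in the linked thread.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: returns log2(V) when V is a power of two.
std::optional<unsigned> exactLog2(unsigned V) {
  if (V == 0 || (V & (V - 1)) != 0)
    return std::nullopt; // previously spelled `return None;`
  unsigned Log = 0;
  while ((V >>= 1) != 0)
    ++Log;
  return Log;
}

// value_or() is the std::optional counterpart of getValueOr().
unsigned log2OrDefault(unsigned V, unsigned Default) {
  return exactLog2(V).value_or(Default);
}

// has_value() and operator* replace hasValue() and getValue().
unsigned mustLog2(unsigned V) {
  std::optional<unsigned> R = exactLog2(V);
  assert(R.has_value() && "V must be a power of two");
  return *R;
}
```

Only the spelling of the empty state and of the accessors changes, which is why a replacement like this can be applied mechanically.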
|
#
86fe4dfd |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
Recommit: added missing "#include <cstdint>".
|
#
4e12d183 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
Revert "TargetTransformInfo: convert Optional to std::optional"
This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.
Some buildbots are failing.
|
#
b8371124 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
a58541f1 |
| 26-Jun-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fold llvm.amdgcn.sqrt(undef)
|
#
3e4280c0 |
| 11-Nov-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Disable some class simplifications for strictfp
|
#
8ea3cf4b |
| 10-Nov-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use generic is.fpclass enum instead of locally defined copy
The generic intrinsic uses the same bitlayout as the amdgcn intrinsic, so re-use the enum.
|
#
445a483b |
| 13-Jun-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add new GFX11 intrinsic llvm.amdgcn.exp.row
Differential Revision: https://reviews.llvm.org/D127671
|
#
bfcfd53b |
| 13-Jun-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add GFX11 llvm.amdgcn.permlane64 intrinsic
Compared to permlane16, permlane64 has no BC input because it has no boundary conditions, no fi input because the instruction acts as if FI were always enabled, and no OLD input because it always writes to every active lane.
Also use the new intrinsic in the atomic optimizer pass.
Differential Revision: https://reviews.llvm.org/D127662
|
#
2417de27 |
| 25-Apr-2022 |
Mariusz Sikora <mariusz.sikora@amd.com> |
[AMDGPU] Use d16 flag for image.sample instructions
The image.sample instruction can be forced to return a half type instead of float when the d16 flag is enabled.
This patch adds a new pattern in InstCombine to detect whether the output of image.sample is later used only by an fptrunc that converts the type from float to half. If the pattern is detected, the fptrunc and image.sample are combined into a single image.sample that returns the half type. Later, during lowering, the d16 flag is added to the image sample intrinsic.
Differential Revision: https://reviews.llvm.org/D124232
|
#
c6afbdb5 |
| 25-Apr-2022 |
Piotr Sobczak <Piotr.Sobczak@amd.com> |
Revert "[AMDGPU] Use d16 flag for image.sample instructions"
This reverts commit d1762fc454c0d7ee0bcffe87e798f67b6c43c1d2.
Reverting D124232 as the buildbot reported some errors in sanitizers.
|
#
d1762fc4 |
| 25-Apr-2022 |
Mariusz Sikora <mariusz.sikora@amd.com> |
[AMDGPU] Use d16 flag for image.sample instructions
The image.sample instruction can be forced to return a half type instead of float when the d16 flag is enabled.
This patch adds a new pattern in InstCombine to detect whether the output of image.sample is later used only by an fptrunc that converts the type from float to half. If the pattern is detected, the fptrunc and image.sample are combined into a single image.sample that returns the half type. Later, during lowering, the d16 flag is added to the image sample intrinsic.
Differential Revision: https://reviews.llvm.org/D124232
|
#
4ed7c6ee |
| 24-Jan-2022 |
Sebastian Neubauer <Sebastian.Neubauer@amd.com> |
[AMDGPU] Only match correct type for a16
Addresses are floats when a sampler is present and unsigned integers when no sampler is present.
Therefore, only zext instructions, not sext instructions, should match.
Also match integer constants that can be truncated.
Differential Revision: https://reviews.llvm.org/D118043
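The distinction between zext and sext can be sketched with plain integers. The actual combine operates on LLVM IR values rather than C++ integers, and all function names here (`zext16`, `sext16`, `fitsInUnsigned16`, `narrowCoord`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension keeps the unsigned value of the 16-bit input.
constexpr std::uint32_t zext16(std::uint16_t V) { return V; }

// Sign extension copies bit 15 into the upper half, so inputs >= 0x8000
// become different unsigned 32-bit values than under zext.
constexpr std::uint32_t sext16(std::uint16_t V) {
  return static_cast<std::uint32_t>(
      static_cast<std::int32_t>(static_cast<std::int16_t>(V)));
}

// A 32-bit unsigned constant can be truncated losslessly to 16 bits
// only if its upper 16 bits are zero.
constexpr bool fitsInUnsigned16(std::uint32_t C) { return C <= 0xFFFFu; }

// Narrow an unsigned coordinate that is known to fit in 16 bits.
inline std::uint16_t narrowCoord(std::uint32_t C) {
  assert(fitsInUnsigned16(C) && "coordinate would lose bits");
  return static_cast<std::uint16_t>(C);
}
```

For the input `0x8000`, `zext16` yields `0x00008000` while `sext16` yields `0xFFFF8000`. Only the former is the same unsigned coordinate, which is consistent with matching zext but not sext for unsigned addresses.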
|
#
80532ebb |
| 24-Jan-2022 |
Sebastian Neubauer <Sebastian.Neubauer@amd.com> |
[AMDGPU][InstCombine] Remove zero image offset
Remove the offset parameter if it is zero.
Differential Revision: https://reviews.llvm.org/D117876
|
#
603d1803 |
| 21-Dec-2021 |
Sebastian Neubauer <Sebastian.Neubauer@amd.com> |
[AMDGPU][InstCombine] Remove zero LOD bias
If the bias is zero, we can remove it from the image instruction. Also copy other image optimizations (l->lz, mip->nomip) to IR combines.
Differential Revision: https://reviews.llvm.org/D116042
|
#
0530fdbb |
| 20-Dec-2021 |
Sebastian Neubauer <Sebastian.Neubauer@amd.com> |
[AMDGPU] Fix LOD bias in A16 combine
As with the codegen fix in D111754, the LOD bias needs to be converted to 16 bits. Fix this in the combine.
Differential Revision: https://reviews.llvm.org/D116038
|
#
45f16eab |
| 13-Dec-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Combine is.shared/is.private of null/undef
|
#
f631173d |
| 30-Sep-2021 |
Kazu Hirata <kazu@google.com> |
[llvm] Migrate from arg_operands to args (NFC)
Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.
|
#
dc6e8dfd |
| 20-Sep-2021 |
Jacob Lambert <jacob.lambert@amd.com> |
[AMDGPU][NFC] Correct typos in lib/Target/AMDGPU/AMDGPU*.cpp files. Test commit for new contributor.
|
#
48958d02 |
| 23-Aug-2021 |
Daniil Fukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce includes dependencies.
1. Split out some parts of the R600 target into separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed an issue where the overrides `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had a different return type than the base class.
4. Minor forward declarations cleanup.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D108596
|
#
3f4d00bc |
| 18-Aug-2021 |
Arthur Eubanks <aeubanks@google.com> |
[NFC] More get/removeAttribute() cleanup
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|