AMDGPUInstCombineIntrinsic.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp

Revision	Date	Author	Comments
# 83ba349a	14-Feb-2023	Sanjay Patel <spatel@rotateright.com>	[InstSimplify] fix/improve folding with an SNaN operand There are 2 issues here: 1. In the default LLVM FP environment (regular FP math instructions), SNaN is some flavor of "don't care" which we will nail down in D143074, so this is just a quality-of-implementation improvement for default FP. 2. In the constrained FP environment (constrained intrinsics), SNaN must not propagate through a math operation; it has to be quieted according to IEEE-754 spec. That is independent of exception handling mode, so the current behavior is a miscompile. Differential Revision: https://reviews.llvm.org/D143505
# caa99a01	22-Jan-2023	Kazu Hirata <kazu@google.com>	Use llvm::popcount instead of llvm::countPopulation(NFC)
# 821c7be8	22-Dec-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Simplify simplifyAMDGCNMemoryIntrinsicDemanded. NFC.
# 20cde154	03-Dec-2022	Kazu Hirata <kazu@google.com>	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
# 86fe4dfd	02-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	TargetTransformInfo: convert Optional to std::optional Recommit: added missing "#include <cstdint>".
# 4e12d183	02-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	Revert "TargetTransformInfo: convert Optional to std::optional" This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85. Some buildbots are failing.
# b8371124	02-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	TargetTransformInfo: convert Optional to std::optional
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# a58541f1	26-Jun-2020	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fold llvm.amdgcn.sqrt(undef)
# 3e4280c0	11-Nov-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Disable some class simplifications for strictfp
# 8ea3cf4b	10-Nov-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Use generic is.fpclass enum instead of locally defined copy The generic intrinsic uses the same bitlayout as the amdgcn intrinsic, so re-use the enum.
# 445a483b	13-Jun-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Add new GFX11 intrinsic llvm.amdgcn.exp.row Differential Revision: https://reviews.llvm.org/D127671
# bfcfd53b	13-Jun-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Add GFX11 llvm.amdgcn.permlane64 intrinsic Compared to permlane16, permlane64 has no BC input because it has no boundary conditions, no fi input because the instruction acts as if FI were always enabled, and no OLD input because it always writes to every active lane. Also use the new intrinsic in the atomic optimizer pass. Differential Revision: https://reviews.llvm.org/D127662
# 2417de27	25-Apr-2022	Mariusz Sikora <mariusz.sikora@amd.com>	[AMDGPU] Use d16 flag for image.sample instructions Image.sample instruction can be forced to return half type instead of float when d16 flag is enabled. This patch adds new pattern in InstCombine to detect if output of image.sample is used later only by fptrunc which converts the type from float to half. If pattern is detected then fptrunc and image.sample are combined to single image.sample which is returning half type. Later in Lowering part d16 flag is added to image sample intrinsic. Differential Revision: https://reviews.llvm.org/D124232
# c6afbdb5	25-Apr-2022	Piotr Sobczak <Piotr.Sobczak@amd.com>	Revert "[AMDGPU] Use d16 flag for image.sample instructions" This reverts commit d1762fc454c0d7ee0bcffe87e798f67b6c43c1d2. Reverting D124232 as the buildbot reported some errors in sanitizers.
# d1762fc4	25-Apr-2022	Mariusz Sikora <mariusz.sikora@amd.com>	[AMDGPU] Use d16 flag for image.sample instructions Image.sample instruction can be forced to return half type instead of float when d16 flag is enabled. This patch adds new pattern in InstCombine to detect if output of image.sample is used later only by fptrunc which converts the type from float to half. If pattern is detected then fptrunc and image.sample are combined to single image.sample which is returning half type. Later in Lowering part d16 flag is added to image sample intrinsic. Differential Revision: https://reviews.llvm.org/D124232
# 4ed7c6ee	24-Jan-2022	Sebastian Neubauer <Sebastian.Neubauer@amd.com>	[AMDGPU] Only match correct type for a16 Addresses are floats when a sampler is present and unsigned integers when no sampler is present. Therefore, only zext instructions, not sext instructions should match. Also match integer constants that can be truncated. Differential Revision: https://reviews.llvm.org/D118043
# 80532ebb	24-Jan-2022	Sebastian Neubauer <Sebastian.Neubauer@amd.com>	[AMDGPU][InstCombine] Remove zero image offset Remove the offset parameter if it is zero. Differential Revision: https://reviews.llvm.org/D117876
# 603d1803	21-Dec-2021	Sebastian Neubauer <Sebastian.Neubauer@amd.com>	[AMDGPU][InstCombine] Remove zero LOD bias If the bias is zero, we can remove it from the image instruction. Also copy other image optimizations (l->lz, mip->nomip) to IR combines. Differential Revision: https://reviews.llvm.org/D116042
# 0530fdbb	20-Dec-2021	Sebastian Neubauer <Sebastian.Neubauer@amd.com>	[AMDGPU] Fix LOD bias in A16 combine As the codegen fix in D111754, the LOD bias needs to be converted to 16 bits. Fix this in the combine. Differential Revision: https://reviews.llvm.org/D116038
# 45f16eab	13-Dec-2021	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Combine is.shared/is.private of null/undef
# f631173d	30-Sep-2021	Kazu Hirata <kazu@google.com>	[llvm] Migrate from arg_operands to args (NFC) Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.
# dc6e8dfd	20-Sep-2021	Jacob Lambert <jacob.lambert@amd.com>	[AMDGPU][NFC] Correct typos in lib/Target/AMDGPU/AMDGPU*.cpp files. Test commit for new contributor.
# 48958d02	23-Aug-2021	Daniil Fukalov <daniil.fukalov@amd.com>	[NFC][AMDGPU] Reduce includes dependencies. 1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had different return value type than base class. 4. Minor forward declarations cleanup. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108596
# 3f4d00bc	18-Aug-2021	Arthur Eubanks <aeubanks@google.com>	[NFC] More get/removeAttribute() cleanup
# 560d7e04	20-Jan-2021	dfukalov <daniil.fukalov@amd.com>	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036