Revision tags: llvmorg-15.0.6 |
|
#
3830e4e5 |
| 16-Nov-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Create poison values instead of undef
These placeholders don't care about the finer points on the difference between the two.
|
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3 |
|
#
838fd611 |
| 12-Oct-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix assertion on <1 x i16> vectors
Fixes issue 58331.
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
8e70258b |
| 04-Jul-2022 |
Nikita Popov <npopov@redhat.com> |
[AMDGPUCodeGenPrepare] Check result of ConstantFoldBinaryOpOperands()
This function will become fallible once we don't support constant expressions for all binops, so make sure to check the result.
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
#
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <Sebastian.Neubauer@amd.com> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
cbcbbd6a |
| 03-Jan-2022 |
Craig Topper <craig.topper@sifive.com> |
[ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC
This function returns an upper bound on the number of bits needed to represent the signed value. Use "Max" t
[ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC
This function returns an upper bound on the number of bits needed to represent the signed value. Use "Max" to match similar functions in KnownBits like countMaxActiveBits.
Rename APInt::getMinSignedBits->getSignificantBits. Keeping the old name around to keep this patch size down. Will do a bulk rename as follow up.
Rename KnownBits::countMaxSignedBits->countMaxSignificantBits.
Reviewed By: lebedev.ri, RKSimon, spatel
Differential Revision: https://reviews.llvm.org/D116522
show more ...
|
#
361216f3 |
| 03-Jan-2022 |
Craig Topper <craig.topper@sifive.com> |
[AMDGPU] Use ComputeMinSignedBits and KnownBits::countMaxActiveBits to simplify some code. NFC
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D116516
|
Revision tags: llvmorg-13.0.1-rc1 |
|
#
21a1d4cf |
| 29-Oct-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Change numBitsSigned for simplicity and document it. NFC.
Change numBitsSigned to return the minimum size of a signed integer that can hold the value. This is different by one from the prev
[AMDGPU] Change numBitsSigned for simplicity and document it. NFC.
Change numBitsSigned to return the minimum size of a signed integer that can hold the value. This is different by one from the previous result but is more consistent with numBitsUnsigned. Update all callers. All callers are now more consistent between the signed and unsigned cases, and some callers get simpler, especially the ones that deal with quantities like numBitsSigned(LHS) + numBitsSigned(RHS).
Differential Revision: https://reviews.llvm.org/D112813
show more ...
|
#
781dd39b |
| 23-Oct-2021 |
Abinav Puthan Purayil <abinav.puthanpurayil@amd.com> |
[AMDGPU] Enable 48-bit mul in AMDGPUCodeGenPrepare.
We were bailing out of creating 24-bit muls for results wider than 32 bits in AMDGPUCodeGenPrepare. With the 24-bit mulhi intrinsic, this change t
[AMDGPU] Enable 48-bit mul in AMDGPUCodeGenPrepare.
We were bailing out of creating 24-bit muls for results wider than 32 bits in AMDGPUCodeGenPrepare. With the 24-bit mulhi intrinsic, this change teaches AMDGPUCodeGenPrepare to generate the 48-bit mul correctly.
Differential Revision: https://reviews.llvm.org/D112395
show more ...
|
#
de303840 |
| 15-Oct-2021 |
Abinav Puthan Purayil <abinav.puthanpurayil@amd.com> |
[AMDGPU] Avoid redundant calls to numBits in AMDGPUCodeGenPrepare::replaceMulWithMul24().
The isU24() and isI24() calls numBits to make its decision. This change replaces them with the internal numB
[AMDGPU] Avoid redundant calls to numBits in AMDGPUCodeGenPrepare::replaceMulWithMul24().
The isU24() and isI24() calls numBits to make its decision. This change replaces them with the internal numBits call so that we can use its result for the > 32 bit width cases.
Differential Revision: https://reviews.llvm.org/D111864
show more ...
|
#
0379263f |
| 14-Oct-2021 |
Abinav Puthan Purayil <abinav.puthanpurayil@amd.com> |
[AMDGPU] Fix width check for signed mul24 generation.
This changes fixes a case in which the highest set bit of the original result is at bit 31 and sign-extending the mul24 for it would make the re
[AMDGPU] Fix width check for signed mul24 generation.
This changes fixes a case in which the highest set bit of the original result is at bit 31 and sign-extending the mul24 for it would make the result negative.
Differential Revision: https://reviews.llvm.org/D111823
show more ...
|
#
b3c9d84e |
| 10-Oct-2021 |
Abinav Puthan Purayil <abinav.puthanpurayil@amd.com> |
[AMDGPU] Fix 24-bit mul intrinsic generation for > 32-bit result.
The 24-bit mul intrinsics yields the low-order 32 bits. We should only do the transformation if the operands are known to be not wid
[AMDGPU] Fix 24-bit mul intrinsic generation for > 32-bit result.
The 24-bit mul intrinsics yields the low-order 32 bits. We should only do the transformation if the operands are known to be not wider than 24 bits and the result is known to be not wider than 32 bits.
Differential Revision: https://reviews.llvm.org/D111523
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
#
dc6e8dfd |
| 20-Sep-2021 |
Jacob Lambert <jacob.lambert@amd.com> |
[AMDGPU][NFC] Correct typos in lib/Target/AMDGPU/AMDGPU*.cpp files. Test commit for new contributor.
|
Revision tags: llvmorg-13.0.0-rc3 |
|
#
477b9bc9 |
| 13-Sep-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Minor cleanup after D109483. NFC.
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
#
2e5dc4a1 |
| 18-Jun-2021 |
Anshil Gandhi <angandhi@amd.com> |
[AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask
Implemented the transformation of xor (llvm.amdgcn.class x, mask), -1 into llvm.amdgcn.class(x, ~mask). Added LIT tests as well.
Diff
[AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask
Implemented the transformation of xor (llvm.amdgcn.class x, mask), -1 into llvm.amdgcn.class(x, ~mask). Added LIT tests as well.
Differential Revision: https://reviews.llvm.org/D104049
show more ...
|
Revision tags: llvmorg-12.0.1-rc2 |
|
#
99142003 |
| 06-Jun-2021 |
Nikita Popov <nikita.ppv@gmail.com> |
[CodeGen] Add missing includes (NFC)
These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.
|
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
d6de1e1a |
| 24-Mar-2021 |
Serge Guelton <sguelton@redhat.com> |
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
to
if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
Introduce a getValueAsBool that normalize the check, with the following behavior:
no attributes or attribute set to "false" => return false attribute set to "true" => return true
Differential Revision: https://reviews.llvm.org/D99299
show more ...
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
2a0db8d7 |
| 20-Jan-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use more accurate fast f64 fdiv
A raw v_rcp_f64 isn't accurate enough, so start applying correction.
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
Revision tags: llvmorg-11.1.0-rc1 |
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
1673a080 |
| 03-Sep-2020 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
SelectionDAG.h - remove unnecessary FunctionLoweringInfo.h include. NFCI.
Use forward declarations and move the include down to dependent files that actually use it.
This also exposes a number of i
SelectionDAG.h - remove unnecessary FunctionLoweringInfo.h include. NFCI.
Use forward declarations and move the include down to dependent files that actually use it.
This also exposes a number of implicit dependencies on KnownBits.h
show more ...
|
Revision tags: llvmorg-11.0.0-rc2 |
|
#
75e6f0b3 |
| 31-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add flag to disable promotion of uniform i16 ops
This interferes with GlobalISel's much better handling of the situation.
This should really be disable for GlobalISel. However, the fallback
AMDGPU: Add flag to disable promotion of uniform i16 ops
This interferes with GlobalISel's much better handling of the situation.
This should really be disable for GlobalISel. However, the fallback only re-runs the selection passes, and doesn't go back and rerun any codegen IR passes. I haven't come up with a good solution to this problem.
show more ...
|
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
f4bd01c1 |
| 22-Jun-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Fix and simplify AMDGPUCodeGenPrepare::expandDivRem32
Fix the division/remainder algorithm by adding a second quotient refinement step, which is required in some cases like 0xFFFFFFFFu / 0x
[AMDGPU] Fix and simplify AMDGPUCodeGenPrepare::expandDivRem32
Fix the division/remainder algorithm by adding a second quotient refinement step, which is required in some cases like 0xFFFFFFFFu / 0x11111111u (https://bugs.llvm.org/show_bug.cgi?id=46212).
Also document, rewrite and simplify it by ensuring that we always have a lower bound on inv(y), which simplifies the UNR step and the quotient refinement steps.
Differential Revision: https://reviews.llvm.org/D83381
show more ...
|
#
52911428 |
| 29-Jun-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Migrate AMDGPU backend to Align
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851
[Alignment][NFC] Migrate AMDGPU backend to Align
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Differential Revision: https://reviews.llvm.org/D82743
show more ...
|
#
9ee272f1 |
| 15-Jun-2020 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Add gfx1030 target
Differential Revision: https://reviews.llvm.org/D81886
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
3254a001 |
| 13-May-2020 |
Christopher Tetreault <ctetreau@quicinc.com> |
[SVE] Remove usages of VectorType::getNumElements() from AMDGPU
Reviewers: efriedma, arsenm, david-arm, fpetrogalli
Reviewed By: efriedma
Subscribers: dmgreen, arsenm, kzhuravl, jvesely, wdng, nha
[SVE] Remove usages of VectorType::getNumElements() from AMDGPU
Reviewers: efriedma, arsenm, david-arm, fpetrogalli
Reviewed By: efriedma
Subscribers: dmgreen, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, tschuett, hiraditya, rkruppe, psnobl, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79807
show more ...
|