#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
#
0e219b64 |
| 03-Jan-2021 |
Kazu Hirata <kazu@google.com> |
[Target] Construct SmallVector with iterator ranges (NFC)
|
#
9b296102 |
| 29-Dec-2020 |
Juneyoung Lee <aqjune@gmail.com> |
Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them.
Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)
The order is swapped, but in terms of correctness it is still fine.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D93923
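Why the unary form is safe here can be sketched with a toy model of shuffle semantics. This is not LLVM's IRBuilder API: `shuffleBinary` and `shuffleUnary` are hypothetical helpers, plain integer vectors stand in for IR values, and undefined mask elements are left out for brevity. The assumption illustrated is that a mask index below the input length only ever reads the first operand, so the second operand is a pure placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy two-operand shuffle (hypothetical, not LLVM's API): a mask index
// i < N selects lhs[i]; an index i >= N selects rhs[i - N].
std::vector<int> shuffleBinary(const std::vector<int> &lhs,
                               const std::vector<int> &rhs,
                               const std::vector<std::size_t> &mask) {
  assert(lhs.size() == rhs.size());
  const std::size_t n = lhs.size();
  std::vector<int> out;
  out.reserve(mask.size());
  for (std::size_t idx : mask) {
    assert(idx < 2 * n);
    out.push_back(idx < n ? lhs[idx] : rhs[idx - n]);
  }
  return out;
}

// Toy one-operand shuffle (hypothetical): every mask index must refer to
// the single input, so no placeholder operand is needed.
std::vector<int> shuffleUnary(const std::vector<int> &x,
                              const std::vector<std::size_t> &mask) {
  std::vector<int> out;
  out.reserve(mask.size());
  for (std::size_t idx : mask) {
    assert(idx < x.size());
    out.push_back(x[idx]);
  }
  return out;
}
```

With an input of `{10, 20, 30, 40}` and a mask of `{3, 3, 0, 1}`, both helpers return the same vector whatever the second operand of `shuffleBinary` contains, which is the property the commit relies on.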
|
#
c7afb698 |
| 16-Dec-2020 |
Piotr Sobczak <Piotr.Sobczak@amd.com> |
[AMDGPU] Avoid calling copyFastMathFlags in wrong context
Calling Instruction::copyFastMathFlags() assumes the caller is FPMathOperator. Avoid calling the function for instructions that are not instances of FPMathOperator.
|
#
958130df |
| 23-Oct-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add simplification/combines for llvm.amdgcn.fma.legacy
This follows on from D89558 which added the new intrinsic and D88955 which added similar combines for llvm.amdgcn.fmul.legacy.
Differential Revision: https://reviews.llvm.org/D90028
|
#
86a480e9 |
| 06-Oct-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy
Differential Revision: https://reviews.llvm.org/D88955
|
#
20e9c36c |
| 26-Sep-2020 |
Fangrui Song <i@maskray.me> |
Internalize functions from various tools. NFC
And internalize some classes if I noticed them:)
|
#
f0268121 |
| 17-Sep-2020 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
InstCombiner.h - remove unnecessary KnownBits.h include. NFCI.
Move the include down to cpp files with an implicit dependency.
|
#
833b3b0d |
| 23-Jul-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Add v3f16/v3i16 support to SDag
Fix lowering and instruction selection for v3x16 types and enable InstCombine to emit them.
This patch only implements it for the selection dag. GlobalISel tests in GlobalISel/llvm.amdgcn.image.load.1d.d16.ll and GlobalISel/llvm.amdgcn.image.store.2d.d16.ll still don't work.
Differential Revision: https://reviews.llvm.org/D84420
|
#
b8d19947 |
| 04-Jun-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Add A16/G16 to InstCombine
When sampling from images with coordinates that only have 16-bit accuracy, convert the image intrinsic call to use a16 or g16. This happens only if the target hardware supports it.
An alternative would be to always apply this combination, independent of the target hardware and extend 16 bit arguments to 32 bit arguments during legalization. To me, this sounds like an unnecessary roundtrip that could prevent some further InstCombine optimizations.
Differential Revision: https://reviews.llvm.org/D85887
|
#
3b92db4c |
| 03-Aug-2020 |
Christopher Tetreault <ctetreau@quicinc.com> |
[SVE] Remove bad call to VectorType::getNumElements() from AMDGPU
Differential Revision: https://reviews.llvm.org/D85151
|
#
c6f08b14 |
| 31-Jul-2020 |
Benjamin Kramer <benny.kra@googlemail.com> |
Hide some internal symbols. NFC.
|
#
2a6c8715 |
| 03-Jun-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time.
D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target).
This introduces three new functions: TargetTransformInfo::instCombineIntrinsic, TargetTransformInfo::simplifyDemandedUseBitsIntrinsic, and TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic.
A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64.
This allows moving about 3000 lines out of InstCombine and into the targets.
Differential Revision: https://reviews.llvm.org/D81728
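The delegation described above can be sketched as a toy model. None of the names below are LLVM's real classes or signatures: `IntrinsicCall`, `TargetHooks`, `ToyGpuHooks`, `visitCall`, and the intrinsic names are all hypothetical, and an `int` stands in for a replacement value. The sketch only assumes what the message states, namely that the generic pass asks the target when it meets an intrinsic it does not know, inside the same iteration rather than in a separate pass.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for an intrinsic call site.
struct IntrinsicCall {
  std::string name;
  int arg;
};

// Hypothetical target interface: the default implementation knows no
// target intrinsics and reports "no change" with an empty optional.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  virtual std::optional<int> combineIntrinsic(const IntrinsicCall &) const {
    return std::nullopt;
  }
};

// Hypothetical target that can simplify one intrinsic of its own.
struct ToyGpuHooks : TargetHooks {
  std::optional<int> combineIntrinsic(const IntrinsicCall &call) const override {
    if (call.name == "toygpu.identity")
      return call.arg; // fold the call to its argument
    return std::nullopt;
  }
};

// Generic combiner: handles what it knows, delegates everything else.
std::optional<int> visitCall(const IntrinsicCall &call,
                             const TargetHooks &target) {
  if (call.name == "generic.double")
    return call.arg * 2; // target-independent fold
  return target.combineIntrinsic(call); // unknown intrinsic: ask the target
}
```

Calling `visitCall` on a `toygpu.identity` call with `ToyGpuHooks` yields a replacement, while the same call with the base `TargetHooks` yields an empty optional. That mirrors the message's assumption that code never uses intrinsics from a foreign target.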
|