# 33bf1cad | 08-Jan-2021 | Kazu Hirata <kazu@google.com>
[llvm] Use *Set::contains (NFC)
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 01383999 | 16-Dec-2020 | Jun Ma <JunMa@linux.alibaba.com>
[InstCombine] Remove scalable vector restriction in InstCombineCasts
Differential Revision: https://reviews.llvm.org/D93389
# 2ac58e21 | 11-Dec-2020 | Jun Ma <JunMa@linux.alibaba.com>
[InstCombine] Remove scalable vector restriction when folding SelectInst
Differential Revision: https://reviews.llvm.org/D93083
# 94ead019 | 01-Dec-2020 | Roman Lebedev <lebedev.ri@gmail.com>
[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold, 2
If the shift amount was undef for some lane, the shift amount in the opposite shift is irrelevant for that lane, and the new shift amount for that lane can be undef.
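Illustrative sketch of the lane-wise rule (hypothetical types and constants, not taken from the commit): lane 1's shift amount is undef, so the widened shifts may use undef for that lane too.

  define <2 x i32> @src(<2 x i32> %x) {
    %t = trunc <2 x i32> %x to <2 x i8>
    %l = shl <2 x i8> %t, <i8 4, i8 undef>
    %a = ashr <2 x i8> %l, <i8 4, i8 undef>
    %r = sext <2 x i8> %a to <2 x i32>
    ret <2 x i32> %r
  }
  ; => wide shift amount for lane 0 is 32 - (8 - 4) = 28; lane 1 stays undef
  define <2 x i32> @tgt(<2 x i32> %x) {
    %l = shl <2 x i32> %x, <i32 28, i32 undef>
    %a = ashr <2 x i32> %l, <i32 28, i32 undef>
    ret <2 x i32> %a
  }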
# 52533b52 | 01-Dec-2020 | Roman Lebedev <lebedev.ri@gmail.com>
Revert "[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold"
It seems i have missed checklines, temporairly reverting, will reland momentairly..
This reverts commit aa1aa1
Revert "[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold"
It seems i have missed checklines, temporairly reverting, will reland momentairly..
This reverts commit aa1aa135097ecfab6d9917a435142030eff0a226.
show more ...
|
# aa1aa135 | 01-Dec-2020 | Roman Lebedev <lebedev.ri@gmail.com>
[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold
If the shift amount was undef for some lane, the shift amount in the opposite shift is irrelevant for that lane, and the new shift amount for that lane can be undef.
# 8e29e20e | 01-Dec-2020 | Roman Lebedev <lebedev.ri@gmail.com>
[InstCombine] Evaluate new shift amount for sext(ashr(shl(trunc()))) fold in wide type (PR48343)
It is not correct to compute that new shift amount in its narrow type and only then extend it into the wide type:
----------------------------------------
Optimization: PR48343 good
Precondition: (width(%X) == width(%r))
  %o0 = trunc %X
  %o1 = shl %o0, %Y
  %o2 = ashr %o1, %Y
  %r = sext %o2
=>
  %n0 = sext %Y
  %n1 = sub width(%o0), %n0
  %n2 = sub width(%X), %n1
  %n3 = shl %X, %n2
  %r = ashr %n3, %n2

Done: 2016
Optimization is correct!

----------------------------------------
Optimization: PR48343 bad
Precondition: (width(%X) == width(%r))
  %o0 = trunc %X
  %o1 = shl %o0, %Y
  %o2 = ashr %o1, %Y
  %r = sext %o2
=>
  %n0 = sub width(%o0), %Y
  %n1 = sub width(%X), %n0
  %n2 = sext %n1
  %n3 = shl %X, %n2
  %r = ashr %n3, %n2

Done: 1
ERROR: Domain of definedness of Target is smaller than Source's for i9 %r

Example:
  %X i9  = 0x000 (0)
  %Y i4  = 0x3 (3)
  %o0 i4 = 0x0 (0)
  %o1 i4 = 0x0 (0)
  %o2 i4 = 0x0 (0)
  %n0 i4 = 0x1 (1)
  %n1 i4 = 0x8 (8, -8)
  %n2 i9 = 0x1F8 (504, -8)
  %n3 i9 = 0x000 (0)
  Source value: 0x000 (0)
  Target value: undef
I.e. we should be computing it in the wide type from the beginning.
Fixes https://bugs.llvm.org/show_bug.cgi?id=48343
Revision tags: llvmorg-11.0.1-rc1
# 310f62b4 | 24-Oct-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] narrowFunnelShift - fold trunc/zext or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) (PR35155)
As discussed on PR35155, this extends narrowFunnelShift (recently renamed from narrowRotate) to support basic funnel shift patterns.
Unlike matchFunnelShift, we don't include the computeKnownBits limitation, as extracting the pattern from the zext/trunc layers should be an indicator of reasonable funnel shift codegen; in D89139 we demonstrated how to efficiently promote funnel shifts to wider types.
Differential Revision: https://reviews.llvm.org/D89542
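Illustrative sketch (hypothetical, my own types and names): an i8 funnel shift assembled through zext/trunc in i32 collapses to the @llvm.fshl.i8 intrinsic, assuming the shift amount is already masked into range.

  define i8 @src(i8 %a, i8 %b, i8 %x) {
    %amt = and i8 %x, 7            ; shift amount known to be in range
    %za = zext i8 %a to i32
    %zb = zext i8 %b to i32
    %zx = zext i8 %amt to i32
    %sub = sub i32 8, %zx          ; bw - x
    %hi = shl i32 %za, %zx
    %lo = lshr i32 %zb, %sub
    %or = or i32 %hi, %lo
    %r = trunc i32 %or to i8
    ret i8 %r
  }
  ; => may become:
  declare i8 @llvm.fshl.i8(i8, i8, i8)
  define i8 @tgt(i8 %a, i8 %b, i8 %x) {
    %amt = and i8 %x, 7
    %r = call i8 @llvm.fshl.i8(i8 %a, i8 %b, i8 %amt)
    ret i8 %r
  }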
# 1cf347e4 | 16-Oct-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] narrowRotate - minor refactoring for funnel shift support. NFC.
Prep work for PR35155 - renamed narrowRotate to narrowFunnelShift, rewrote some comments and adjusted the code to collect separate shift values, although we bail if they don't match (still, only rotations are actually folded).
I'm trying to match matchFunnelShift as much as possible in case we finally get to merge these one day.
# 89657b3a | 14-Oct-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] narrowRotate - canonicalize to OR(SHL,LSHR). NFCI.
Match the canonicalization code that was added to matchFunnelShift at rG02295e6d1a15
# 9c3138bd | 13-Oct-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] visitTrunc - pass through undefs for trunc(shift(trunc/ext(x),c)) patterns
Based on the recent patches D88475 and D88429 where we are losing undef values due to extension/comparisons.
I've added a Constant::mergeUndefsWith method that merges the undef scalar/elements from another Constant into a specific Constant.
Differential Revision: https://reviews.llvm.org/D88687
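Illustrative sketch (hypothetical constants, not from the patch's tests): an undef element in the shift amount survives into the narrowed shift instead of being dropped.

  define <2 x i8> @src(<2 x i8> %x) {
    %z = zext <2 x i8> %x to <2 x i16>
    %s = shl <2 x i16> %z, <i16 2, i16 undef>
    %r = trunc <2 x i16> %s to <2 x i8>
    ret <2 x i8> %r
  }
  ; => the undef lane is passed through to the new shift constant
  define <2 x i8> @tgt(<2 x i8> %x) {
    %r = shl <2 x i8> %x, <i8 2, i8 undef>
    ret <2 x i8> %r
  }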
# d9f064dc | 08-Oct-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] visitTrunc - trunc(shl(X, C)) --> shl(trunc(X),trunc(C)) vector support
Annoyingly, vectors aren't supported by shouldChangeType(), but we have precedents for always performing this on vector types (e.g. narrowBinOp).
Differential Revision: https://reviews.llvm.org/D89067
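Illustrative sketch (hypothetical constants): the vector shift now narrows along with the trunc.

  define <2 x i16> @src(<2 x i32> %x) {
    %s = shl <2 x i32> %x, <i32 3, i32 5>
    %r = trunc <2 x i32> %s to <2 x i16>
    ret <2 x i16> %r
  }
  ; => shl(trunc(X), trunc(C)): low bits of a shl depend only on low bits of X
  define <2 x i16> @tgt(<2 x i32> %x) {
    %t = trunc <2 x i32> %x to <2 x i16>
    %r = shl <2 x i16> %t, <i16 3, i16 5>
    ret <2 x i16> %r
  }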
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5
# 0cf48a70 | 29-Sep-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] visitTrunc - trunc (*shr (trunc A), C) --> trunc(*shr A, C)
Attempt to fold trunc (*shr (trunc A), C) --> trunc(*shr A, C) iff the shift amount is small enough that all zero/sign bits created by the shift are removed by the last trunc.
Helps fix the regressions encountered in D88316.
I've tweaked a couple of shift values as suggested by @lebedev.ri to ensure we have coverage of shift values close (above/below) to the max limit.
Differential Revision: https://reviews.llvm.org/D88429
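Illustrative sketch (hypothetical widths; C = 8 satisfies the bound, since the 8 zero bits the lshr shifts in are exactly the bits the final trunc discards):

  define i8 @src(i32 %a) {
    %t = trunc i32 %a to i16
    %s = lshr i16 %t, 8
    %r = trunc i16 %s to i8
    ret i8 %r
  }
  ; => both sides yield bits 8..15 of %a
  define i8 @tgt(i32 %a) {
    %s = lshr i32 %a, 8
    %r = trunc i32 %s to i8
    ret i8 %r
  }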
# b610d73b | 29-Sep-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] visitTrunc - remove dead trunc(lshr (zext A), C) combine. NFCI.
I added additional test coverage at rG7a55989dc4305 - but all cases are handled independently of this combine, and http://lab.llvm.org:8080/coverage/coverage-reports/ indicates the code is never used.
Differential Revision: https://reviews.llvm.org/D88492
# 89a8a0c9 | 29-Sep-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] Inherit exact flags on extended shifts in trunc (lshr (sext A), C) --> (ashr A, C)
This was missed in D88475
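Illustrative sketch (hypothetical values): the exact flag on the lshr in the extended type carries over to the narrow ashr the fold produces.

  define i8 @src(i8 %a) {
    %s = sext i8 %a to i16
    %l = lshr exact i16 %s, 2
    %r = trunc i16 %l to i8
    ret i8 %r
  }
  ; => exact is preserved on the replacement shift
  define i8 @tgt(i8 %a) {
    %r = ashr exact i8 %a, 2
    ret i8 %r
  }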
# 14ff38e2 | 29-Sep-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] visitTrunc - trunc (lshr (sext A), C) --> (ashr A, C) non-uniform support
This came from @lebedev.ri's suggestion to use m_SpecificInt_ICMP for D88429 - since I was going to change the m_APInt to m_Constant for that patch, I thought I would do it for the only other user of the APInt first.
I've added a ConstantExpr::getUMin helper - it's trivial to add UMAX/SMIN/SMAX but I thought I'd wait until we have use cases.
Differential Revision: https://reviews.llvm.org/D88475
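Illustrative sketch (hypothetical constants): with m_Constant, per-lane shift amounts fold too.

  define <2 x i8> @src(<2 x i8> %a) {
    %s = sext <2 x i8> %a to <2 x i16>
    %l = lshr <2 x i16> %s, <i16 1, i16 3>
    %r = trunc <2 x i16> %l to <2 x i8>
    ret <2 x i8> %r
  }
  ; => per-lane ashr in the original type
  define <2 x i8> @tgt(<2 x i8> %a) {
    %r = ashr <2 x i8> %a, <i8 1, i8 3>
    ret <2 x i8> %r
  }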
Revision tags: llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 59c4d5aa | 09-Sep-2020 | David Sherwood <david.sherwood@arm.com>
[SVE] Fix InstCombinerImpl::PromoteCastOfAllocation for scalable vectors
In this patch I've fixed some warnings that arose from the implicit cast of TypeSize -> uint64_t. I tried writing a variety of different cases to show how this optimisation might work for scalable vectors and found:
1. The optimisation does not work for cases where the cast type is scalable and the allocated type is not. This is because we need to know how many times the cast type fits into the allocated type.
2. If we pass all the various checks for the case when the allocated type is scalable and the cast type is not, then when creating the new alloca we have to take vscale into account. This leads to sub-optimal IR that is worse than the original IR.
3. For the remaining case when both the alloca and cast types are scalable it is hard to find examples where the optimisation would kick in, except for simple bitcasts, because we typically fail the ABI alignment checks.
For now I've changed the code to bail out if only one of the alloca and cast types is scalable. This means we continue to support the existing cases where both types are fixed, and also the specific case when both types are scalable with the same size and alignment, for example a simple bitcast of an alloca to another type.
I've added tests that show we don't attempt to promote the alloca, except for simple bitcasts:
Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll
Differential Revision: https://reviews.llvm.org/D87378
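Illustrative sketch of the case that remains supported (hypothetical, not one of the patch's tests): both types are scalable with the same size and alignment, so the cast is a simple bitcast and the alloca can be retyped.

  define void @both_scalable(<vscale x 16 x i8> %v) {
    %a = alloca <vscale x 4 x i32>, align 16
    %c = bitcast <vscale x 4 x i32>* %a to <vscale x 16 x i8>*
    store <vscale x 16 x i8> %v, <vscale x 16 x i8>* %c, align 16
    ret void
  }
  ; => the alloca is simply retyped and the bitcast goes away
  define void @promoted(<vscale x 16 x i8> %v) {
    %a = alloca <vscale x 16 x i8>, align 16
    store <vscale x 16 x i8> %v, <vscale x 16 x i8>* %a, align 16
    ret void
  }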
# 96ef6998 | 01-Sep-2020 | Eli Friedman <efriedma@quicinc.com>
[InstCombine] Fix a couple crashes with extractelement on a scalable vector.
Differential Revision: https://reviews.llvm.org/D86989
# 640f20b0 | 31-Aug-2020 | Christopher Tetreault <ctetreau@quicinc.com>
[SVE] Remove calls to VectorType::getNumElements from InstCombine
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D82237
Revision tags: llvmorg-11.0.0-rc2
# 912c09e8 | 12-Aug-2020 | Sanjay Patel <spatel@rotateright.com>
[InstCombine] eliminate a pointer cast around insertelement
I'm not sure if this solves PR46839 completely, but reducing the casting should help: https://bugs.llvm.org/show_bug.cgi?id=46839
Differential Revision: https://reviews.llvm.org/D85647
# bebca662 | 10-Aug-2020 | Sanjay Patel <spatel@rotateright.com>
[InstCombine] rearrange code for readability; NFC
The code comment refers to the path where we change the size of the integer type, so handle that first; otherwise, deal with the general case.
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 2a6c8715 | 03-Jun-2020 | Sebastian Neubauer <sebastian.neubauer@amd.com>
[InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target-specific intrinsics, and having target-specific code in general passes had long been noted as an area for improvement.
D81728 moves most target-specific code out of the InstCombine pass. Applying the target-specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration; therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target).
This introduces three new functions:
  TargetTransformInfo::instCombineIntrinsic
  TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
  TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic
A few target-specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between ARM and AArch64.
This allows moving about 3000 lines out of InstCombine to the targets.
Differential Revision: https://reviews.llvm.org/D81728
# 31971ca1 | 03-Jul-2020 | Florian Hahn <flo@fhahn.com>
[InstCombine] Try to narrow expr if trunc cannot be removed.
Narrowing an input expression of a truncate to a type larger than the result of the truncate won't allow removing the truncate, but it may enable further optimizations, e.g. allowing for larger vectorization factors.
For now this is intentionally limited to integer types only, to avoid producing new vector ops that might not be suitable for the target.
If we know that the only user is a trunc, we can also allow more cases, e.g. shortening expressions with some additional shifts.
I would appreciate feedback on the best place to do such a narrowing.
This fixes PR43580.
Reviewers: spatel, RKSimon, lebedev.ri, xbolva00
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D82973
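Illustrative sketch (hypothetical types, not from the patch): the i64 computation narrows to i32, which is still wider than the i8 result, so the trunc stays but the expression becomes cheaper.

  define i8 @src(i16 %x) {
    %z = zext i16 %x to i64
    %m = mul i64 %z, %z
    %r = trunc i64 %m to i8
    ret i8 %r
  }
  ; => same low 8 bits, computed in a narrower intermediate type
  define i8 @tgt(i16 %x) {
    %z = zext i16 %x to i32
    %m = mul i32 %z, %z
    %r = trunc i32 %m to i8
    ret i8 %r
  }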
# eb0e7acb | 03-Jul-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] canEvaluateTruncated - use KnownBits to check for in-range shift amounts
Currently canEvaluateTruncated can only attempt to truncate shifts if they are scalar/uniform constant amounts that are in range.
This patch replaces the constant extraction code with KnownBits handling, using KnownBits::getMaxValue to check that the amounts are in range.
This enables support for non-uniform constant cases, and also variable shift amounts that have been masked somehow. Annoyingly, this still won't work for vectors with (demanded) undefs as KnownBits returns nothing in those cases, but it's a definite improvement on what we currently have.
Differential Revision: https://reviews.llvm.org/D83127
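Illustrative sketch of the masked-variable-amount case (hypothetical, my own constants): the mask bounds the amount below 16, so KnownBits::getMaxValue can prove the shift is evaluable in i16.

  define i16 @src(i32 %x, i32 %amt) {
    %m = and i32 %amt, 15          ; known max value 15 < 16
    %s = shl i32 %x, %m
    %r = trunc i32 %s to i16
    ret i16 %r
  }
  ; => the whole expression can be evaluated in the narrow type
  define i16 @tgt(i32 %x, i32 %amt) {
    %xt = trunc i32 %x to i16
    %mt = trunc i32 %amt to i16
    %m = and i16 %mt, 15
    %r = shl i16 %xt, %m
    ret i16 %r
  }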
# 3da42f48 | 02-Jul-2020 | Simon Pilgrim <llvm-dev@redking.me.uk>
[InstCombine] Add sext(ashr(shl(trunc(x),c),c)) folding support for vectors
Replacing m_ConstantInt with m_Constant permits folding of vectors as well as scalars.
Differential Revision: https://reviews.llvm.org/D83058
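Illustrative sketch (hypothetical per-lane constants): the sign-extend-in-register round trip through <2 x i8> becomes a wide shl/ashr pair, with the new amount 32 - (8 - c) computed per lane.

  define <2 x i32> @src(<2 x i32> %x) {
    %t = trunc <2 x i32> %x to <2 x i8>
    %l = shl <2 x i8> %t, <i8 3, i8 4>
    %a = ashr <2 x i8> %l, <i8 3, i8 4>
    %r = sext <2 x i8> %a to <2 x i32>
    ret <2 x i32> %r
  }
  ; => 32 - (8 - 3) = 27 and 32 - (8 - 4) = 28
  define <2 x i32> @tgt(<2 x i32> %x) {
    %l = shl <2 x i32> %x, <i32 27, i32 28>
    %r = ashr <2 x i32> %l, <i32 27, i32 28>
    ret <2 x i32> %r
  }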