Revision tags: llvmorg-5.0.2-rc1 |
|
#
8452face |
| 05-Mar-2018 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Add constant vector support to getMinimumFPType for visitFPTrunc.
This patch teaches getMinimumFPType to support shrinking a vector of ConstantFPs. This should improve our ability to c
[InstCombine] Add constant vector support to getMinimumFPType for visitFPTrunc.
This patch teaches getMinimumFPType to support shrinking a vector of ConstantFPs. This should improve our ability to combine vector fptrunc with fp binops.
Differential Revision: https://reviews.llvm.org/D43774
llvm-svn: 326729
show more ...
|
#
c7461e1a |
| 02-Mar-2018 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again.
Instead of returning the smaller FP constant we now ret
[InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again.
Instead of returning the smaller FP constant we now return the minimal Type the constant can fit into. We also return the Type of the input to any fp extends. The legality checks are then done on just the size of these Types. If we find something profitable we then emit FPTruncs in front of the smaller binop and assume those FPTruncs will be constant folded or combined with any ConstantFPs or fpextends.
Differential Revision: https://reviews.llvm.org/D44038
llvm-svn: 326617
show more ...
|
Revision tags: llvmorg-6.0.0 |
|
#
b95298b0 |
| 28-Feb-2018 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Split the FP constant code out of lookThroughFPExtensions and use nullptr as a sentinel
Currently this code's control flow very much assumes that there are no meaningful checks after d
[InstCombine] Split the FP constant code out of lookThroughFPExtensions and use nullptr as a sentinel
Currently this code's control flow very much assumes that there are no meaningful checks after determining that it's a ConstantFP. So whenever it wants to stop it just does "return V". But V is also the variable name it uses when it wants to return a new value. So 'return V' appears multiple times with different meanings.
This patch just moves all the code into a helper function and returns nullptr when it wants to stop.
I've split this from D43774 while I try to figure out how to best handle the vector case there. But this change by itself at least seemed like a readability improvement.
Differential Revision: https://reviews.llvm.org/D43833
llvm-svn: 326361
show more ...
|
Revision tags: llvmorg-6.0.0-rc3 |
|
#
945b7e5a |
| 14-Feb-2018 |
Elena Demikhovsky <elena.demikhovsky@intel.com> |
Adding a width of the GEP index to the Data Layout.
Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:
Adding a width of the GEP index to the Data Layout.
Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size.
Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html
I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size.
Differential Revision: https://reviews.llvm.org/D42123
llvm-svn: 325102
show more ...
|
Revision tags: llvmorg-6.0.0-rc2 |
|
#
49aafec2 |
| 05-Feb-2018 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] don't try to evaluate instructions with >1 use (revert r324014)
This example causes a compile-time explosion:
define i16 @foo(i16 %in) { %x = zext i16 %in to i32 %a1 = mul i32 %x,
[InstCombine] don't try to evaluate instructions with >1 use (revert r324014)
This example causes a compile-time explosion:
define i16 @foo(i16 %in) { %x = zext i16 %in to i32 %a1 = mul i32 %x, %x %a2 = mul i32 %a1, %a1 %a3 = mul i32 %a2, %a2 %a4 = mul i32 %a3, %a3 %a5 = mul i32 %a4, %a4 %a6 = mul i32 %a5, %a5 %a7 = mul i32 %a6, %a6 %a8 = mul i32 %a7, %a7 %a9 = mul i32 %a8, %a8 %a10 = mul i32 %a9, %a9 %a11 = mul i32 %a10, %a10 %a12 = mul i32 %a11, %a11 %a13 = mul i32 %a12, %a12 %a14 = mul i32 %a13, %a13 %a15 = mul i32 %a14, %a14 %a16 = mul i32 %a15, %a15 %a17 = mul i32 %a16, %a16 %a18 = mul i32 %a17, %a17 %a19 = mul i32 %a18, %a18 %a20 = mul i32 %a19, %a19 %a21 = mul i32 %a20, %a20 %a22 = mul i32 %a21, %a21 %a23 = mul i32 %a22, %a22 %a24 = mul i32 %a23, %a23 %T = trunc i32 %a24 to i16 ret i16 %T }
llvm-svn: 324276
show more ...
|
#
2329fcd2 |
| 05-Feb-2018 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] only allow narrow/wide evaluation of values with >1 use if that user is a binop
There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi instructions
[InstCombine] only allow narrow/wide evaluation of values with >1 use if that user is a binop
There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi instructions that might have repeated operands. This is likely a source of an infinite loop. I haven't manufactured a test case to prove that, but it should be safe to speculatively limit this transform to binops while we try to create that test.
llvm-svn: 324252
show more ...
|
#
3343fcef |
| 01-Feb-2018 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst
This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality.
[InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst
This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality. When we have multiple uses of a value, but they're all in one instruction, we can allow that expression to be narrowed or widened for the same cost as a single-use value.
AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified away if the operands are the same value; add becomes shl; shifts with a variable shift amount aren't handled.
Differential Revision: https://reviews.llvm.org/D42739
llvm-svn: 324014
show more ...
|
#
1b66deea |
| 31-Jan-2018 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] reduce code duplication for canEvaluate* functions; NFCI
We'd have to make the change suggested in D42536 3x otherwise.
llvm-svn: 323877
|
#
e48597a5 |
| 26-Jan-2018 |
Vedant Kumar <vsk@apple.com> |
[InstCombine] Preserve debug values for eliminable casts
A cast from A to B is eliminable if its result is casted to C, and if the pair of casts could just be expressed as a single cast. E.g here, %
[InstCombine] Preserve debug values for eliminable casts
A cast from A to B is eliminable if its result is casted to C, and if the pair of casts could just be expressed as a single cast. E.g here, %c1 is eliminable:
%c1 = zext i16 %A to i32 %c2 = sext i32 %c1 to i64
InstCombine optimizes away eliminable casts. This patch teaches it to insert a dbg.value intrinsic pointing to the final result, so that local variables pointing to the eliminable result are preserved.
Differential Revision: https://reviews.llvm.org/D42566
llvm-svn: 323570
show more ...
|
Revision tags: llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2 |
|
#
b3fa9458 |
| 16-Nov-2017 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] include 'sub' in the list of narrow-able binops
// trunc (binop X, C) --> binop (trunc X, C') // trunc (binop (ext X), Y) --> binop X, (trunc Y)
I'm grouping sub with the
[InstCombine] include 'sub' in the list of narrow-able binops
// trunc (binop X, C) --> binop (trunc X, C') // trunc (binop (ext X), Y) --> binop X, (trunc Y)
I'm grouping sub with the other binops because that makes the code simpler and the transforms are valid: https://rise4fun.com/Alive/UeF ...so even though we don't expect a sub with constant Op1 or any of the other opcodes with constant Op0 due to canonicalization rules, we might as well handle those situations if non-canonical code somehow reaches this point (it should just make instcombine more efficient in reaching its end goal).
This should solve the problem that later manifests in the vectorizers in PR35295: https://bugs.llvm.org/show_bug.cgi?id=35295
llvm-svn: 318404
show more ...
|
#
03d0cd6a |
| 15-Nov-2017 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] trunc (binop X, C) --> binop (trunc X, C')
Note that one-use and shouldChangeType() are checked ahead of the switch.
Without the narrowing folds, we can produce inferior vector code a
[InstCombine] trunc (binop X, C) --> binop (trunc X, C')
Note that one-use and shouldChangeType() are checked ahead of the switch.
Without the narrowing folds, we can produce inferior vector code as shown in PR35299: https://bugs.llvm.org/show_bug.cgi?id=35299
llvm-svn: 318323
show more ...
|
Revision tags: llvmorg-5.0.1-rc1 |
|
#
17b0c784 |
| 05-Oct-2017 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Fix a vector splat handling bug in transformZExtICmp.
We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the corre
[InstCombine] Fix a vector splat handling bug in transformZExtICmp.
We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us.
This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasnt' been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first.
Fixes PR34841.
llvm-svn: 314971
show more ...
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4 |
|
#
4431bfe8 |
| 29-Aug-2017 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Support vector splats in transformZExtICmp
This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the
[InstCombine] Support vector splats in transformZExtICmp
This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code.
One test didn't vectorize optimize properly due to another bug so a TODO has been added.
Differential Revision: https://reviews.llvm.org/D37253
llvm-svn: 312023
show more ...
|
Revision tags: llvmorg-5.0.0-rc3 |
|
#
cc255bcd |
| 21-Aug-2017 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions
Summary: If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS,
[InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions
Summary: If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear.
Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits.
Reviewers: spatel, majnemer, davide
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36944
llvm-svn: 311343
show more ...
|
#
86111c66 |
| 16-Aug-2017 |
Amjad Aboud <amjad.aboud@intel.com> |
[InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount)
Differential Revision: https://reviews.llvm.org/D36784
llvm-svn: 311050
|
#
0a1a276d |
| 15-Aug-2017 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount
We were only allowing ConstantInt before. This patch allows splat of ConstantInt too.
D
[InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount
We were only allowing ConstantInt before. This patch allows splat of ConstantInt too.
Differential Revision: https://reviews.llvm.org/D36763
llvm-svn: 310970
show more ...
|
Revision tags: llvmorg-5.0.0-rc2 |
|
#
c50e55d0 |
| 09-Aug-2017 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] narrow rotate left/right patterns to eliminate zext/trunc (PR34046)
I couldn't find any smaller folds to help the cases in: https://bugs.llvm.org/show_bug.cgi?id=34046 after: rL310141
[InstCombine] narrow rotate left/right patterns to eliminate zext/trunc (PR34046)
I couldn't find any smaller folds to help the cases in: https://bugs.llvm.org/show_bug.cgi?id=34046 after: rL310141
The truncated rotate-by-variable patterns elude all of the existing transforms because of multiple uses and knowledge about demanded bits and knownbits that doesn't exist without the whole pattern. So we need an unfortunately large pattern match. But by simplifying this pattern in IR, the backend is already able to generate rolb/rolw/rorb/rorw for x86 using its existing rotate matching logic (although there is a likely extraneous 'and' of the rotate amount).
Note that rotate-by-constant doesn't have this problem - smaller folds should already produce the narrow IR ops.
Differential Revision: https://reviews.llvm.org/D36395
llvm-svn: 310509
show more ...
|
#
94da1de1 |
| 05-Aug-2017 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] refactor trunc(binop) transforms; NFCI
In addition to moving the shift transforms over, we may want to detect too-wide rotate patterns here (PR34046).
llvm-svn: 310181
|
#
e12d734b |
| 04-Aug-2017 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] narrow truncated add/sub/mul with constant
Name: narrow_sub %sub = sub i32 C1, %x %r = trunc i32 %sub to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r =
[InstCombine] narrow truncated add/sub/mul with constant
Name: narrow_sub %sub = sub i32 C1, %x %r = trunc i32 %sub to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = sub i8 %narrowC, %xn Name: narrow_add %add = add i32 %x, C1 %r = trunc i32 %add to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = add i8 %xn, %narrowC Name: narrow_mul %mul = mul i32 %x, C1 %r = trunc i32 %mul to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = mul i8 %xn, %narrowC
http://rise4fun.com/Alive/QpS
This doesn't solve PR34046 (failure to recognize rotate): https://bugs.llvm.org/show_bug.cgi?id=34046 ...but it reduces an extra complication in the description examples to a form that we can more easily match.
llvm-svn: 310141
show more ...
|
#
a86ca08d |
| 04-Aug-2017 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Remove unnecessary casts. NFC
We're calling an overload of getOpcode that already returns Instruction::CastOps.
llvm-svn: 310024
|
Revision tags: llvmorg-5.0.0-rc1 |
|
#
95d2347a |
| 09-Jul-2017 |
Craig Topper <craig.topper@intel.com> |
[IR] Make use of Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC
llvm-svn: 307491
|
#
bb4069e4 |
| 07-Jul-2017 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere
Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes
[InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere
Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference.
This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do.
llvm-svn: 307451
show more ...
|
#
cb22039b |
| 06-Jul-2017 |
Craig Topper <craig.topper@intel.com> |
[InstCombine] No need to pass DataLayout to helper functions if we're passing the InstCombiner object. We can just ask it for the DataLayout. NFC
llvm-svn: 307333
|
#
d1e81197 |
| 22-Jun-2017 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138)
There are 2 parts to this patch made simultaneously to avoid a regression.
We're reversing the canonicalization that moves b
[InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138)
There are 2 parts to this patch made simultaneously to avoid a regression.
We're reversing the canonicalization that moves bitwise vector ops before bitcasts. We're moving bitwise vector ops *after* bitcasts instead. That's the 1st and 3rd hunks of the patch. The motivation is that there's only one fold that currently depends on the existing canonicalization (see next), but there are many folds that would automatically benefit from the new canonicalization. PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these patterns in IR.
There's an or(and,andn) pattern that requires an adjustment in order to continue matching to 'select' because the bitcast changes position. This match is unfortunately complicated because it requires 4 logic ops with optional bitcast and sext ops.
Test diffs:
1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference - bitcast comes before logic. 2. There are also tests with no diffs in bitcast.ll that verify that we're still doing folds that were enabled by the previous canonicalization. 3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns to look through bitcasts. 4. logical-select.ll contains several tests for the or(and,andn) --> select fold to verify that we are still handling those cases. The lone diff shows the movement of the bitcast from the new canonicalization rule.
Differential Revision: https://reviews.llvm.org/D33517
llvm-svn: 306011
show more ...
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
73ba1c84 |
| 07-Jun-2017 |
Craig Topper <craig.topper@gmail.com> |
[InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC
These methods are specifically optimized to only counting leading zeros w
[InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC
These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare.
llvm-svn: 304876
show more ...
|