Revision tags: llvmorg-11.0.0-rc3 |
|
#
7903ae47 |
| 20-Sep-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] factorize left shifts of add/sub
We do similar factorization folds in SimplifyUsingDistributiveLaws, but that drops no-wrap properties. Propagating those optimally may help solve: https://llvm.org/PR47430
The propagation is all-or-nothing for these patterns: when all 3 incoming ops have nsw or nuw, the 2 new ops should have the same no-wrap property: https://alive2.llvm.org/ce/z/Dv8wsU
This also solves: https://llvm.org/PR47584
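A minimal IR sketch of the fold (types and value names are illustrative, not taken from the patch):

    %xs = shl nsw i32 %x, %z
    %ys = shl nsw i32 %y, %z
    %r  = add nsw i32 %xs, %ys
    ; --> factorized; all 3 incoming ops had nsw, so both new ops keep it:
    %s  = add nsw i32 %x, %y
    %r2 = shl nsw i32 %s, %z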
|
#
6aa3fc4a |
| 11-Sep-2020 |
Sanjay Patel <spatel@rotateright.com> |
Revert "[InstCombine] propagate 'nsw' on pointer difference of 'inbounds' geps (PR47430)"
This reverts commit 324a53205a3af979e3de109fdd52f91781816cba.
On closer examination of at least one of the
Revert "[InstCombine] propagate 'nsw' on pointer difference of 'inbounds' geps (PR47430)"
This reverts commit 324a53205a3af979e3de109fdd52f91781816cba.
On closer examination of at least one of the test diffs, this does not appear to be correct in all cases. Even the existing 'nsw' creation may be wrong based on this example: https://alive2.llvm.org/ce/z/uL4Hw9 https://alive2.llvm.org/ce/z/fJMKQS
|
#
324a5320 |
| 10-Sep-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] propagate 'nsw' on pointer difference of 'inbounds' geps (PR47430)
There's no signed wrap if both geps have 'inbounds': https://alive2.llvm.org/ce/z/nZkQTg https://alive2.llvm.org/ce/z/7qFauh
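A minimal IR sketch of the pattern this (later reverted) change annotated; names are illustrative:

    %p1 = getelementptr inbounds i8, i8* %base, i64 %i
    %p2 = getelementptr inbounds i8, i8* %base, i64 %j
    %a  = ptrtoint i8* %p1 to i64
    %b  = ptrtoint i8* %p2 to i64
    ; with both geps 'inbounds', the difference was tagged nsw:
    %d  = sub nsw i64 %a, %b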
|
#
8b300679 |
| 07-Sep-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] improve fold of pointer differences
This was supposed to be an NFC cleanup, but there's a real logic difference (did not drop 'nsw') visible in some tests in addition to an efficiency improvement.
This is because in the case where we have 2 GEPs, the code was *always* swapping the operands and negating the result. But if we have 2 GEPs, we should *never* need swapping/negation AFAICT.
This is part of improving flags propagation noticed with PR47430.
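A hedged sketch of the two-GEP case (assuming a common base pointer and i8 elements, so the indices are the byte offsets):

    %p1 = getelementptr inbounds i8, i8* %p, i64 %a
    %p2 = getelementptr inbounds i8, i8* %p, i64 %b
    %i1 = ptrtoint i8* %p1 to i64
    %i2 = ptrtoint i8* %p2 to i64
    %d  = sub i64 %i1, %i2
    ; --> the offsets are subtracted directly; no swap/negate is needed:
    %d2 = sub i64 %a, %b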
|
#
3ca8b9a5 |
| 07-Sep-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] give a name to an intermediate value for easier tracking; NFC
As noted in PR47430, we probably want to conditionally include 'nsw' here anyway, so we are going to need to fill out the optional args.
|
#
57a26bb7 |
| 29-Aug-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[InstCombine] Fix typo in comment (NFC)
As pointed out in post-commit review of D63060.
|
#
ffe05dd1 |
| 26-Aug-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[InstCombine] usub.sat(a, b) + b => umax(a, b) (PR42178)
Fixes https://bugs.llvm.org/show_bug.cgi?id=42178 by folding usub.sat(a, b) + b to umax(a, b). The backend will expand umax back to usubsat if that is profitable.
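A sketch of the fold; note the umax may materialize as the icmp+select form rather than an intrinsic, depending on the LLVM version (an assumption here):

    %s = call i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
    %r = add i32 %s, %b
    ; --> umax(a, b): if a >= b, (a - b) + b == a; otherwise 0 + b == b
    %c  = icmp ugt i32 %a, %b
    %r2 = select i1 %c, i32 %a, i32 %b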
We may also want to handle uadd.sat(a, b) - b in the future.
Differential Revision: https://reviews.llvm.org/D63060
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
2a6c8715 |
| 03-Jun-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target-specific intrinsics; having target-specific code in general passes had long been noted as an area for improvement.
D81728 moves most target-specific code out of the InstCombine pass. Applying the target-specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration; therefore, the InstCombine pass resorts to newly introduced functions in TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target).
This introduces three new functions:
  TargetTransformInfo::instCombineIntrinsic
  TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
  TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic
A few target-specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64.
This allows moving about 3000 lines out of InstCombine into the targets.
Differential Revision: https://reviews.llvm.org/D81728
|
#
8953ecf2 |
| 23-Jun-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] reassociate diff of sums into sum of diffs
This is the integer sibling to D81491.
(a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] + b[3]) --> (a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3])
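A hedged sketch using the then-experimental reduce intrinsics (the exact intrinsic names changed across releases, so treat these as illustrative):

    %ra = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> %a)
    %rb = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> %b)
    %d  = sub i32 %ra, %rb
    ; --> one vector sub feeding a single reduction:
    %vd = sub <4 x i32> %a, %b
    %d2 = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> %vd)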
Removing the "experimental" from these intrinsics is likely not too far away.
|
#
b5fb2695 |
| 14-Jun-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] reassociate FP diff of sums into sum of diffs
(a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] + b[3]) --> (a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3])
This should be the last step in solving PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953
We started emitting reduction intrinsics with D80867 / rGe50059f6b6b3, so it's a relatively easy pattern match now to reorder those ops. Also, I have not seen any complaints about the switch to intrinsics yet, so I'll propose to remove the "experimental" tag from the intrinsics soon.
Differential Revision: https://reviews.llvm.org/D81491
|
#
012909dc |
| 12-Jun-2020 |
EgorBo <egorbo@gmail.com> |
[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0"
Summary: "X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two) However, "X % Y" can also be represented as "X - (X / Y) * Y"
[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0"
Summary: "X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two) However, "X % Y" can also be represented as "X - (X / Y) * Y" so if I rewrite the initial expression: "X - (X / C) * C == 0" it's not currently optimized to "X & C-1 == 0", see godbolt: https://godbolt.org/z/KzuXUj
This is my first contribution to LLVM so I hope I didn't mess things up
Reviewers: lebedev.ri, spatel
Reviewed By: lebedev.ri
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79369
|
#
1a2bffaf |
| 26-May-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] reassociate sub+add to increase adds and throughput
The -reassociate pass tends to transform this kind of pattern into something that is worse for vectorization and codegen. See PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953
This follows up the FP version of the same transform: rGa0ce2338a083
|
#
a0ce2338 |
| 26-May-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] reassociate fsub+fadd with FMF to increase adds and throughput
The -reassociate pass tends to transform this kind of pattern into something that is worse for vectorization and codegen. See PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953
|
#
2f7c24fe |
| 22-May-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] (A + B) + B --> A + (B << 1)
This eliminates a use of 'B', so it can enable follow-on transforms as well as improve analysis/codegen.
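A minimal sketch (names illustrative):

    %ab = add i32 %a, %b
    %r  = add i32 %ab, %b
    ; --> one use of %b is eliminated:
    %b2 = shl i32 %b, 1
    %r2 = add i32 %a, %b2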
The PhaseOrdering test was added for D61726, and that shows the limits of instcombine vs. real reassociation. We would need to run some form of CSE to collapse that further.
The intermediate variable naming here is intentional because there's a test at llvm/test/Bitcode/value-with-long-name.ll that would break with the usual nameless value. I'm not sure how to improve that test to be more robust.
The naming may also be helpful to debug regressions if this change exposes weaknesses in the reassociation pass for example.
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
352fef3f |
| 21-Apr-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[InstCombine] Negator - sink sinkable negations
Summary: As we have discussed previously (e.g. in D63992 / D64090 / [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]), the `sub` instruction can almost be considered non-canonical. While we do convert `sub %x, C` -> `add %x, -C`, we only sparsely do that for non-constants. But we should.
Here, I propose to interpret `sub %x, %y` as `add (sub 0, %y), %x` iff the negation can be sunk into `%y`.
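One example of a sinkable negation (just one of the shapes the Negator handles; names illustrative):

    %y = sub i32 %a, %b
    %r = sub i32 %x, %y
    ; --> the negation sinks into %y, since -(a - b) == b - a:
    %ny = sub i32 %b, %a
    %r2 = add i32 %ny, %x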
This has some potential to cause endless combine loops (either around PHIs, or if there are some opposite transforms). For the former, there's an `-instcombine-negator-max-depth` option to mitigate it, should this expose any such issues. For the latter, if there are still any such opposing folds, we'd need to remove the colliding fold. In any case, reproducers are welcome!
Reviewers: spatel, nikic, efriedma, xbolva00
Reviewed By: spatel
Subscribers: xbolva00, mgorny, hiraditya, reames, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68408
|
#
01bcc3e9 |
| 15-Apr-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] prevent infinite loop with sub/abs of constant expression
PR45539: https://bugs.llvm.org/show_bug.cgi?id=45539
|
#
c7ff5b38 |
| 26-Mar-2020 |
Serge Pavlov <sepavloff@gmail.com> |
[FPEnv] Use single enum to represent rounding mode
Now the compiler defines 5 sets of constants to represent rounding mode. These are:
1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes defined by IEEE-754 and is used in the `APFloat` implementation.
2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of the 5 IEEE-754 rounding modes plus a special value for dynamic rounding mode. It is used in the clang frontend.
3. `llvm::fp::RoundingMode`. It defines the same values as `clang::LangOptions::FPRoundingModeKind` but in a different order. It is used to specify rounding mode in IR and in functions that operate on IR.
4. The rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7). Besides constants for rounding modes, it also uses a special value to indicate an error. It is convenient to use in intrinsic functions, as it provides a platform-independent representation of rounding mode; in this role it is used in some pending patches.
5. Values like `FE_DOWNWARD` and others, which specify rounding mode in the library calls `fesetround` and `fegetround`. Often they represent bits of some control register, so they are target-dependent. The same names (not the values) and a special name `FE_DYNAMIC` are used in `#pragma STDC FENV_ROUND`.
The first 4 sets of constants are target-independent and could share the same numerical representation, which would simplify conversion between them. Also, `clang::LangOptions::FPRoundingModeKind` and `llvm::fp::RoundingMode` currently do not contain a value for the IEEE-754 rounding direction `roundTiesToAway`, although it is supported natively on some targets.
This change defines all the rounding mode types via one `llvm::RoundingMode`, which also contains a rounding mode value for the IEEE rounding direction `roundTiesToAway`.
Differential Revision: https://reviews.llvm.org/D77379
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4 |
|
#
d871ef4e |
| 10-Mar-2020 |
Simon Moll <simon.moll@emea.nec.com> |
[instcombine] remove fsub to fneg hacks; only emit fneg
Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp negation. This also extends the scalarization cost in instcombine for unary operators to result in the same IR rewrites for fneg as for the idiom.
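The idiom and its replacement:

    %n = fsub float -0.000000e+00, %x
    ; --> always emitted as the native instruction now:
    %n2 = fneg float %x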
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75467
|
#
c8c14d97 |
| 10-Mar-2020 |
Florian Hahn <flo@fhahn.com> |
[InstCombine] Support vectors in SimplifyAddWithRemainder.
SimplifyAddWithRemainder currently also matches for vector types, but tries to create an integer constant, which causes a crash.
By using Constant::getIntegerValue() we can support both the scalar and vector cases.
The 2 added test cases crash without the fix.
Reviewers: spatel, lebedev.ri
Reviewed By: spatel, lebedev.ri
Differential Revision: https://reviews.llvm.org/D75906
|
#
1badf7c3 |
| 06-Mar-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[InstCombine] Forego the one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant
Summary: This is potentially more friendly for further optimizations and analyses, e.g.: https://godbolt.org/z/G24anE
This resolves a phase-ordering bug that was introduced in D75145 for https://godbolt.org/z/2gBwF2 https://godbolt.org/z/XvgSua
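A sketch with a constant Y = 15 (constants illustrative):

    %a = and i32 %x, 15
    %r = sub i32 %x, %a
    ; --> folds even if %a has other uses, because Y is a constant:
    %r2 = and i32 %x, -16      ; i.e. X & ~15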
Reviewers: spatel, nikic, dmgreen, xbolva00
Reviewed By: nikic, xbolva00
Subscribers: hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75757
|
Revision tags: llvmorg-10.0.0-rc3 |
|
#
ddd11273 |
| 27-Feb-2020 |
Simon Moll <simon.moll@emea.nec.com> |
Remove BinaryOperator::CreateFNeg
Use UnaryOperator::CreateFNeg instead.
Summary: With the introduction of the native fneg instruction, the fsub -0.0, %x idiom is obsolete. This patch makes LLVM emit fneg instead of the idiom in all places.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75130
|
Revision tags: llvmorg-10.0.0-rc2 |
|
#
5a8819b2 |
| 03-Feb-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[InstCombine] Use replaceOperand() in more places
This is a followup to D73803, which uses the replaceOperand() helper in more places.
This should be NFC apart from changes to worklist order.
Differential Revision: https://reviews.llvm.org/D73919
|
Revision tags: llvmorg-10.0.0-rc1 |
|
#
242fed9d |
| 27-Jan-2020 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] convert fsub nsz with fneg operand to -(X + Y)
This was noted in D72521 - we need to match fneg specifically to consistently handle that pattern along with (-0.0 - X).
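A hedged sketch of the pattern (assuming the fneg is the first operand; names illustrative):

    %nx = fneg float %x
    %r  = fsub nsz float %nx, %y
    ; --> -(X + Y):
    %s  = fadd nsz float %x, %y
    %r2 = fneg float %s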
|
#
bcfa0f59 |
| 23-Jan-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[InstCombine] Move negation handling into freelyNegateValue()
Followup to D72978. This moves the existing negation handling in InstCombine into freelyNegateValue(), which makes it composable. In particular, root negations of div/zext/sext/ashr/lshr/sub can now always be performed through a shl/trunc as well.
Differential Revision: https://reviews.llvm.org/D73288
|
#
0b83c5a7 |
| 18-Jan-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[InstCombine] Combine neg of shl of sub (PR44529)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44529. We already have a combine to sink a negation through a left-shift, but it currently only works if the shift operand is negatable without creating any instructions. This patch introduces freelyNegateValue() as a more powerful extension of dyn_castNegVal(), which allows negating a value as long as this doesn't end up increasing instruction count. Specifically, this patch adds support for negating A-B to B-A.
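A sketch of the PR44529 shape (names illustrative):

    %d = sub i32 %a, %b
    %s = shl i32 %d, %c
    %n = sub i32 0, %s
    ; --> A-B is negated to B-A and the negation of the shl disappears:
    %d2 = sub i32 %b, %a
    %n2 = shl i32 %d2, %c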
This mechanism could in the future be extended to handle general negation chains that a) start at a proper 0-X negation and b) only require one operand to be freely negatable. This would end up as a weaker form of D68408 aimed at the most obviously profitable subset that eliminates a negation entirely.
Differential Revision: https://reviews.llvm.org/D72978
|