|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2 |
|
| #
4ab40eca |
| 03-Oct-2022 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
[test][InstCombine] Update some test cases to use opaque pointers
These tests cases were converted using the script at https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
Differential Re
[test][InstCombine] Update some test cases to use opaque pointers
These tests cases were converted using the script at https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
Differential Revision: https://reviews.llvm.org/D135094
show more ...
|
|
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1 |
|
| #
acdc419c |
| 04-Feb-2022 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
[test] Use -passes=instcombine instead of -instcombine in lots of tests. NFC
Another step moving away from the deprecated syntax of specifying pass pipeline in opt.
Differential Revision: https://r
[test] Use -passes=instcombine instead of -instcombine in lots of tests. NFC
Another step moving away from the deprecated syntax of specifying pass pipeline in opt.
Differential Revision: https://reviews.llvm.org/D119081
show more ...
|
|
Revision tags: llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
| #
c23aefd7 |
| 31-Aug-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC][InstCombine] visitPHINode(): cleanup PHI CSE instruction replacement
As @nikic is pointing out in https://reviews.llvm.org/rGbf21ce7b908e#inline-4647 this must be sufficient otherwise `Elimina
[NFC][InstCombine] visitPHINode(): cleanup PHI CSE instruction replacement
As @nikic is pointing out in https://reviews.llvm.org/rGbf21ce7b908e#inline-4647 this must be sufficient otherwise `EliminateDuplicatePHINodes()` would have hit issues with it already.
show more ...
|
| #
bf21ce7b |
| 29-Aug-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[InstCombine] Take 3: Perform trivial PHI CSE
The original take 1 was 6102310d814ad73eab60a88b21dd70874f7a056f, which taught InstSimplify to do that, which seemed better at time, since we got EarlyC
[InstCombine] Take 3: Perform trivial PHI CSE
The original take 1 was 6102310d814ad73eab60a88b21dd70874f7a056f, which taught InstSimplify to do that, which seemed better at time, since we got EarlyCSE support for free.
However, it was proven that we can not do that there, the simplified-to PHI would not be reachable from the original PHI, and that is not something InstSimplify is allowed to do, as noted in the commit ed90f15efb40d26b5d3ead3bb8e9e284218e0186 that reverted it: > It appears to cause compilation non-determinism and caused stage3 mismatches.
Then there was take 2 3e69871ab5a66fb55913a2a2f5e7f5b42899a4c9, which was InstCombine-specific, but it again showed stage2-stage3 differences, and reverted in bdaa3f86a040b138c58de41d73d35b76fdec1380. This is quite alarming.
Here, let's try to change how we find existing PHI candidate: due to the worklist order, and the way PHI nodes are inserted (it may be inserted as the first one, or maybe not), let's look at *all* PHI nodes in the block.
Effects on vanilla llvm test-suite + RawSpeed: ``` | statistic name | baseline | proposed | Δ | % | \|%\| | |----------------------------------------------------|-----------|-----------|-------:|---------:|---------:| | asm-printer.EmittedInsts | 7942329 | 7942457 | 128 | 0.00% | 0.00% | | assembler.ObjectBytes | 254295632 | 254312480 | 16848 | 0.01% | 0.01% | | correlated-value-propagation.NumPhis | 18412 | 18347 | -65 | -0.35% | 0.35% | | early-cse.NumCSE | 2183283 | 2183267 | -16 | 0.00% | 0.00% | | early-cse.NumSimplify | 550105 | 541842 | -8263 | -1.50% | 1.50% | | instcombine.NumAggregateReconstructionsSimplified | 73 | 4506 | 4433 | 6072.60% | 6072.60% | | instcombine.NumCombined | 3640311 | 3644419 | 4108 | 0.11% | 0.11% | | instcombine.NumDeadInst | 1778204 | 1783205 | 5001 | 0.28% | 0.28% | | instcombine.NumPHICSEs | 0 | 22490 | 22490 | 0.00% | 0.00% | | instcombine.NumWorklistIterations | 2023272 | 2024400 | 1128 | 0.06% | 0.06% | | instcount.NumCallInst | 1758395 | 1758802 | 407 | 0.02% | 0.02% | | instcount.NumInvokeInst | 59478 | 59502 | 24 | 0.04% | 0.04% | | instcount.NumPHIInst | 330557 | 330545 | -12 | 0.00% | 0.00% | | instcount.TotalBlocks | 1077138 | 1077220 | 82 | 0.01% | 0.01% | | instcount.TotalFuncs | 101442 | 101441 | -1 | 0.00% | 0.00% | | instcount.TotalInsts | 8831946 | 8832606 | 660 | 0.01% | 0.01% | | simplifycfg.NumHoistCommonCode | 24186 | 24187 | 1 | 0.00% | 0.00% | | simplifycfg.NumInvokes | 4300 | 4410 | 110 | 2.56% | 2.56% | | simplifycfg.NumSimpl | 1019813 | 999767 | -20046 | -1.97% | 1.97% | ``` So it fires 22490 times, which is less than ~24k the take 1 did, but more than what take 2 did (22228 times) . It allows foldAggregateConstructionIntoAggregateReuse() to actually work after PHI-of-extractvalue folds did their thing. Previously SimplifyCFG would have done this PHI CSE, of all places. Additionally, allows some more `invoke`->`call` folds to happen (+110, +2.56%).
All in all, expectedly, this catches less things overall, but all the motivational cases are still caught, so all good.
show more ...
|
| #
bdaa3f86 |
| 29-Aug-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
Revert "[InstCombine] Take 2: Perform trivial PHI CSE"
While the original variant with doing this in InstSimplify (rightfully) caused questions and ultimately was detected to be a culprit of stage2-
Revert "[InstCombine] Take 2: Perform trivial PHI CSE"
While the original variant with doing this in InstSimplify (rightfully) caused questions and ultimately was detected to be a culprit of stage2-stage3 mismatch, it was expected that InstCombine-based implementation would be fine.
But apparently it's not, as http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24095/steps/compare-compilers/logs/stdio suggests.
Which suggests that somewhere in InstCombine there is a loop over nondeterministically sorted container, which causes different worklist ordering.
This reverts commit 3e69871ab5a66fb55913a2a2f5e7f5b42899a4c9.
show more ...
|
| #
3e69871a |
| 29-Aug-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[InstCombine] Take 2: Perform trivial PHI CSE
The original take was 6102310d814ad73eab60a88b21dd70874f7a056f, which taught InstSimplify to do that, which seemed better at time, since we got EarlyCSE
[InstCombine] Take 2: Perform trivial PHI CSE
The original take was 6102310d814ad73eab60a88b21dd70874f7a056f, which taught InstSimplify to do that, which seemed better at time, since we got EarlyCSE support for free.
However, it was proven that we can not do that there, the simplified-to PHI would not be reachable from the original PHI, and that is not something InstSimplify is allowed to do, as noted in the commit ed90f15efb40d26b5d3ead3bb8e9e284218e0186 that reverted it : > It appears to cause compilation non-determinism and caused stage3 mismatches.
However InstCombine already does many different optimizations, so it should be a safe place to do it here.
Note that we still can't just compare incoming values ranges, because there is no guarantee that these PHI's we'd simplify to were already re-visited and sorted. However coming up with a test is problematic.
Effects on vanilla llvm test-suite + RawSpeed: ``` | statistic name | baseline | proposed | Δ | % | |%| | |----------------------------------------------------|-----------|-----------|-------:|---------:|---------:| | instcombine.NumPHICSEs | 0 | 22228 | 22228 | 0.00% | 0.00% | | asm-printer.EmittedInsts | 7942329 | 7942456 | 127 | 0.00% | 0.00% | | assembler.ObjectBytes | 254295632 | 254313792 | 18160 | 0.01% | 0.01% | | early-cse.NumCSE | 2183283 | 2183272 | -11 | 0.00% | 0.00% | | early-cse.NumSimplify | 550105 | 541842 | -8263 | -1.50% | 1.50% | | instcombine.NumAggregateReconstructionsSimplified | 73 | 4506 | 4433 | 6072.60% | 6072.60% | | instcombine.NumCombined | 3640311 | 3666911 | 26600 | 0.73% | 0.73% | | instcombine.NumDeadInst | 1778204 | 1783318 | 5114 | 0.29% | 0.29% | | instcount.NumCallInst | 1758395 | 1758804 | 409 | 0.02% | 0.02% | | instcount.NumInvokeInst | 59478 | 59502 | 24 | 0.04% | 0.04% | | instcount.NumPHIInst | 330557 | 330549 | -8 | 0.00% | 0.00% | | instcount.TotalBlocks | 1077138 | 1077221 | 83 | 0.01% | 0.01% | | instcount.TotalFuncs | 101442 | 101441 | -1 | 0.00% | 0.00% | | instcount.TotalInsts | 8831946 | 8832611 | 665 | 0.01% | 0.01% | | simplifycfg.NumInvokes | 4300 | 4410 | 110 | 2.56% | 2.56% | | simplifycfg.NumSimpl | 1019813 | 999740 | -20073 | -1.97% | 1.97% | ``` So it fires ~22k times, which is less than ~24k the take 1 did. It allows foldAggregateConstructionIntoAggregateReuse() to actually work after PHI-of-extractvalue folds did their thing. Previously SimplifyCFG would have done this PHI CSE, of all places. Additionally, allows some more `invoke`->`call` folds to happen (+110, +2.56%).
All in all, expectedly, this catches less things overall, but all the motivational cases are still caught, so all good.
show more ...
|
| #
3ba83f2d |
| 29-Aug-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC][InstCombine] Add tests for PHI CSE
As discussed in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html even though it seems worthwhile doing so in InstSimplify, we r
[NFC][InstCombine] Add tests for PHI CSE
As discussed in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html even though it seems worthwhile doing so in InstSimplify, we really can't do that there, because the other PHI wouldn't be def-reachable from the original PHI.
So ignoring whether or not EarlyCSE should also know to do this, InstCombine is the place.
show more ...
|