#
0807e949 |
| 21-Sep-2018 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
revert changes from r342722
"[AMDGPU] lower-switch in preISel as a workaround for legacy DA"
This broke regression tests. The first breakage was noticed here: http://lab.llvm.org:8011/builders/lld-
revert changes from r342722
"[AMDGPU] lower-switch in preISel as a workaround for legacy DA"
This broke regression tests. The first breakage was noticed here: http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/23549
llvm-svn: 342743
show more ...
|
#
2de7653f |
| 21-Sep-2018 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[AMDGPU] lower-switch in preISel as a workaround for legacy DA
Summary: The default target of the switch instruction may sometimes be an "unreachable" block, when it is guaranteed that one of the ca
[AMDGPU] lower-switch in preISel as a workaround for legacy DA
Summary: The default target of the switch instruction may sometimes be an "unreachable" block, when it is guaranteed that one of the cases is always taken. The dominator tree concludes that such a switch instruction does not have an immediate post dominator. This confuses divergence analysis, which is unable to propagate sync dependence to the targets of the switch instruction.
As a workaround, the AMDGPU target now invokes lower-switch as a preISel pass. LowerSwitch is designed to handle the unreachable default target correctly, allowing the divergence analysis to locate the correct immediate dominator of the now-lowered switch.
Reviewers: arsenm, nhaehnle
Reviewed By: nhaehnle
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, simoll
Differential Revision: https://reviews.llvm.org/D52221
llvm-svn: 342722
show more ...
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3 |
|
#
35617ed4 |
| 30-Aug-2018 |
Nicolai Haehnle <nhaehnle@gmail.com> |
[NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).
The purpose of this patch is to free up the
[NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).
The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes.
Patch by: Simon Moll
Reviewed By: nhaehnle
Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D50434
Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071
show more ...
|
Revision tags: llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3 |
|
#
f209649d |
| 14-Jun-2018 |
Hiroshi Inoue <inouehrs@jp.ibm.com> |
[NFC] fix trivial typos in comments
llvm-svn: 334687
|
Revision tags: llvmorg-6.0.1-rc2 |
|
#
5f915461 |
| 23-May-2018 |
Changpeng Fang <changpeng.fang@gmail.com> |
StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly
Summary: StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get
StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly
Summary: StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get the order. The only problem with it is that sometimes backedges for outer loops will be visited before backedges for inner loops. To solve this problem, a loop depth based approach has been used to make sure all blocks in this loop has been visited before moving on to outer loop.
However, we found a problem for a SubRegion which is a loop itself:
--> BB1 --> BB2 --> BB3 -->
In this case, BB2 is a SubRegion (loop), and thus its loopdepth is different than that of BB1 and BB3. This fact will lead BB2 to be placed in the wrong order.
In this work, we treat the SubRegion as a special case and use its exit block to determine the loop and its depth to guard the sorting.
Reviewers: arsenm, jlebar
Differential Revision: https://reviews.llvm.org/D46912
llvm-svn: 333111
show more ...
|
#
3c5fd145 |
| 15-May-2018 |
Marek Olsak <marek.olsak@amd.com> |
StructurizeCFG: fix inverting conditions
Author: Samuel Pitoiset
Without this patch, it appears to me that we are selecting the wrong operand when inverting conditions. In the attached test, it wil
StructurizeCFG: fix inverting conditions
Author: Samuel Pitoiset
Without this patch, it appears to me that we are selecting the wrong operand when inverting conditions. In the attached test, it will select %tmp3 instead of %tmp4. To fix it, just use 'A' as everywhere.
This fixes a regression introduced by "[PatternMatch] define m_Not using m_Xor and cst_pred_ty"
https://reviews.llvm.org/D46351
llvm-svn: 332403
show more ...
|
#
d34e60ca |
| 14-May-2018 |
Nicola Zaghen <nicola.zaghen@imgtec.com> |
Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/
Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
show more ...
|
#
4dfcc4a7 |
| 01-May-2018 |
Adrian Prantl <aprantl@apple.com> |
Remove @brief commands from doxygen comments, too.
This is a follow-up to r331272.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into
Remove @brief commands from doxygen comments, too.
This is a follow-up to r331272.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
https://reviews.llvm.org/D46290
llvm-svn: 331275
show more ...
|
#
5f8f34e4 |
| 01-May-2018 |
Adrian Prantl <aprantl@apple.com> |
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they ar
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
show more ...
|
Revision tags: llvmorg-6.0.1-rc1 |
|
#
eb7311ff |
| 04-Apr-2018 |
Nicolai Haehnle <nhaehnle@gmail.com> |
StructurizeCFG: Test for branch divergence correctly
Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in bas
StructurizeCFG: Test for branch divergence correctly
Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform.
This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend.
As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass.
Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions.
Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4
Differential Revision: https://reviews.llvm.org/D43743
llvm-svn: 329165
show more ...
|
Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2 |
|
#
a373d18e |
| 28-Mar-2018 |
David Blaikie <dblaikie@gmail.com> |
Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, bec
Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils.
llvm-svn: 328717
show more ...
|
Revision tags: llvmorg-5.0.2-rc1, llvmorg-6.0.0 |
|
#
e4e1de60 |
| 24-Feb-2018 |
Adam Nemet <anemet@apple.com> |
Revert "StructurizeCFG: Test for branch divergence correctly"
This reverts commit r325881.
Breaks many bots
llvm-svn: 326037
|
Revision tags: llvmorg-6.0.0-rc3 |
|
#
43c1115c |
| 23-Feb-2018 |
Nicolai Haehnle <nhaehnle@gmail.com> |
StructurizeCFG: Test for branch divergence correctly
Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of t
StructurizeCFG: Test for branch divergence correctly
Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform.
This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend.
Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4
Reviewers: arsenm, rampitec, jlebar
Subscribers: wdng, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D40546
llvm-svn: 325881
show more ...
|
Revision tags: llvmorg-6.0.0-rc2 |
|
#
4afb64e4 |
| 24-Jan-2018 |
Nicolai Haehnle <nhaehnle@gmail.com> |
Revert r321751, "StructurizeCFG: Fix broken backedge detection"
It causes regressions in various OpenGL test suites.
Keep the test cases introduced by r321751 as XFAIL, and add a test case for the
Revert r321751, "StructurizeCFG: Fix broken backedge detection"
It causes regressions in various OpenGL test suites.
Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression.
Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015 llvm-svn: 323355
show more ...
|
Revision tags: llvmorg-6.0.0-rc1 |
|
#
8070882b |
| 03-Jan-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
StructurizeCFG: Fix broken backedge detection
The work order was changed in r228186 from SCC order to RPO with an arbitrary sorting function. The sorting function attempted to move inner loop nodes
StructurizeCFG: Fix broken backedge detection
The work order was changed in r228186 from SCC order to RPO with an arbitrary sorting function. The sorting function attempted to move inner loop nodes earlier. This was was apparently relying on an assumption that every block in a given loop / the same loop depth would be seen before visiting another loop. In the broken testcase, a block outside of the loop was encountered before moving onto another block in the same loop. The testcase would then structurize such that one blocks unconditional successor could never be reached.
Revert to plain RPO for the analysis phase. This fixes detecting edges as backedges that aren't really.
The processing phase does use another visited set, and I'm unclear on whether the order there is as important. An arbitrary order doesn't work, and triggers some infinite loops. The reversed RPO list seems to work and is closer to the order that was used before, minus the arbitary custom sorting.
A few of the changed tests now produce smaller code, and a few are slightly worse looking.
llvm-svn: 321751
show more ...
|
#
8dcfa137 |
| 29-Dec-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
StructurizeCFG: Use phi iterator range
llvm-svn: 321568
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
#
99241d75 |
| 20-Oct-2017 |
Eugene Zelenko <eugene.zelenko@gmail.com> |
[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316241
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1 |
|
#
713b5ba2 |
| 09-Jul-2017 |
Hiroshi Inoue <inouehrs@jp.ibm.com> |
fix trivial typos; NFC
sucessor -> successor
llvm-svn: 307488
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
6bda14b3 |
| 06-Jun-2017 |
Chandler Carruth <chandlerc@gmail.com> |
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
show more ...
|
Revision tags: llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1 |
|
#
4474652c |
| 24-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Revert "StructurizeCFG: Directly invert cmp instructions"
This reverts commit r300732. This breaks a few tests. I think the problem is related to adding more uses of the condition that don't yet exi
Revert "StructurizeCFG: Directly invert cmp instructions"
This reverts commit r300732. This breaks a few tests. I think the problem is related to adding more uses of the condition that don't yet exist at this point.
llvm-svn: 301242
show more ...
|
#
d3406bc4 |
| 19-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
StructurizeCFG: Directly invert cmp instructions
The most common case for a branch condition is a single use compare. Directly invert the branch predicate rather than adding a lot of xor i1 true whi
StructurizeCFG: Directly invert cmp instructions
The most common case for a branch condition is a single use compare. Directly invert the branch predicate rather than adding a lot of xor i1 true which the DAG will have to fold later.
This produces nicer to read structurizer output.
This produces some random changes in codegen due to the DAG swapping branch conditions itself, and then does a poor job of dealing with those inverts.
llvm-svn: 300732
show more ...
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1 |
|
#
0668cd2c |
| 10-Jan-2017 |
Serge Pavlov <sepavloff@gmail.com> |
[StructurizeCfg] Update dominator info.
In some cases StructurizeCfg updates root node, but dominator info remains unchanges, it causes crash when expensive checks are enabled. To cope with this pro
[StructurizeCfg] Update dominator info.
In some cases StructurizeCfg updates root node, but dominator info remains unchanges, it causes crash when expensive checks are enabled. To cope with this problem a new method was added to DominatorTreeBase that allows adding new root nodes, it is called in StructurizeCfg to put dominator tree in sync.
This change fixes PR27488.
Differential Revision: https://reviews.llvm.org/D28114
llvm-svn: 291530
show more ...
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2 |
|
#
96e29155 |
| 29-Nov-2016 |
Justin Lebar <jlebar@google.com> |
[StructurizeCFG] Fix infinite loop in rebuildSSA.
Michel Dänzer reported that r288051, "[StructurizeCFG] Use range-based for loops", introduced a bug into rebuildSSA, wherein we were iterating over
[StructurizeCFG] Fix infinite loop in rebuildSSA.
Michel Dänzer reported that r288051, "[StructurizeCFG] Use range-based for loops", introduced a bug into rebuildSSA, wherein we were iterating over an instruction's use list while modifying it, without taking care to do this correctly.
llvm-svn: 288200
show more ...
|
Revision tags: llvmorg-3.9.1-rc1 |
|
#
3aec10ca |
| 28-Nov-2016 |
Justin Lebar <jlebar@google.com> |
[StructurizeCFG] Use range-based for loops.
Reviewers: arsenm
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D27000
llvm-svn: 288051
|
#
62c20d8b |
| 28-Nov-2016 |
Justin Lebar <jlebar@google.com> |
[StructurizeCFG] Refactor NearestCommonDominator.
Summary: As far as I can tell, doing our own computations in NearestCommonDominator is a false optimization -- DomTree will build up what appears to
[StructurizeCFG] Refactor NearestCommonDominator.
Summary: As far as I can tell, doing our own computations in NearestCommonDominator is a false optimization -- DomTree will build up what appears to be exactly this data when it decides it's worthwhile. Moreover, by building the cache ourselves, we cannot take advantage of the cache that the domtree might have available.
In addition, I am not convinced of the correctness of the original code. In particular, setting ResultIndex = 1 on the first addBlock instead of setting it to 0 is quite fishy. Similarly, it's not clear to me that setting IndexMap[Node] = 0 for every node as we walk up the tree finding a common parent is correct. But rather than ponder over these questions, I'd rather just make the code do the obviously-correct thing.
This patch also changes the NearestCommonDominator API a bit, improving the names and getting rid of the boolean parameter in addBlock -- see http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html
Reviewers: arsenm
Subscribers: aemerson, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D26998
llvm-svn: 288050
show more ...
|