MachineBlockPlacement.cpp - OpenGrok history log for /llvm-project/llvm/lib/CodeGen/MachineBlockPlacement.cpp

Revision	Date	Author	Comments
# 1ccca9e6	01-Dec-2015	Cong Hou <congh@google.com>	Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction. The root cause is the rounding behavior in BranchProbability construction. We may consider to use truncation instead in the future. llvm-svn: 254356
Revision tags: llvmorg-3.7.1
# fa1917c6	01-Dec-2015	Cong Hou <congh@google.com>	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348
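The all-known or all-unknown rule described in this entry can be illustrated with a small sketch. `SuccessorList` and its members are hypothetical names for illustration, not LLVM's `MachineBasicBlock` API, and probabilities are plain `double` values rather than `BranchProbability`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a block's successor list (not LLVM's class).
class SuccessorList {
  std::vector<double> Probs; // values are meaningful only when Known is true
  bool Known = false;        // whether the stored probabilities are known

public:
  // Add a successor whose probability is unknown (the default case).
  void addSuccessor() {
    assert((Probs.empty() || !Known) && "mixing unknown with known probabilities");
    Known = false;
    Probs.push_back(0.0);
  }

  // Add a successor with a known probability.
  void addSuccessor(double Prob) {
    assert((Probs.empty() || Known) && "mixing known with unknown probabilities");
    Known = true;
    Probs.push_back(Prob);
  }

  // Known lists return the stored value; unknown lists fall back to 1/N.
  double probability(std::size_t Idx) const {
    assert(Idx < Probs.size() && "successor index out of range");
    return Known ? Probs[Idx] : 1.0 / static_cast<double>(Probs.size());
  }
};
```

With four successors added through the no-argument overload, each one reports a probability of 1/4 in this sketch.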
Revision tags: llvmorg-3.7.1-rc2
# 41cf1a5d	18-Nov-2015	Cong Hou <congh@google.com>	Improving edge probabilities computation when choosing the best successor in machine block placement. When looking for the best successor from the outer loop for a block belonging to an inner loop, the edge probability computation can be improved so that edges in the inner loop are ignored. For example, suppose we are building chains for the non-loop part of the following code, and looking for B1's best successor. Assume the true body is very hot, then B3 should be the best candidate. However, because of the existence of the back edge from B1 to B0, the probability from B1 to B3 can be very small, preventing B3 to be its successor. In this patch, when computing the probability of the edge from B1 to B3, the weight on the back edge B1->B0 is ignored, so that B1->B3 will have 100% probability.
    if (...)
      do {
        B0;
        ... // some branches
        B1;
      } while(...);
    else
      B2;
    B3;
Differential revision: http://reviews.llvm.org/D10825 llvm-svn: 253414
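A minimal sketch of the adjusted computation described in this entry, assuming integer edge weights as they existed at the time. `Edge` and `adjustedProbability` are hypothetical names, and the `StaysInInnerLoop` flag stands in for whatever test the pass uses to recognise edges that remain inside the inner loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical edge record: a weight plus a flag for edges to be ignored.
struct Edge {
  unsigned Weight;
  bool StaysInInnerLoop; // e.g. the back edge B1->B0
};

// Probability of Succs[Idx] once inner-loop edges are left out of the sum.
double adjustedProbability(const std::vector<Edge> &Succs, std::size_t Idx) {
  assert(Idx < Succs.size() && !Succs[Idx].StaysInInnerLoop);
  unsigned long long Sum = 0;
  for (const Edge &E : Succs)
    if (!E.StaysInInnerLoop)
      Sum += E.Weight;
  assert(Sum > 0 && "candidate edge must carry some weight");
  return static_cast<double>(Succs[Idx].Weight) / static_cast<double>(Sum);
}
```

For B1 with a back edge of weight 99 and an exit edge to B3 of weight 1, the raw probability of B1->B3 would be 1%, while the adjusted value in this sketch is 100%.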
Revision tags: llvmorg-3.7.1-rc1
# b90b9e05	02-Nov-2015	Cong Hou <congh@google.com>	In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available. In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality. Consider a simple example shown below:
    A
    |
    B⇆C
    |
    D
When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D. This patch filters those cold blocks off from the loop chain by comparing the ratio: LoopBBFreq / LoopFreq to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa. Differential revision: http://reviews.llvm.org/D11662 llvm-svn: 251833
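The 20% filter from this entry might be sketched as follows. `LoopBlock` and `hotLoopBlocks` are hypothetical names, and frequencies are modelled as `double` values rather than LLVM's block frequency type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for a block inside the loop.
struct LoopBlock {
  std::string Name;
  double Freq; // profile frequency of the block (LoopBBFreq)
};

// Keep only blocks with LoopBBFreq / LoopFreq >= MinRatio (20% by default).
std::vector<std::string> hotLoopBlocks(const std::vector<LoopBlock> &Blocks,
                                       double LoopFreq, double MinRatio = 0.2) {
  assert(LoopFreq > 0.0 && "loop frequency must be positive");
  std::vector<std::string> Hot;
  for (const LoopBlock &B : Blocks)
    if (B.Freq / LoopFreq >= MinRatio)
      Hot.push_back(B.Name);
  return Hot;
}
```

With a loop frequency of 10, a block B of frequency 100 stays in the chain while a block C of frequency 1 (ratio 10%) is filtered off.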
# 7745dbc5	19-Oct-2015	Cong Hou <congh@google.com>	Enhance loop rotation with existence of profile data in MachineBlockPlacement pass. Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation: 1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header. 2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop. 3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain. Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies. Differential revision: http://reviews.llvm.org/D10717 llvm-svn: 250754
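The three costs listed in this entry can be sketched under a deliberately simplified model. It assumes each block has at most one exit edge and one edge to the next block in circular chain order, and that only the last block of the rotated chain can fall through to code outside the loop. `ChainBlock`, `rotationCost` and `bestRotation` are hypothetical names, not the pass's own functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-block profile data for a loop chain.
struct ChainBlock {
  double ExitFreq; // frequency of the edge leaving the loop, 0 if none
  double NextFreq; // frequency of the edge to the next block in circular
                   // chain order, 0 if none
};

// Cost of the rotation that places Chain[Head] first.
double rotationCost(const std::vector<ChainBlock> &Chain, std::size_t HeaderIdx,
                    double EntryFreq, std::size_t Head) {
  const std::size_t N = Chain.size();
  assert(N > 0 && HeaderIdx < N && Head < N);
  const std::size_t Tail = (Head + N - 1) % N;
  // 1. Entry edge misses its fall-through unless the header is placed first.
  double Cost = (Head == HeaderIdx) ? 0.0 : EntryFreq;
  // 2. Exit edges from blocks other than the last one cannot fall through.
  for (std::size_t I = 0; I < N; ++I)
    if (I != Tail)
      Cost += Chain[I].ExitFreq;
  // 3. The edge from the last block back to the first is never a fall-through.
  Cost += Chain[Tail].NextFreq;
  return Cost;
}

// Index of the block to place first; ties keep the earliest candidate.
std::size_t bestRotation(const std::vector<ChainBlock> &Chain,
                         std::size_t HeaderIdx, double EntryFreq) {
  std::size_t Best = 0;
  for (std::size_t Head = 1; Head < Chain.size(); ++Head)
    if (rotationCost(Chain, HeaderIdx, EntryFreq, Head) <
        rotationCost(Chain, HeaderIdx, EntryFreq, Best))
      Best = Head;
  return Best;
}
```

For a two-block loop whose header exits with frequency 10 and which is entered with frequency 1, this model prefers placing the body first so that the header ends the chain and its exit becomes a fall-through.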
# 6ac07fd2	09-Oct-2015	Duncan P. N. Exon Smith <dexonsmith@apple.com>	CodeGen: Remove implicit iterator conversions from MBB.cpp Remove implicit ilist iterator conversions from MachineBasicBlock.cpp. I've also added an overload of `splice()` that takes a pointer, since it's a natural API. This is similar to the overloads I added for `remove()` and `erase()` in r249867. llvm-svn: 249883
# 77ec0770	16-Sep-2015	Craig Topper <craig.topper@gmail.com>	Fix a spelling error in the description of a statistic. NFC llvm-svn: 247771
# 0e288234	27-Aug-2015	Reid Kleckner <rnk@google.com>	[WinEH] Add some support for code generating catchpad We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4
# ec105872	06-Aug-2015	Cong Hou <congh@google.com>	Revert r244154 which causes some build failure. See https://llvm.org/bugs/show_bug.cgi?id=24377. llvm-svn: 244239
# 36e7e52a	05-Aug-2015	Cong Hou <congh@google.com>	Record whether the weights on out-edges from a MBB are normalized. 1. Create a utility function normalizeEdgeWeights() in MachineBranchProbabilityInfo that normalizes a list of edge weights so that the sum of them can fit in uint32_t. 2. Provide an interface in MachineBasicBlock to normalize its successors' weights. 3. Add a flag in MachineBasicBlock that tracks whether its successors' weights are normalized. 4. Provide an overload of getSumForBlock that accepts a non-const pointer to a MBB so that it can force normalizing this MBB's successors' weights. 5. Update several uses of getSumForBlock() by eliminating the once needed weight scale. Differential Revision: http://reviews.llvm.org/D11442 llvm-svn: 244154
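One way to normalize weights so that their sum fits in `uint32_t`, as item 1 of this entry describes, is to divide every weight by a common scale. The function below borrows the name `normalizeEdgeWeights` from the entry, but its signature and body are assumptions for illustration, not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Scale all weights down by a common factor so their sum fits in uint32_t.
// Returns the scale that was applied (1 if no scaling was needed).
std::uint32_t normalizeEdgeWeights(std::vector<std::uint32_t> &Weights) {
  const std::uint64_t Max = std::numeric_limits<std::uint32_t>::max();
  std::uint64_t Sum = 0;
  for (std::uint32_t W : Weights)
    Sum += W;
  if (Sum <= Max)
    return 1;
  // Scale > Sum / Max, so the sum of the truncated quotients stays below Max.
  const std::uint64_t Scale = Sum / Max + 1;
  assert(Scale <= Max && "scale must fit in uint32_t");
  for (std::uint32_t &W : Weights)
    W = static_cast<std::uint32_t>(W / Scale);
  return static_cast<std::uint32_t>(Scale);
}
```

A weight smaller than the scale truncates to zero in this sketch, so a caller that needs every edge to keep a nonzero weight would have to handle that case separately.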
# 924879ad	04-Aug-2015	Sanjay Patel <spatel@rotateright.com>	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
Revision tags: llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1
# 0881fc11	15-Jul-2015	Cong Hou <congh@google.com>	Test commit. This is a test commit (one blank line deleted). llvm-svn: 242308
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1
# f00654e3	23-Jun-2015	Alexander Kornienko <alexfh@google.com>	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390
# 70bc5f13	19-Jun-2015	Alexander Kornienko <alexfh@google.com>	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
Revision tags: llvmorg-3.6.1, llvmorg-3.6.1-rc1
# 26d3017b	15-Apr-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] Spell the conditions the same way through out this if statement. NFC. llvm-svn: 235009
# cfb2b9d7	15-Apr-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] Sink a comment into the if block to which it pertains. This makes the content of the comment make much more sense. llvm-svn: 235007
# 9a512a48	15-Apr-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] Fix a really misleading typo in a comment. llvm-svn: 235006
# 799003bf	23-Mar-2015	Benjamin Kramer <benny.kra@googlemail.com>	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998
# 214997c6	20-Mar-2015	Daniel Jasper <djasper@google.com>	[MBP] Don't outline short optional branches With the option -outline-optional-branches, LLVM will place optional branches out of line (more details on r231230). With this patch, this is not done for short optional branches. A short optional branch is a branch containing a single block with an instruction count below a certain threshold (defaulting to 3). Still everything is guarded under -outline-optional-branches. Outlining a short branch can't significantly improve code locality. It can however decrease performance because of the additional jmp and in cases where the optional branch is hot. This fixes a compile time regression I have observed in a benchmark. Review: http://reviews.llvm.org/D8108 llvm-svn: 232802
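The "short optional branch" rule in this entry reduces to a small predicate. `OptionalBranch`, `isShortOptionalBranch` and `shouldOutline` are hypothetical names; only the single-block condition and the default threshold of 3 come from the commit message.

```cpp
#include <vector>

// Hypothetical description of an optional branch: one instruction count
// per block the branch contains.
struct OptionalBranch {
  std::vector<unsigned> BlockInstrCounts;
};

// A branch is "short" if it is a single block with fewer instructions than
// the threshold.
bool isShortOptionalBranch(const OptionalBranch &Br, unsigned Threshold = 3) {
  return Br.BlockInstrCounts.size() == 1 && Br.BlockInstrCounts[0] < Threshold;
}

// Short branches stay in line; everything else remains eligible for outlining.
bool shouldOutline(const OptionalBranch &Br, unsigned Threshold = 3) {
  return !isShortOptionalBranch(Br, Threshold);
}
```

In this sketch a single block with two instructions stays in line, while a single block with three instructions or any multi-block branch remains a candidate for outlining.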
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1
# 7a715dae	05-Mar-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] Use range based for-loops throughout this code. Several had already been added and the inconsistency made choosing names and changing code more annoying. Plus, wow are they better for this code! llvm-svn: 231347
# 2fc3fe12	05-Mar-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] NFC, run clang-format over this code and tweak things to make the result reasonable. This code predated clang-format and so there was a reasonable amount of crufty formatting that had accumulated. This should ensure that neither myself nor others end up with formatting-only changes sneaking into other fixes. llvm-svn: 231341
# d0dced58	05-Mar-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] This is no longer 'block-placement2'. ;] The old variants are long gone, update this code to reflect that. llvm-svn: 231340
# af7e99f2	05-Mar-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] Revert r231238 which attempted to fix a nasty bug where MBP is just arbitrarily interleaving unrelated control flows once they get moved "out-of-line" (both outside of natural CFG ordering and with diamonds that cannot be fully laid out by chaining fallthrough edges). This easy solution doesn't work in practice, and it isn't just a small bug. It looks like a very different strategy will be required. I'm working on that now, and it'll again go behind some flag so that everyone can experiment and make sure it is working well for them. llvm-svn: 231332
# 9a53fbe2	04-Mar-2015	Chandler Carruth <chandlerc@gmail.com>	[MBP] Fix a really horrible bug in MachineBlockPlacement, but behind a flag for now.
First off, thanks to Daniel Jasper for really pointing out the issue here. It's been here forever (at least, I think it was there when I first wrote this code) without getting really noticed or fixed.
The key problem is what happens when two reasonably common patterns happen at the same time: we outline multiple cold regions of code, and those regions in turn have diamonds or other CFGs for which we can't just topologically lay them out. Consider some C code that looks like:
    if (a1()) {
      if (b1()) c1(); else d1();
      f1();
    }
    if (a2()) {
      if (b2()) c2(); else d2();
      f2();
    }
    done();
Now consider the case where a1() and a2() are unlikely to be true. In that case, we might lay out the first part of the function like:
    a1, a2, done;
And then we will be out of successors in which to build the chain. We go to find the best block to continue the chain with, which is perfectly reasonable here, and find "b1" let's say. Laying out successors gets us to:
    a1, a2, done; b1, c1;
At this point, we will refuse to lay out the successor to c1 (f1) because there are still un-placed predecessors of f1 and we want to try to preserve the CFG structure. So we go get the next best block, d1.
... wait for it ...
Except that the next best block isn't d1. It is b2! d1 is waaay down inside these conditionals. It is much less important than b2. Except that this is exactly what we didn't want. If we keep going we get the entire set of the rest of the CFG interleaved!!!
    a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2;
So we clearly need a better strategy here. =] My current favorite strategy is to actually try to place the block whose predecessor is closest. This very simply ensures that we unwind these kinds of CFGs the way that is natural and fitting, and should minimize the number of cache lines instructions are spread across.
It also happens to be dead simple. It's like the datastructure was specifically set up for this use case or something. We only push blocks onto the work list when the last predecessor for them is placed into the chain. So the back of the worklist is the nearest next block.
Unfortunately, a change like this is going to cause soooo many benchmarks to swing wildly. So for now I'm adding this under a flag so that we and others can validate that this is fixing the problems described, that it seems possible to enable, and hopefully that it fixes more of our problems long term. llvm-svn: 231238
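The worklist behaviour described in this entry (push a block once its last predecessor is placed, then continue from the back of the worklist) can be sketched with a toy model. `Layout` and its members are hypothetical, blocks are plain integer indices, and fall-through chaining is left out so that only the last-in, first-out selection is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of block placement driven by a LIFO worklist.
struct Layout {
  std::vector<std::vector<int>> Succs; // successor indices per block
  std::vector<int> UnplacedPreds;      // predecessors not yet placed
  std::vector<int> WorkList;           // blocks whose predecessors are all placed
  std::vector<int> Order;              // final placement order

  explicit Layout(const std::vector<std::vector<int>> &S)
      : Succs(S), UnplacedPreds(S.size(), 0) {
    for (const std::vector<int> &Out : Succs)
      for (int To : Out)
        ++UnplacedPreds[static_cast<std::size_t>(To)];
  }

  // Place a block; successors become candidates once their last
  // predecessor has been placed.
  void place(int B) {
    Order.push_back(B);
    for (int To : Succs[static_cast<std::size_t>(B)])
      if (--UnplacedPreds[static_cast<std::size_t>(To)] == 0)
        WorkList.push_back(To);
  }

  bool hasCandidates() const { return !WorkList.empty(); }

  // The back of the worklist is the most recently enabled block, i.e. the
  // one whose last predecessor was placed nearest to the end of the layout.
  int takeNearest() {
    assert(!WorkList.empty() && "no candidate blocks left");
    int B = WorkList.back();
    WorkList.pop_back();
    return B;
  }
};
```

Placing the entry block and then repeatedly calling `place(takeNearest())` keeps each region's blocks together instead of interleaving them, which is the effect the message aims for.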
# 471e856f	04-Mar-2015	Daniel Jasper <djasper@google.com>	Add a flag to experiment with outlining optional branches. In a CFG with the edges A->B->C and A->C, B is an optional branch. LLVM's default behavior is to lay the blocks out naturally, i.e. A, B, C, in order to improve code locality and fallthroughs. However, if a function contains many of those optional branches only a few of which are taken, this leads to a lot of unnecessary icache misses. Moving B out of line can work around this. Review: http://reviews.llvm.org/D7719 llvm-svn: 231230
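The optional-branch shape from this entry (edges A->B->C and A->C) can be checked with a small predicate. The `CFG` alias and `isOptionalBranch` are hypothetical, and the requirement that B has exactly one successor is an assumption of this sketch, since the message gives only the three-edge example.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical CFG representation: block name -> set of successor names.
using CFG = std::map<std::string, std::set<std::string>>;

// True if B is an optional branch of A: A->B, B has a single successor C,
// and A also branches directly to C.
bool isOptionalBranch(const CFG &G, const std::string &A, const std::string &B) {
  auto AIt = G.find(A);
  auto BIt = G.find(B);
  if (AIt == G.end() || BIt == G.end())
    return false;
  const std::set<std::string> &ASuccs = AIt->second;
  const std::set<std::string> &BSuccs = BIt->second;
  if (ASuccs.count(B) == 0 || BSuccs.size() != 1)
    return false;
  const std::string &C = *BSuccs.begin();
  // Exclude degenerate self-loops and edges straight back to A.
  return C != A && C != B && ASuccs.count(C) != 0;
}
```

For the graph A->{B, C}, B->{C}, the predicate holds for (A, B) and fails for (A, C).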