History log of /llvm-project/llvm/lib/CodeGen/MachineBlockPlacement.cpp (Results 226 – 250 of 331)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1ccca9e6 01-Dec-2015 Cong Hou <congh@google.com>

Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction.

The root cause is the rounding behavior in BranchProbability construction. We may consider

Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction.

The root cause is the rounding behavior in BranchProbability construction. We may consider to use truncation instead in the future.

llvm-svn: 254356

show more ...


Revision tags: llvmorg-3.7.1
# fa1917c6 01-Dec-2015 Cong Hou <congh@google.com>

Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.

The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New in

Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.

The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973

llvm-svn: 254348

show more ...


Revision tags: llvmorg-3.7.1-rc2
# 41cf1a5d 18-Nov-2015 Cong Hou <congh@google.com>

Improving edge probabilities computation when choosing the best successor in machine block placement.

When looking for the best successor from the outer loop for a block
belonging to an inner loop,

Improving edge probabilities computation when choosing the best successor in machine block placement.

When looking for the best successor from the outer loop for a block
belonging to an inner loop, the edge probability computation can be
improved so that edges in the inner loop are ignored. For example,
suppose we are building chains for the non-loop part of the following
code, and looking for B1's best successor. Assume the true body is very
hot, then B3 should be the best candidate. However, because of the
existence of the back edge from B1 to B0, the probability from B1 to B3
can be very small, preventing B3 to be its successor. In this patch, when
computing the probability of the edge from B1 to B3, the weight on the
back edge B1->B0 is ignored, so that B1->B3 will have 100% probability.

if (...)
do {
B0;
... // some branches
B1;
} while(...);
else
B2;
B3;


Differential revision: http://reviews.llvm.org/D10825

llvm-svn: 253414

show more ...


Revision tags: llvmorg-3.7.1-rc1
# b90b9e05 02-Nov-2015 Cong Hou <congh@google.com>

In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available.

In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawba

In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available.

In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality.

Consider a simple example shown below:

A
|
B⇆C
|
D

When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D.

This patch filters those cold blocks off from the loop chain by comparing the ratio:

LoopBBFreq / LoopFreq

to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa.


Differential revision: http://reviews.llvm.org/D11662

llvm-svn: 251833

show more ...


# 7745dbc5 19-Oct-2015 Cong Hou <congh@google.com>

Enhance loop rotation with existence of profile data in MachineBlockPlacement pass.

Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop ch

Enhance loop rotation with existence of profile data in MachineBlockPlacement pass.

Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation:

1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header.
2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop.
3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain.

Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies.

Differential revision: http://reviews.llvm.org/D10717

llvm-svn: 250754

show more ...


# 6ac07fd2 09-Oct-2015 Duncan P. N. Exon Smith <dexonsmith@apple.com>

CodeGen: Remove implicit iterator conversions from MBB.cpp

Remove implicit ilist iterator conversions from MachineBasicBlock.cpp.

I've also added an overload of `splice()` that takes a pointer, sin

CodeGen: Remove implicit iterator conversions from MBB.cpp

Remove implicit ilist iterator conversions from MachineBasicBlock.cpp.

I've also added an overload of `splice()` that takes a pointer, since
it's a natural API. This is similar to the overloads I added for
`remove()` and `erase()` in r249867.

llvm-svn: 249883

show more ...


# 77ec0770 16-Sep-2015 Craig Topper <craig.topper@gmail.com>

Fix a spelling error in the description of a statistic. NFC

llvm-svn: 247771


# 0e288234 27-Aug-2015 Reid Kleckner <rnk@google.com>

[WinEH] Add some support for code generating catchpad

We can now run 32-bit programs with empty catch bodies. The next step
is to change PEI so that we get funclet prologues and epilogues.

llvm-sv

[WinEH] Add some support for code generating catchpad

We can now run 32-bit programs with empty catch bodies. The next step
is to change PEI so that we get funclet prologues and epilogues.

llvm-svn: 246235

show more ...


Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4
# ec105872 06-Aug-2015 Cong Hou <congh@google.com>

Revert r244154 which causes some build failure. See https://llvm.org/bugs/show_bug.cgi?id=24377.

llvm-svn: 244239


# 36e7e52a 05-Aug-2015 Cong Hou <congh@google.com>

Record whether the weights on out-edges from a MBB are normalized.

1. Create a utility function normalizeEdgeWeights() in MachineBranchProbabilityInfo that normalizes a list of edge weights so that

Record whether the weights on out-edges from a MBB are normalized.

1. Create a utility function normalizeEdgeWeights() in MachineBranchProbabilityInfo that normalizes a list of edge weights so that the sum of then can fit in uint32_t.
2. Provide an interface in MachineBasicBlock to normalize its successors' weights.
3. Add a flag in MachineBasicBlock that tracks whether its successors' weights are normalized.
4. Provide an overload of getSumForBlock that accepts a non-const pointer to a MBB so that it can force normalizing this MBB's successors' weights.
5. Update several uses of getSumForBlock() by eliminating the once needed weight scale.

Differential Revision: http://reviews.llvm.org/D11442

llvm-svn: 244154

show more ...


# 924879ad 04-Aug-2015 Sanjay Patel <spatel@rotateright.com>

wrap OptSize and MinSize attributes for easier and consistent access (NFCI)

Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of

wrap OptSize and MinSize attributes for easier and consistent access (NFCI)

Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994

show more ...


Revision tags: llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1
# 0881fc11 15-Jul-2015 Cong Hou <congh@google.com>

Test commit.

This is a test commit (one blank line deleted).

llvm-svn: 242308


Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1
# f00654e3 23-Jun-2015 Alexander Kornienko <alexfh@google.com>

Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)

Apparently, the style needs to be agreed upon first.

llvm-svn: 240390


# 70bc5f13 19-Jun-2015 Alexander Kornienko <alexfh@google.com>

Fixed/added namespace ending comments using clang-tidy. NFC

The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-c

Fixed/added namespace ending comments using clang-tidy. NFC

The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137

show more ...


Revision tags: llvmorg-3.6.1, llvmorg-3.6.1-rc1
# 26d3017b 15-Apr-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] Spell the conditions the same way through out this if statement.
NFC.

llvm-svn: 235009


# cfb2b9d7 15-Apr-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] Sink a comment into the if block to which it pertains. This makes
the content of the comment make much more sense.

llvm-svn: 235007


# 9a512a48 15-Apr-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] Fix a really misleading typo in a comment.

llvm-svn: 235006


# 799003bf 23-Mar-2015 Benjamin Kramer <benny.kra@googlemail.com>

Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.

llvm-svn: 232998


# 214997c6 20-Mar-2015 Daniel Jasper <djasper@google.com>

[MBP] Don't outline short optional branches

With the option -outline-optional-branches, LLVM will place optional
branches out of line (more details on r231230).

With this patch, this is not done fo

[MBP] Don't outline short optional branches

With the option -outline-optional-branches, LLVM will place optional
branches out of line (more details on r231230).

With this patch, this is not done for short optional branches. A short
optional branch is a branch containing a single block with an
instruction count below a certain threshold (defaulting to 3). Still
everything is guarded under -outline-optional-branches).

Outlining a short branch can't significantly improve code locality. It
can however decrease performance because of the additional jmp and in
cases where the optional branch is hot. This fixes a compile time
regression I have observed in a benchmark.

Review: http://reviews.llvm.org/D8108
llvm-svn: 232802

show more ...


Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1
# 7a715dae 05-Mar-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] Use range based for-loops throughout this code. Several had
already been added and the inconsistency made choosing names and
changing code more annoying. Plus, wow are they better for this code

[MBP] Use range based for-loops throughout this code. Several had
already been added and the inconsistency made choosing names and
changing code more annoying. Plus, wow are they better for this code!

llvm-svn: 231347

show more ...


# 2fc3fe12 05-Mar-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] NFC, run clang-format over this code and tweak things to make the
result reasonable.

This code predated clang-format and so there was a reasonable amount of
crufty formatting that had accumula

[MBP] NFC, run clang-format over this code and tweak things to make the
result reasonable.

This code predated clang-format and so there was a reasonable amount of
crufty formatting that had accumulated. This should ensure that neither
myself nor others end up with formatting-only changes sneaking into
other fixes.

llvm-svn: 231341

show more ...


# d0dced58 05-Mar-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] This is no longer 'block-placement2'. ;] The old variants are long
gone, update this code to reflect that.

llvm-svn: 231340


# af7e99f2 05-Mar-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] Revert r231238 which attempted to fix a nasty bug where MBP is
just arbitrarily interleaving unrelated control flows once they get
moved "out-of-line" (both outside of natural CFG ordering and

[MBP] Revert r231238 which attempted to fix a nasty bug where MBP is
just arbitrarily interleaving unrelated control flows once they get
moved "out-of-line" (both outside of natural CFG ordering and with
diamonds that cannot be fully laid out by chaining fallthrough edges).

This easy solution doesn't work in practice, and it isn't just a small
bug. It looks like a very different strategy will be required. I'm
working on that now, and it'll again go behind some flag so that
everyone can experiment and make sure it is working well for them.

llvm-svn: 231332

show more ...


# 9a53fbe2 04-Mar-2015 Chandler Carruth <chandlerc@gmail.com>

[MBP] Fix a really horrible bug in MachineBlockPlacement, but behind
a flag for now.

First off, thanks to Daniel Jasper for really pointing out the issue
here. It's been here forever (at least, I th

[MBP] Fix a really horrible bug in MachineBlockPlacement, but behind
a flag for now.

First off, thanks to Daniel Jasper for really pointing out the issue
here. It's been here forever (at least, I think it was there when
I first wrote this code) without getting really noticed or fixed.

The key problem is what happens when two reasonably common patterns
happen at the same time: we outline multiple cold regions of code, and
those regions in turn have diamonds or other CFGs for which we can't
just topologically lay them out. Consider some C code that looks like:

if (a1()) { if (b1()) c1(); else d1(); f1(); }
if (a2()) { if (b2()) c2(); else d2(); f2(); }
done();

Now consider the case where a1() and a2() are unlikely to be true. In
that case, we might lay out the first part of the function like:

a1, a2, done;

And then we will be out of successors in which to build the chain. We go
to find the best block to continue the chain with, which is perfectly
reasonable here, and find "b1" let's say. Laying out successors gets us
to:

a1, a2, done; b1, c1;

At this point, we will refuse to lay out the successor to c1 (f1)
because there are still un-placed predecessors of f1 and we want to try
to preserve the CFG structure. So we go get the next best block, d1.

... wait for it ...

Except that the next best block *isn't* d1. It is b2! d1 is waaay down
inside these conditionals. It is much less important than b2. Except
that this is exactly what we didn't want. If we keep going we get the
entire set of the rest of the CFG *interleaved*!!!

a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2;

So we clearly need a better strategy here. =] My current favorite
strategy is to actually try to place the block whose predecessor is
closest. This very simply ensures that we unwind these kinds of CFGs the
way that is natural and fitting, and should minimize the number of cache
lines instructions are spread across.

It also happens to be *dead simple*. It's like the datastructure was
specifically set up for this use case or something. We only push blocks
onto the work list when the last predecessor for them is placed into the
chain. So the back of the worklist *is* the nearest next block.

Unfortunately, a change like this is going to cause *soooo* many
benchmarks to swing wildly. So for now I'm adding this under a flag so
that we and others can validate that this is fixing the problems
described, that it seems possible to enable, and hopefully that it fixes
more of our problems long term.

llvm-svn: 231238

show more ...


# 471e856f 04-Mar-2015 Daniel Jasper <djasper@google.com>

Add a flag to experiment with outlining optional branches.

In a CFG with the edges A->B->C and A->C, B is an optional branch.

LLVM's default behavior is to lay the blocks out naturally, i.e. A, B,

Add a flag to experiment with outlining optional branches.

In a CFG with the edges A->B->C and A->C, B is an optional branch.

LLVM's default behavior is to lay the blocks out naturally, i.e. A, B,
C, in order to improve code locality and fallthroughs. However, if a
function contains many of those optional branches only a few of which
are taken, this leads to a lot of unnecessary icache misses. Moving B
out of line can work around this.

Review: http://reviews.llvm.org/D7719
llvm-svn: 231230

show more ...


12345678910>>...14