History log of /llvm-project/llvm/lib/CodeGen/MachineBlockPlacement.cpp (Results 51 – 75 of 331)
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1
# 50b62731 29-Jul-2021 Guozhi Wei <carrot@google.com>

[MBP] findBestLoopTopHelper should exit if OldTop is not a chain header

Function findBestLoopTopHelper tries to find a new loop top block which can also
fall through to OldTop, but it's impossible if OldTop is not a chain header, so
it should exit immediately.

Differential Revision: https://reviews.llvm.org/D106329



Revision tags: llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# d8aba75a 07-May-2021 Fangrui Song <i@maskray.me>

Internalize some cl::opt global variables or move them under namespace llvm


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3
# cd880442 28-Jan-2021 Nicholas Guy <nicholas.guy@arm.com>

[CodeGen][AArch64] Add TargetInstrInfo hook to modify the TailDuplicateSize default threshold

Different targets might handle branch performance differently, so this patch allows for
targets to specify the TailDuplicateSize threshold. Said threshold defines how small a branch
can be and still be duplicated to generate straight-line code instead.
This patch also specifies said override values for the AArch64 subtarget.

Differential Revision: https://reviews.llvm.org/D95631



Revision tags: llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 7bc76fd0 31-Dec-2020 Kazu Hirata <kazu@google.com>

[CodeGen] Construct SmallVector with iterator ranges (NFC)


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 687e80be 16-Dec-2020 Guozhi Wei <carrot@google.com>

[MBP] Add whole chain to BlockFilterSet instead of individual BB

Currently we add an individual BB to BlockFilterSet if its frequency satisfies

LoopFreq / Freq <= LoopToColdBlockRatio

LoopFreq is the edge frequency from outside the loop to the loop header.
LoopToColdBlockRatio is a command line parameter.

This doesn't make sense, since we always lay out whole chains, not individual BBs.

It may also cause a subtle problem. Sometimes the LoopFreq of an inner loop is
smaller than the LoopFreq of the outer loop, so a BB can be in the
BlockFilterSet of the inner loop but not in the BlockFilterSet of the outer
loop, like .cold in the test case. It is therefore added to the chain of the
inner loop. When working on the outer loop, .cold is not added to
BlockFilterSet, so the edge to its successor .problem is not counted in
UnscheduledPredecessors of the .problem chain. But the other blocks in the
inner loop are added to BlockFilterSet, so the whole inner loop chain can be
laid out, and markChainSuccessors is called to decrease UnscheduledPredecessors
of the following chains. markChainSuccessors calls markBlockSuccessors for
every BB, even one that is not in BlockFilterSet, like .cold, so the .problem
chain's UnscheduledPredecessors is decreased, although this edge was never
counted in fillWorkLists. As a result, the .problem chain's
UnscheduledPredecessors becomes 0 while it still has an unscheduled
predecessor, .pred, which causes problems in the subsequent successor BB
selection algorithms.

Differential Revision: https://reviews.llvm.org/D89088
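
As an illustration only (this is not code from the patch), the filter condition and the chain-level insertion described above can be sketched with toy types. ToyBlock, collectFilterSet, and the integer block and chain ids are hypothetical names introduced here; the real pass works on MachineBasicBlock and BlockChain objects.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a basic block: an id, a frequency, and the id of
// the chain the block already belongs to.
struct ToyBlock {
  int Id;
  uint64_t Freq;
  int ChainId;
};

// Sketch: when a block passes LoopFreq / Freq <= Ratio, add every block of
// its chain to the filter set instead of only the block itself.
std::set<int> collectFilterSet(const std::vector<ToyBlock> &Blocks,
                               uint64_t LoopFreq, uint64_t Ratio) {
  // Group block ids by chain so a whole chain can be inserted at once.
  std::map<int, std::vector<int>> Chains;
  for (const ToyBlock &B : Blocks)
    Chains[B.ChainId].push_back(B.Id);

  std::set<int> Filter;
  for (const ToyBlock &B : Blocks) {
    if (Filter.count(B.Id))
      continue; // Already added through its chain.
    if (B.Freq == 0 || LoopFreq / B.Freq > Ratio)
      continue; // Too cold relative to the loop entry frequency.
    for (int Id : Chains[B.ChainId])
      Filter.insert(Id);
  }
  return Filter;
}
```

With this shape, a cold block that shares a chain with a hot block ends up in the set together with the rest of its chain, so the set never contains only part of a chain.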



# d50d7c37 14-Dec-2020 Guozhi Wei <carrot@google.com>

[MBP] Prevent rotating a chain contains entry block

The entry block should always be the first BB in a function.
So we should not rotate a chain that contains the entry block.

Differential Revision: https://reviews.llvm.org/D92882
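
A minimal sketch of such a guard, assuming a chain is modeled as a vector of integer block ids (rotateChainUnlessEntry and its parameters are hypothetical and are not the pass's actual interface):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Rotate Chain so that the block at NewTopIdx becomes the first block, unless
// the chain contains EntryBlock. Returns true if the chain was rotated.
bool rotateChainUnlessEntry(std::vector<int> &Chain, std::size_t NewTopIdx,
                            int EntryBlock) {
  if (NewTopIdx >= Chain.size())
    return false; // Nothing sensible to rotate to.
  if (std::find(Chain.begin(), Chain.end(), EntryBlock) != Chain.end())
    return false; // The entry block must stay first in the function.
  std::rotate(Chain.begin(),
              Chain.begin() + static_cast<std::ptrdiff_t>(NewTopIdx),
              Chain.end());
  return true;
}
```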



# ee5b5b7a 14-Dec-2020 Kazu Hirata <kazu@google.com>

[CodeGen] Use llvm::erase_value (NFC)


# a553ac97 05-Dec-2020 Kazu Hirata <kazu@google.com>

[CodeGen] llvm::erase_if (NFC)


Revision tags: llvmorg-11.0.1-rc1
# 68403af0 22-Nov-2020 Kazu Hirata <kazu@google.com>

[MBP] Remove unused declaration shouldPredBlockBeOutlined (NFC)

The function was introduced on Jun 12, 2016 in commit
071d0f180794f7819c44026815614ce8fa00a3bd. Its definition was removed
on Mar 2, 2017 in commit 1393761e0ca3fe8271245762f78daf4d5208cd77.



# e42f6c0a 23-Oct-2020 Han Shen <shenhan@google.com>

Revert "[MBP] Add whole chain to BlockFilterSet instead of individual BB"

This reverts commit adfb5415010fbbc009a4a6298cfda7a6ed4fa6d4.

This is reverted because it caused a Chrome error: https://crbug.com/1140168



# adfb5415 14-Oct-2020 Guozhi Wei <carrot@google.com>

[MBP] Add whole chain to BlockFilterSet instead of individual BB

Currently we add an individual BB to BlockFilterSet if its frequency satisfies

LoopFreq / Freq <= LoopToColdBlockRatio

LoopFreq is the edge frequency from outside the loop to the loop header.
LoopToColdBlockRatio is a command line parameter.

This doesn't make sense, since we always lay out whole chains, not individual BBs.

It may also cause a subtle problem. Sometimes the LoopFreq of an inner loop is
smaller than the LoopFreq of the outer loop, so a BB can be in the
BlockFilterSet of the inner loop but not in the BlockFilterSet of the outer
loop, like .cold in the test case. It is therefore added to the chain of the
inner loop. When working on the outer loop, .cold is not added to
BlockFilterSet, so the edge to its successor .problem is not counted in
UnscheduledPredecessors of the .problem chain. But the other blocks in the
inner loop are added to BlockFilterSet, so the whole inner loop chain can be
laid out, and markChainSuccessors is called to decrease UnscheduledPredecessors
of the following chains. markChainSuccessors calls markBlockSuccessors for
every BB, even one that is not in BlockFilterSet, like .cold, so the .problem
chain's UnscheduledPredecessors is decreased, although this edge was never
counted in fillWorkLists. As a result, the .problem chain's
UnscheduledPredecessors becomes 0 while it still has an unscheduled
predecessor, .pred, which causes problems in the subsequent successor BB
selection algorithms.

Differential Revision: https://reviews.llvm.org/D89088



Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4
# fd75ad86 23-Sep-2020 Guozhi Wei <carrot@google.com>

[MBFIWrapper] Add a new function getBlockProfileCount

MBFIWrapper keeps track of the block frequencies of newly created and modified
blocks, and modified block frequencies should also affect the block profile
count. The class doesn't provide a getBlockProfileCount interface, so users can
only query the underlying MBFI for profile counts. The underlying MBFI doesn't
know about the modifications made in MBFIWrapper, so it either returns a stale
profile count for a modified block or simply crashes on new blocks.

So this patch adds the function getBlockProfileCount to class MBFIWrapper to
handle new or modified blocks.

Differential Revision: https://reviews.llvm.org/D87802
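
The idea can be sketched with toy classes. ToyMBFI, ToyMBFIWrapper, the integer block ids, and the bool-plus-out-parameter signatures are all assumptions made for illustration and do not mirror the real LLVM interfaces. In particular, the toy underlying analysis reports an unknown block by returning false, whereas the message above says the real one may crash.

```cpp
#include <cstdint>
#include <map>

// Toy underlying analysis: relative frequencies plus the entry count that
// anchors them to absolute profile counts. EntryFreq is assumed non-zero.
struct ToyMBFI {
  std::map<int, uint64_t> Freq;
  uint64_t EntryFreq = 1;
  uint64_t EntryCount = 0;

  uint64_t countFromFreq(uint64_t F) const {
    return F * EntryCount / EntryFreq;
  }

  // Returns false for blocks this analysis has never seen.
  bool getBlockProfileCount(int BB, uint64_t &Count) const {
    auto It = Freq.find(BB);
    if (It == Freq.end())
      return false;
    Count = countFromFreq(It->second);
    return true;
  }
};

// Toy wrapper: remembers frequencies of new or modified blocks and prefers
// them when asked for a profile count.
class ToyMBFIWrapper {
  const ToyMBFI &MBFI;
  std::map<int, uint64_t> UpdatedFreq;

public:
  explicit ToyMBFIWrapper(const ToyMBFI &M) : MBFI(M) {}

  void setBlockFreq(int BB, uint64_t F) { UpdatedFreq[BB] = F; }

  bool getBlockProfileCount(int BB, uint64_t &Count) const {
    auto It = UpdatedFreq.find(BB);
    if (It != UpdatedFreq.end()) {
      // Derive the count from the updated frequency, not the stale one.
      Count = MBFI.countFromFreq(It->second);
      return true;
    }
    return MBFI.getBlockProfileCount(BB, Count);
  }
};
```

Querying the wrapper rather than the underlying object is what keeps the count consistent with the frequencies the pass has already rewritten.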



Revision tags: llvmorg-11.0.0-rc3
# 6913812a 20-Sep-2020 Fangrui Song <i@maskray.me>

Fix some clang-tidy bugprone-argument-comment issues


Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1
# 28759e9f 21-Jul-2020 Guozhi Wei <carrot@google.com>

[MBP] Use profile count to compute tail dup cost if it is available

The current tail duplication in the machine block placement pass uses block
frequency information in its cost model. But a frequency number has only
relative meaning, compared to other basic blocks in the same function. A large
frequency number doesn't mean the block is hot, and a small frequency number
doesn't mean it is cold.

To overcome this problem, this patch uses the profile count in the cost model
if it's available, so that we can tail duplicate basic blocks that are really
hot.

Differential Revision: https://reviews.llvm.org/D83265



Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 78c69a00 01-Jul-2020 Yuanfang Chen <yuanfang.chen@sony.com>

[NFC] Clean up uses of MachineModuleInfoWrapperPass


Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3
# 1978309d 19-Feb-2020 James Y Knight <jyknight@google.com>

MachineBasicBlock::updateTerminator now requires an explicit layout successor.

Previously, it tried to infer the correct destination block from the
successor list, but this is a rather tricky prospect, given the
existence of successors that occur mid-block, such as invoke, and
potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in
particular, would be problematic, because its successor blocks are not
distinct from "normal" successors, as EHPads are.)

Instead, require the caller to pass in the expected fallthrough
successor explicitly. In most callers, the correct block is
immediately clear. But, in MachineBlockPlacement, we do need to record
the original ordering, before starting to reorder blocks.

Unfortunately, the goal of decoupling the behavior of end-of-block
jumps from the successor list has not been fully accomplished in this
patch, as there is currently no other way to determine whether a block
is intended to fall-through, or end as unreachable. Further work is
needed there.

Differential Revision: https://reviews.llvm.org/D79605



# 97f92261 02-May-2020 Benjamin Kramer <benny.kra@googlemail.com>

[MBP] tuple->pair. NFC.

std::pair has a trivial copy ctor, std::tuple doesn't.
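
The property can be checked with standard type traits. This is a small sketch; the element types are chosen arbitrarily, and the result for std::tuple may depend on the standard library implementation, so only the std::pair case is asserted.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// std::pair's copy constructor is defaulted, so it is trivial when both
// element types are trivially copy constructible.
constexpr bool PairCopyIsTrivial =
    std::is_trivially_copy_constructible<std::pair<unsigned, unsigned>>::value;

// Implementation-dependent: inspect this value rather than relying on it.
constexpr bool TupleCopyIsTrivial =
    std::is_trivially_copy_constructible<std::tuple<unsigned, unsigned>>::value;

static_assert(PairCopyIsTrivial,
              "std::pair of trivially copyable members copies trivially");
```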


# 11455a79 11-Apr-2020 Hongtao Yu <hoy@fb.com>

[CodeGen] Allow partial tail duplication in Machine Block Placement.

Summary: A count profile may affect tail duplication's heuristic, causing a block to be duplicated into only some of its predecessors. This is not allowed in the Machine Block Placement pass, where an assert will go off. I'm removing the assert and making the optimization bail out when such a case happens.

Reviewers: wenlei, davidxl, Carrot

Reviewed By: Carrot

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77748



# c3417592 24-Mar-2020 Hiroshi Yamauchi <yamauchi@google.com>

Revert "Include static prof data when collecting loop BBs"

This reverts commit 129c911efaa492790c251b3eb18e4db36b55cbc5.

Due to an internal benchmark regression.


# 129c911e 19-Feb-2020 Bill Wendling <isanbard@gmail.com>

Include static prof data when collecting loop BBs

Summary:
If the programmer adds static profile data to a branch---i.e. uses
"__builtin_expect()" or similar---then we should honor it. Otherwise,
"__builtin_expect()" is ignored in crucial situations. So we trust that
the programmer knows what they're doing until proven wrong.

Subscribers: hiraditya, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74809



Revision tags: llvmorg-10.0.0-rc2
# 369d086d 12-Feb-2020 Guozhi Wei <carrot@google.com>

[MBP] Partial tail duplication into hot predecessors

The current tail duplication embedded in MBP duplicates a BB into all or none of its predecessors without much cost analysis. So sometimes a BB is duplicated into cold predecessors, and in other cases the duplication into hot predecessors is missed.

This patch improves tail duplication in 3 aspects:

1. A successor can be duplicated into part of its predecessors.
2. The benefit analysis is more fine-grained; combined with 1, a successor is now duplicated into hot predecessors only.
3. If a successor can't be duplicated into one predecessor, it doesn't impact the duplication into other predecessors.

Differential Revision: https://reviews.llvm.org/D73387
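
The three aspects above can be sketched as a per-predecessor selection. This is a toy model: ToyPred, pickDupPreds, and the single MinHotCount threshold are hypothetical, and the patch's actual benefit analysis is not reproduced here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical predecessor record: its id, how often the edge into the
// successor is taken, and whether duplication into it is possible at all.
struct ToyPred {
  int Id;
  uint64_t EdgeCount;
  bool CanDuplicate;
};

// Choose the predecessors that should receive a copy of the successor.
std::vector<int> pickDupPreds(const std::vector<ToyPred> &Preds,
                              uint64_t MinHotCount) {
  std::vector<int> Picked;
  for (const ToyPred &P : Preds) {
    if (!P.CanDuplicate)
      continue; // One unsuitable predecessor does not block the others.
    if (P.EdgeCount < MinHotCount)
      continue; // Cold predecessor: duplicating here is not worth it.
    Picked.push_back(P.Id);
  }
  return Picked;
}
```

The decision is made independently for each predecessor, which is what allows duplication into only part of the predecessor set.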



Revision tags: llvmorg-10.0.0-rc1
# ac8da31a 29-Jan-2020 Hiroshi Yamauchi <yamauchi@google.com>

[PGO][PGSO] Handle MBFIWrapper

Some code gen passes use MBFIWrapper to keep track of the frequency of new
blocks. This was not taken into account and could lead to incorrect frequencies
as MBFI silently returns zero frequency for unknown/new blocks.

Add a variant for MBFIWrapper in the PGSO query interface.

Depends on D73494.



# 2c03c899 27-Jan-2020 Hiroshi Yamauchi <yamauchi@google.com>

[MBFI] Move BranchFolding::MBFIWrapper to its own files. NFC.

Summary:
To avoid header file circular dependency issues in passing updated MBFI (in
MBFIWrapper) to the interface of profile guided size optimizations.

A prep step for (and split off of) D73381.

Reviewers: davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73494



# 020041d9 21-Jan-2020 Krzysztof Parzyszek <kparzysz@quicinc.com>

Update spelling of {analyze,insert,remove}Branch in strings and comments

These names have been changed from CamelCase to camelCase, but there were
many places (comments mostly) that still used the old names.

This change is NFC.



Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3
# d9ae4939 05-Dec-2019 Hiroshi Yamauchi <yamauchi@google.com>

[PGO][PGSO] Instrument the code gen / target passes.

Summary:
Split off of D67120.

Add the profile guided size optimization instrumentation / queries in the code
gen or target passes. This doesn't enable the size optimizations in those passes
yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass
queries).

A second try after D71072 was reverted.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71149
