History log of /llvm-project/llvm/lib/CodeGen/BasicBlockSections.cpp (Results 51 – 54 of 54)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# d2696dec 17-Sep-2020 Snehasish Kumar <snehasishk@google.com>

[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section.

This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. W

[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section.

This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. With a custom prefix such
as ".text.split." (D87840), lld can place them in a separate output section.
The benefits are -

* Empirically shown to improve icache and itlb metrics by 3-5%
(absolute) compared to placing split parts in .text.unlikely.
* Mitigates against poor profiles, eg samplePGO profiles used with the
machine function splitter. Optimizations such as hugepage remapping can
make different decisions at the section granularity.
* Enables section granularity hotness monitoring (checking on the
decisions made during compilation vs sample data from production).

Differential Revision: https://reviews.llvm.org/D87813

show more ...


# 7841e21c 14-Sep-2020 Rahman Lavaee <rahmanl@google.com>

Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols.

This patch introduces the new .bb_addr_map section feature

Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols.

This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:

| Address of the function | -> 8 bytes (pointer size)
| Number of basic blocks in this function (>0) | -> ULEB128

The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:

| Offset of the basic block relative to function begin | -> ULEB128
| Binary size of the basic block | -> ULEB128
| BB metadata | -> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ]

The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.

For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html

Reviewed By: MaskRay, snehasish

Differential Revision: https://reviews.llvm.org/D85408

show more ...


Revision tags: llvmorg-11.0.0-rc2
# 94faadac 05-Aug-2020 Snehasish Kumar <snehasishk@google.com>

[llvm][CodeGen] Machine Function Splitter

We introduce a codegen optimization pass which splits functions into hot and cold
parts. This pass leverages the basic block sections feature recently
intro

[llvm][CodeGen] Machine Function Splitter

We introduce a codegen optimization pass which splits functions into hot and cold
parts. This pass leverages the basic block sections feature recently
introduced in LLVM from the Propeller project. The pass targets
functions with profile coverage, identifies cold blocks and moves them
to a separate section. The linker groups all cold blocks across
functions together, decreasing fragmentation and improving icache and
itlb utilization.

We evaluated the Machine Function Splitter pass on clang bootstrap and
SPECInt 2017.

For clang bootstrap we observe a mean 2.33% runtime improvement with a
~32% reduction in itlb and stlb misses. Additionally, L1 icache misses
reduced by 9.5% while L2 instruction misses reduced by 20%.

For SPECInt we report the change in IntRate the C/C++
benchmarks. All benchmarks apart from mcf and x264 improve, on average
by 0.6% with the max for deepsjeng at 1.6%.

Benchmark % Change
500.perlbench_r 0.78
502.gcc_r 0.82
505.mcf_r -0.30
520.omnetpp_r 0.18
523.xalancbmk_r 0.37
525.x264_r -0.46
531.deepsjeng_r 1.61
541.leela_r 0.83
557.xz_r 0.15

Differential Revision: https://reviews.llvm.org/D85368

show more ...


# 8d943a92 06-Aug-2020 Snehasish Kumar <snehasishk@google.com>

[NFC] Rename BBSectionsPrepare -> BasicBlockSections.

Rename the BBSectionsPrepare pass as suggested by the review comment in
https://reviews.llvm.org/D85368.

Differential Revision: https://reviews

[NFC] Rename BBSectionsPrepare -> BasicBlockSections.

Rename the BBSectionsPrepare pass as suggested by the review comment in
https://reviews.llvm.org/D85368.

Differential Revision: https://reviews.llvm.org/D85380

show more ...


123