History log of /llvm-project/llvm/lib/CodeGen/MachineFunctionSplitter.cpp (Results 26 – 32 of 32)
# 7f230fee 07-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

after: 1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 3da0aeea 23-Apr-2021 Snehasish Kumar <snehasishk@google.com>

[NFC] Use hasSection instead of getSection().empty()

Use the optimized check hasSection() instead of calling
getSection().empty(). Originally suggested in D101004, but was dropped
in the commit.


# 8077d0ff 21-Apr-2021 Snehasish Kumar <snehasishk@google.com>

[CodeGen] Do not split functions with attr "implicit-section-name".

The #pragma clang section can be used at a coarse granularity to specify
the section used for bss/data/text/rodata for global objects. When
function splitting is enabled, such a function may be split into two
parts, violating user expectations.

Reference:
https://clang.llvm.org/docs/LanguageExtensions.html#specifying-section-names-for-global-objects-pragma-clang-section

Differential Revision: https://reviews.llvm.org/D101004
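
A minimal sketch of the guard this change describes, using a hypothetical `FunctionModel` stand-in rather than LLVM's own classes. Only the attribute name "implicit-section-name" comes from the commit message; the struct, its members, `shouldSkipSplitting`, and the assumption that an explicitly assigned section is also skipped are illustrative.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a function's section-related metadata.
struct FunctionModel {
  std::string section;               // explicit section name, empty if none
  std::set<std::string> attributes;  // string attributes on the function

  bool hasSection() const { return !section.empty(); }
  bool hasFnAttribute(const std::string &name) const {
    return attributes.count(name) != 0;
  }
};

// Returns true when the function should be left unsplit because the user
// pinned its section, either explicitly (assumed here) or through
// #pragma clang section, which shows up as "implicit-section-name".
bool shouldSkipSplitting(const FunctionModel &f) {
  return f.hasSection() || f.hasFnAttribute("implicit-section-name");
}
```

In this model, a function carrying the attribute stays in one piece even when its blocks would otherwise qualify as cold.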


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# 2c7077e6 09-Feb-2021 Snehasish Kumar <snehasishk@google.com>

[CodeGen] Split out cold exception handling pads.

Support for splitting exception handling pads was added in D73739. This
change updates the code to split out exception handling pads if profile
information indicates that they are cold. For a given function with
multiple landing pads, if one of them is hot they are all retained as
part of the hot code section.

Differential Revision: https://reviews.llvm.org/D96372
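
The all-or-nothing rule for landing pads can be sketched as follows. `BlockModel` and `splitColdLandingPads` are hypothetical names for illustration, not LLVM types or functions; the sketch only models the rule stated in the commit message.

```cpp
#include <vector>

// Hypothetical per-block summary; not an LLVM type.
struct BlockModel {
  bool isEHPad;        // block is an exception handling landing pad
  bool isCold;         // profile information marks the block as cold
  bool inColdSection;  // set when the block is moved to the cold section
};

// Moves landing pads to the cold section only if every landing pad in the
// function is cold; a single hot pad keeps all of them in the hot section.
void splitColdLandingPads(std::vector<BlockModel> &blocks) {
  bool allPadsCold = true;
  for (const BlockModel &b : blocks)
    if (b.isEHPad && !b.isCold)
      allPadsCold = false;
  if (!allPadsCold)
    return;
  for (BlockModel &b : blocks)
    if (b.isEHPad)
      b.inColdSection = true;
}
```

Blocks that are not landing pads are left untouched here; how they are classified is outside the scope of this sketch.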


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 7af80299 08-Dec-2020 Pan, Tao <tao.pan@intel.com>

[CodeGen] Add text section prefix for COFF object file

The text section prefix is created in CodeGenPrepare, which is independent of the object file format, while the text section name is written into the object file by TargetLoweringObjectFile, which is format dependent. This change ports the code that adds the text section prefix to the text section name from ELF to COFF.
Unlike ELF, which uses '.' as the concatenation character, COFF uses '$'. Because the concatenation character varies by format, it is split out of the text section prefix.
The text section prefix is an existing ELF feature that helps reduce icache and itlb misses. It also makes it possible to aggregate sections with the same prefix created by other compilers, e.g. v8. Furthermore, the recent Machine Function Splitter feature (basic block level text prefix section) is based on the text section prefix.

Reviewed By: pengfei, rnk

Differential Revision: https://reviews.llvm.org/D92073
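
To illustrate why the concatenation character is kept separate from the prefix, here is a small sketch. `ObjectFormat`, `prefixSeparator`, and `textSectionName` are hypothetical helpers, not LLVM APIs; only the '.' and '$' separators are taken from the commit message.

```cpp
#include <string>

enum class ObjectFormat { ELF, COFF };

// Concatenation character per object format, as described in the commit.
char prefixSeparator(ObjectFormat format) {
  return format == ObjectFormat::COFF ? '$' : '.';
}

// Builds a text section name from a base name and a separator-free prefix.
// An empty prefix leaves the base name unchanged.
std::string textSectionName(ObjectFormat format, const std::string &base,
                            const std::string &prefix) {
  if (prefix.empty())
    return base;
  return base + prefixSeparator(format) + prefix;
}
```

With the prefix stored as plain `hot`, the same value yields `.text.hot` for ELF and `.text$hot` for COFF in this model.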


Revision tags: llvmorg-11.0.1-rc1
# 24bf6ff4 09-Oct-2020 Snehasish Kumar <snehasishk@google.com>

[llvm] Update default cutoff threshold for machine function splitter.

Based on internal testing at Google we found that setting the profile
summary cutoff threshold to 999950 yields the best results in terms of
itlb and icache metrics (as observed on Intel CPUs).

*default* = Split out code if no profile count available for block
*size-%* = The fraction of bytes split out of .text and .text.hot
*itlb* = Misses per kilo instructions (MPKI) for itlb
*icache* = Misses per kilo instructions (MPKI) for L1 icache

Search1

| cutoff | size-% | itlb | icache |
|---------|---------|-----------|---------|
| default | 42.5861 | 0.0822151 | 2.46363 |
| 999999 | 44.9350 | 0.0767194 | 2.44416 |
| 999950 | 50.0660 | 0.075744 | 2.4091 |
| 999500 | 56.9158 | 0.082564 | 2.4188 |
| 995000 | 63.8625 | 0.0814927 | 2.42832 |
| 990000 | 71.7314 | 0.106906 | 2.57785 |

Search2

| cutoff | size-% | itlb | icache |
|---------|--------|----------|---------|
| default | 2.8845 | 0.626712 | 4.73245 |
| 999999 | 3.3291 | 0.602309 | 4.70045 |
| 999950 | 3.8577 | 0.587842 | 4.71632 |
| 999500 | 4.4170 | 0.63577 | 4.68351 |
| 995000 | 5.1020 | 0.657969 | 4.82272 |
| 990000 | 5.7153 | 0.719122 | 5.39496 |

Differential Revision: https://reviews.llvm.org/D89085
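
As a rough illustration of how a cutoff such as 999950 could translate into a per-block decision, the sketch below treats the cutoff as parts per million of the total profile count. `hotCountThreshold` and `isColdBlock` are hypothetical helpers, and the strict `<` comparison is this sketch's convention; LLVM's profile summary code may differ in detail.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Returns the smallest count among the hottest blocks that together account
// for at least cutoffPerMillion / 1,000,000 of the total profile count.
// Assumes the counts are small enough that the products do not overflow.
uint64_t hotCountThreshold(std::vector<uint64_t> counts,
                           uint64_t cutoffPerMillion) {
  std::sort(counts.begin(), counts.end(), std::greater<uint64_t>());
  uint64_t total = 0;
  for (uint64_t c : counts)
    total += c;
  uint64_t running = 0;
  for (uint64_t c : counts) {
    running += c;
    // Integer form of: running / total >= cutoffPerMillion / 1e6.
    if (running * 1000000 >= total * cutoffPerMillion)
      return c;
  }
  return 0;
}

// In this sketch a block is cold when its count is below the threshold.
bool isColdBlock(uint64_t count, uint64_t threshold) {
  return count < threshold;
}
```

Under this model a lower cutoff gives a higher threshold, so more blocks are classified as cold, which is consistent with size-% growing as the cutoff decreases in the tables above.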


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2
# 94faadac 05-Aug-2020 Snehasish Kumar <snehasishk@google.com>

[llvm][CodeGen] Machine Function Splitter

We introduce a codegen optimization pass which splits functions into hot and cold
parts. This pass leverages the basic block sections feature recently
introduced in LLVM from the Propeller project. The pass targets
functions with profile coverage, identifies cold blocks and moves them
to a separate section. The linker groups all cold blocks across
functions together, decreasing fragmentation and improving icache and
itlb utilization.

We evaluated the Machine Function Splitter pass on clang bootstrap and
SPECInt 2017.

For clang bootstrap we observe a mean 2.33% runtime improvement with a
~32% reduction in itlb and stlb misses. Additionally, L1 icache misses
reduced by 9.5% while L2 instruction misses reduced by 20%.

For SPECInt we report the change in IntRate for the C/C++
benchmarks. All benchmarks apart from mcf and x264 improve, on average
by 0.6%, with the maximum for deepsjeng at 1.6%.

| Benchmark       | % Change |
|-----------------|----------|
| 500.perlbench_r | 0.78     |
| 502.gcc_r       | 0.82     |
| 505.mcf_r       | -0.30    |
| 520.omnetpp_r   | 0.18     |
| 523.xalancbmk_r | 0.37     |
| 525.x264_r      | -0.46    |
| 531.deepsjeng_r | 1.61     |
| 541.leela_r     | 0.83     |
| 557.xz_r        | 0.15     |

Differential Revision: https://reviews.llvm.org/D85368
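
The hot/cold split described in this commit can be modeled with a short sketch. `ProfiledBlock`, `isCold`, and `partitionHotCold` are hypothetical names, not the pass's implementation; keeping the entry block in place and treating blocks without a profile count as cold are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-block profile record; not an LLVM type.
struct ProfiledBlock {
  int id;
  bool hasCount;   // false when no profile count is available
  uint64_t count;  // meaningful only when hasCount is true
};

// A block is cold if it has no count or its count is below the threshold.
bool isCold(const ProfiledBlock &b, uint64_t threshold) {
  return !b.hasCount || b.count < threshold;
}

// Reorders blocks so hot ones come first and cold ones are grouped at the
// end, preserving relative order. The entry block (index 0) stays in place.
// Returns the index of the first cold block.
std::size_t partitionHotCold(std::vector<ProfiledBlock> &blocks,
                             uint64_t threshold) {
  if (blocks.empty())
    return 0;
  auto firstCold = std::stable_partition(
      blocks.begin() + 1, blocks.end(),
      [threshold](const ProfiledBlock &b) { return !isCold(b, threshold); });
  return static_cast<std::size_t>(firstCold - blocks.begin());
}
```

Everything from the returned index onward would be placed in the separate cold section, which the linker can then group with the cold blocks of other functions.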
