History log of /llvm-project/llvm/lib/CodeGen/MachineFunctionSplitter.cpp (Results 1 – 25 of 32)
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5
# 68f7b075 23-Nov-2024 Rahman Lavaee <rahmanl@google.com>

[BasicBlockSections] Allow mixing of -basic-block-sections with MFS. (#117076)

This PR allows mixing `-basic-block-sections` with
`-enable-machine-function-splitter`. The strategy is to let
`-basic-block-sections` take precedence over functions with profiles.


Revision tags: llvmorg-19.1.4
# 735ab61a 13-Nov-2024 Kazu Hirata <kazu@google.com>

[CodeGen] Remove unused includes (NFC) (#115996)

Identified with misc-include-cleaner.


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 09989996 12-Jul-2024 paperchalice <liujunchang97@outlook.com>

[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317)

- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager.
- `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new
pass manager migration.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4
# 163eaf3b 22-Feb-2024 Daniel Hoekwater <hoekwater@google.com>

[CodeGen] Clean up MachineFunctionSplitter MBB safety checking (NFC)

Move the "is MBB safe to split" check out of `isColdBlock` and update
the comment since we're no longer using a temporary hack.


Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2
# ef1c25eb 04-Aug-2023 Daniel Hoekwater <hoekwater@google.com>

[CodeGen][AArch64] Don't split jump table basic blocks

Jump tables on AArch64 are label-relative rather than table-relative, so
having jump table destinations that are in different sections causes
problems with relocation. Jump table lookups have a max range of 1MB, so
all destinations must be in the same section as the lookup code. Both of
these restrictions can be mitigated with some careful and complex logic,
but doing so doesn't gain a huge performance benefit.

Efficiently ensuring jump tables are correct and can be compressed on
AArch64 is a TODO item. In the meantime, don't split blocks that can
cause problems.

Differential Revision: https://reviews.llvm.org/D157124


# 3dbabead 25-Aug-2023 Snehasish Kumar <snehasishk@google.com>

[CodeGen] Remove unused option in MachineFunctionSplitter.

The option was added in github.com/llvm/llvm-project/commit/90ab85a but it doesn't seem to be used. The triple check has been removed so this shouldn't be required going forward.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158885


# 8c249c44 04-Aug-2023 Daniel Hoekwater <hoekwater@google.com>

[CodeGen][AArch64] Don't split functions with a red zone on AArch64

Because unconditional branch relaxation on AArch64 grows the stack to
spill a register, splitting a function would cause the red zone to be
overwritten. Explicitly disable MFS for such functions.

Differential Revision: https://reviews.llvm.org/D157127


# 90ab85a1 09-Aug-2023 Daniel Hoekwater <hoekwater@google.com>

Reland "[CodeGen][AArch64] Make MFS testable on AArch64"

Reverted by 3d22dac6c3b97d7bb92f243886dfb0d32a5c42e9 because it depended
on b9d079d6188b50730e0a67267b7fee36008435ce, which broke some tests.


# 77596e6b 21-Aug-2023 Fangrui Song <i@maskray.me>

Revert D157750 "[Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary compilation."

This reverts commit 317a0fe5bd7113c0ac9d30b2de58ca409e5ff754.
This reverts commit 30c4b97aec60895a6905816670f493cdd1d7c546.

See post-commit discussions on https://reviews.llvm.org/D157750 that
we should use a different mechanism to handle the error with --cuda-gpu-arch=

The IR/DiagnosticInfo.cpp, warn_drv_for_elf_only, codegen tests in
clang/test/Driver, and the following driver behavior (downgrading error
to warning) changes are undesired.
```
% clang --target=riscv64 -fsplit-machine-functions -c a.c
warning: -fsplit-machine-functions is not valid for riscv64 [-Wbackend-plugin]
```


# 317a0fe5 17-Aug-2023 Han Shen <shenhan@google.com>

[Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary compilation.

When building a fatbinary, the driver invokes the compiler multiple
times with different "--target". (For example, with "-x cuda
--cuda-gpu-arch=sm_70" flags, clang will be invoked twice, once with
--target=x86_64_...., once with --target=sm_70) If we use
-fsplit-machine-functions or -fno-split-machine-functions for such
invocation, the driver reports an error.

This CL changes the behavior so:

- "-fsplit-machine-functions" is now passed to all targets. For non-X86
targets, the flag is a no-op and causes a warning.

- "-fno-split-machine-functions" now negates -fsplit-machine-functions (if
-fno-split-machine-functions appears after any -fsplit-machine-functions)
for any target triple. Previously, it caused an error.

- "-fsplit-machine-functions -Xarch_device -fno-split-machine-functions"
enables MFS on host but disables MFS for GPUs without warnings/errors.

- "-Xarch_host -fsplit-machine-functions" enables MFS on host but disables
MFS for GPUs without warnings/errors.

Reviewed by: xur, dhoekwater

Differential Revision: https://reviews.llvm.org/D157750


Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init
# f7f744a5 18-Jul-2023 Han Shen <shenhan@google.com>

[CodeGen] Separate MachineFunctionSplitter logic for different profile types.

In D152577, @xur left a post-submit comment about an awkward usage
of MFS for AutoFDO: instead of just using -fsplit-machine-functions, the
user needs to add "-mllvm -mfs-psi-cutoff=0" to choose the right logic
for AutoFDO. The compiler should choose the right default values for
such cases.

This CL separates the MFS logic for different profile types.

Reviewed By: xur, wenlei

Differential Revision: https://reviews.llvm.org/D155253


# 8df75969 10-Jul-2023 Han Shen <shenhan@google.com>

[CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO.

The original MFS work D85368 shows good performance improvement with
Instrumented FDO. However, AutoFDO or Flow-Sensitive AutoFDO (FSAFDO)
does not show performance gain. This is mainly caused by a less
accurate profile compared to the iFDO profile.

For the past few months, we have been working to improve FSAFDO
quality, like in D145171. Taking advantage of this improvement, MFS
now shows performance improvements over FSAFDO profiles.

That being said, two minor changes need to be made: 1) an FS-AutoFDO
profile generation pass needs to be added right before the MFS pass, and an
FSAFDO profile load pass is needed when FS-AutoFDO is enabled and the
MFS flag is present; 2) MFS only applies to hot functions, because we
believe (and experiments also show) that FS-AutoFDO is more accurate about
functions that have plenty of samples than about those with no or very few
samples.

With this improvement, we see a 1.2% performance improvement in the clang
benchmark, a 0.9% QPS improvement in our internal search benchmark, and
a 3%-5% improvement in an internal storage benchmark.

This is #1 of the two patches that enable the improvement.

Reviewed By: wenlei, snehasish, xur

Differential Revision: https://reviews.llvm.org/D152399


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1
# 950487bd 26-Jan-2023 Hongtao Yu <hoy@fb.com>

[Pseudo Probe] Do not instrument EH blocks.

This change avoids inserting probes into EH blocks. Pseudo probes can prevent block merging when the probes in the blocks look different. This has a chained effect on passes that incur exponential IR growth (such as jump threading), and as a consequence the compilation may time out. Not inserting probes into EH blocks could mitigate the issue. Another benefit is that both IR size and binary size are smaller. Since EH blocks are usually cold, the change should have minimal impact on profile quality.

Testing:

On two large internal benchmarks, no perf impact was seen, with 1% size savings in both the `text` and the `pseudo_probe` sections.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D142747


Revision tags: llvmorg-17-init, llvmorg-15.0.7
# 51b68573 16-Dec-2022 Fangrui Song <i@maskray.me>

[Transforms,CodeGen] std::optional::value => operator*/operator->

value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
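
The difference between the two access styles can be sketched as follows. This is only an illustration of the pattern the commit describes; `countOr` is a hypothetical helper and is not taken from the LLVM sources.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper for illustration only (not part of LLVM).
// Returns the stored count, or Fallback when no count is present.
inline std::uint64_t countOr(const std::optional<std::uint64_t> &Count,
                             std::uint64_t Fallback) {
  // Check engagement explicitly, then dereference with operator*.
  // Unlike Count.value(), operator* performs no check and never throws
  // std::bad_optional_access, so the check above it is what keeps the
  // access safe.
  if (Count)
    return *Count;
  return Fallback;
}
```

The explicit check plus `operator*` keeps the behavior well defined without pulling in the exception path that `value()` requires.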


# c589730a 05-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

[YAML] Convert Optional to std::optional


# 89fae41e 05-Dec-2022 Fangrui Song <i@maskray.me>

[IR] llvm::Optional => std::optional

Many llvm/IR/* files have been migrated by other contributors.
This migrates most remaining files.


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3
# e170d955 17-Aug-2022 Archit Saxena <archsaxe@fb.com>

Split EH code by default

The current machine function splitter relies on profile data to do profile summary analysis to split blocks into the cold section. This can limit the use of the machine function splitter, especially in cases where some form of static analysis could split out cold blocks when profile data is absent or may be faulty (consider Sample PGO).

Of all the code that can statically be marked cold, exception handling blocks are an obvious candidate (in fact, the BFI framework also tends to mark them as cold) and the largest contributor in size. In my experiments, I found that exception handling pads and all code reachable from them account for up to 6-8% of the .text section on modern production binaries. This patch introduces a flag to split out all exception handling blocks, and blocks reachable only from exception handling pads, into the cold section. This flag has been shown to give a performance win of up to 0.1% in terms of average cycles and instructions executed on an internal Facebook search service.
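
The idea of "EH pads plus blocks only reachable from them" can be sketched on a toy control-flow graph. This is a simplified model under stated assumptions, not the code in MachineFunctionSplitter.cpp: the `Block` type and `findEHOnlyBlocks` are hypothetical names, block 0 is assumed to be the function entry, and blocks that are entirely unreachable are also reported as cold here.

```cpp
#include <cstddef>
#include <vector>

// Toy basic block: successor indices plus an EH-pad marker (hypothetical type).
struct Block {
  std::vector<std::size_t> Succs;
  bool IsEHPad = false;
};

// Returns Result[i] == true when block i is an EH pad or can only be reached
// by first entering an EH pad. Block 0 is assumed to be the entry block.
inline std::vector<bool> findEHOnlyBlocks(const std::vector<Block> &CFG) {
  std::vector<bool> ReachedNormally(CFG.size(), false);
  std::vector<std::size_t> Worklist;
  if (!CFG.empty() && !CFG[0].IsEHPad) {
    ReachedNormally[0] = true;
    Worklist.push_back(0);
  }
  // Depth-first walk from the entry that never steps into an EH pad.
  while (!Worklist.empty()) {
    std::size_t Cur = Worklist.back();
    Worklist.pop_back();
    for (std::size_t S : CFG[Cur].Succs) {
      if (CFG[S].IsEHPad || ReachedNormally[S])
        continue;
      ReachedNormally[S] = true;
      Worklist.push_back(S);
    }
  }
  // Everything the walk did not reach is an EH pad or EH-only code.
  std::vector<bool> EHOnly(CFG.size());
  for (std::size_t I = 0; I < CFG.size(); ++I)
    EHOnly[I] = !ReachedNormally[I];
  return EHOnly;
}
```

A block that is reachable both from a landing pad and from normal control flow stays in the hot part under this model, since only the normal-path walk decides placement.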

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D131824


Revision tags: llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 3bb1ce23 22-Jul-2022 ARCHIT SAXENA <archsaxe@fb.com>

Add a nop instruction if a section starts with landing pad for function splitter

This change adds a nop instruction if a section starts with a landing pad. It is similar to [D73739](https://reviews.llvm.org/D73739), which avoids zero-offset landing pads in basic block sections.

Detailed description:
The current machine function splitter can create sections which themselves start with a landing pad. This places the landing pad at offset zero from LPStart.
```
.section .text.split.foo10,"ax",@progbits
foo10.cold: # %lpad
.cfi_startproc
.cfi_personality 3, __gxx_personality_v0
.cfi_lsda 3, .Lexception5
.cfi_def_cfa %rsp, 16
.Ltmp11: <--- This is a landing pad and also LPStart, as it is the start of this section
movq %rax, %rdi <--- first instruction is at offset 0 from LPStart
callq _Unwind_Resume@PLT

```
This causes the landing pad entry to become zero (.Ltmp11-foo10.cold):
```
.Lcst_begin4:
.uleb128 .Ltmp9-.Lfunc_begin2 # >> Call Site 1 <<
.uleb128 .Ltmp10-.Ltmp9 # Call between .Ltmp9 and .Ltmp10
.uleb128 .Ltmp11-foo10.cold <---This is zero # jumps to .Ltmp11
.byte 3 # On action: 2
.uleb128 .Ltmp10-.Lfunc_begin2 # >> Call Site 2 <<
.uleb128 .Lfunc_end9-.Ltmp10 # Call between .Ltmp10 and .Lfunc_end9
.byte 0 # has no landing pad
.byte 0 # On action: cleanup
.p2align 2
```
The C++ ABI assumes that no landing pad points directly to LPStart (which works in the normal case, since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at the start of such sections so that this case is avoided. Output:
```
.section .text.split.foo10,"ax",@progbits
foo10.cold: # %lpad
.cfi_startproc
.cfi_personality 3, __gxx_personality_v0
.cfi_lsda 3, .Lexception5
.cfi_def_cfa %rsp, 16
nop <--- new instruction that is added
.Ltmp11:
movq %rax, %rdi
callq _Unwind_Resume@PLT
```
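
Why offset zero is a problem can be sketched from the reader's side of the call-site table. The snippet below is a simplified model, not the actual unwinder or personality routine: `CallSiteRecord` and `resolveLandingPad` are hypothetical names, and the only rule it encodes is the one stated above, namely that a landing pad offset of 0 means "no landing pad".

```cpp
#include <cstdint>
#include <optional>

// Simplified model of one decoded call-site record (hypothetical type).
struct CallSiteRecord {
  std::uint64_t Start;      // start of the call-site region
  std::uint64_t Length;     // length of the call-site region
  std::uint64_t LandingPad; // offset from LPStart; 0 encodes "no landing pad"
  std::uint64_t Action;     // action table index
};

// Returns the landing pad address, or std::nullopt when the record encodes
// "no landing pad". A pad placed exactly at LPStart has offset 0 and is
// therefore indistinguishable from the "none" case.
inline std::optional<std::uint64_t>
resolveLandingPad(std::uint64_t LPStart, const CallSiteRecord &Record) {
  if (Record.LandingPad == 0)
    return std::nullopt;
  return LPStart + Record.LandingPad;
}
```

With a leading nop in the section (assumed here to be one byte long), the same landing pad sits at a nonzero offset and is resolved normally.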

Reviewed By: modimo, snehasish, rahmanl

Differential Revision: https://reviews.llvm.org/D130133


# 611ffcf4 14-Jul-2022 Kazu Hirata <kazu@google.com>

[llvm] Use value instead of getValue (NFC)


# a7938c74 26-Jun-2022 Kazu Hirata <kazu@google.com>

[llvm] Don't use Optional::hasValue (NFC)

This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.


# 3b7c3a65 25-Jun-2022 Kazu Hirata <kazu@google.com>

Revert "Don't use Optional::hasValue (NFC)"

This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.


# aa8feeef 25-Jun-2022 Kazu Hirata <kazu@google.com>

Don't use Optional::hasValue (NFC)


Revision tags: llvmorg-14.0.6
# e0e687a6 20-Jun-2022 Kazu Hirata <kazu@google.com>

[llvm] Don't use Optional::hasValue (NFC)


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 989f1c72 15-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b 10-Mar-2022 Nico Weber <thakis@chromium.org>

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169

