MachineFunctionSplitter.cpp - OpenGrok history log for /llvm-project/llvm/lib/CodeGen/MachineFunctionSplitter.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5
# 68f7b075	23-Nov-2024	Rahman Lavaee <rahmanl@google.com>	[BasicBlockSections] Allow mixing of -basic-block-sections with MFS. (#117076) This PR allows mixing `-basic-block-sections` with `-enable-machine-function-splitter`. The strategy is to let `-basic-block-sections` take precedence over functions with profiles.
Revision tags: llvmorg-19.1.4
# 735ab61a	13-Nov-2024	Kazu Hirata <kazu@google.com>	[CodeGen] Remove unused includes (NFC) (#115996) Identified with misc-include-cleaner.
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 09989996	12-Jul-2024	paperchalice <liujunchang97@outlook.com>	[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317) - Add `MachineBlockFrequencyAnalysis`. - Add `MachineBlockFrequencyPrinterPass`. - Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager. - `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new pass manager migration.
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4
# 163eaf3b	22-Feb-2024	Daniel Hoekwater <hoekwater@google.com>	[CodeGen] Clean up MachineFunctionSplitter MBB safety checking (NFC) Move the "is MBB safe to split" check out of `isColdBlock` and update the comment since we're no longer using a temporary hack.
Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2
# ef1c25eb	04-Aug-2023	Daniel Hoekwater <hoekwater@google.com>	[CodeGen][AArch64] Don't split jump table basic blocks Jump tables on AArch64 are label-relative rather than table-relative, so having jump table destinations that are in different sections causes problems with relocation. Jump table lookups have a max range of 1MB, so all destinations must be in the same section as the lookup code. Both of these restrictions can be mitigated with some careful and complex logic, but doing so doesn't gain a huge performance benefit. Efficiently ensuring jump tables are correct and can be compressed on AArch64 is a TODO item. In the meantime, don't split blocks that can cause problems. Differential Revision: https://reviews.llvm.org/D157124
# 3dbabead	25-Aug-2023	Snehasish Kumar <snehasishk@google.com>	[CodeGen] Remove unused option in MachineFunctionSplitter. The option was added in github.com/llvm/llvm-project/commit/90ab85a but it doesn't seem to be used. The triple check has been removed so this shouldn't be required going forward. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D158885
# 8c249c44	04-Aug-2023	Daniel Hoekwater <hoekwater@google.com>	[CodeGen][AArch64] Don't split functions with a red zone on AArch64 Because unconditional branch relaxation on AArch64 grows the stack to spill a register, splitting a function would cause the red zone to be overwritten. Explicitly disable MFS for such functions. Differential Revision: https://reviews.llvm.org/D157127
# 90ab85a1	09-Aug-2023	Daniel Hoekwater <hoekwater@google.com>	Reland "[CodeGen][AArch64] Make MFS testable on AArch64" Reverted by 3d22dac6c3b97d7bb92f243886dfb0d32a5c42e9 because it depended on b9d079d6188b50730e0a67267b7fee36008435ce, which broke some tests.
# 77596e6b	21-Aug-2023	Fangrui Song <i@maskray.me>	Revert D157750 "[Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary compilation." This reverts commit 317a0fe5bd7113c0ac9d30b2de58ca409e5ff754. This reverts commit 30c4b97aec60895a6905816670f493cdd1d7c546. See post-commit discussions on https://reviews.llvm.org/D157750 that we should use a different mechanism to handle the error with --cuda-gpu-arch= The IR/DiagnosticInfo.cpp, warn_drv_for_elf_only, codegen tests in clang/test/Driver, and the following driver behavior (downgrading error to warning) changes are undesired. ``` % clang --target=riscv64 -fsplit-machine-functions -c a.c warning: -fsplit-machine-functions is not valid for riscv64 [-Wbackend-plugin] ```
# 317a0fe5	17-Aug-2023	Han Shen <shenhan@google.com>	[Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary compilation. When building a fatbinary, the driver invokes the compiler multiple times with different "--target". (For example, with "-x cuda --cuda-gpu-arch=sm_70" flags, clang will be invoked twice, once with --target=x86_64_...., once with --target=sm_70) If we use -fsplit-machine-functions or -fno-split-machine-functions for such invocation, the driver reports an error. This CL changes the behavior so: - "-fsplit-machine-functions" is now passed to all targets, for non-X86 targets, the flag is a NOOP and causes a warning. - "-fno-split-machine-functions" now negates -fsplit-machine-functions (if -fno-split-machine-functions appears after any -fsplit-machine-functions) for any target triple, previously, it causes an error. - "-fsplit-machine-functions -Xarch_device -fno-split-machine-functions" enables MFS on host but disables MFS for GPUs without warnings/errors. - "-Xarch_host -fsplit-machine-functions" enables MFS on host but disables MFS for GPUs without warnings/errors. Reviewed by: xur, dhoekwater Differential Revision: https://reviews.llvm.org/D157750
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init
# f7f744a5	18-Jul-2023	Han Shen <shenhan@google.com>	[CodeGen] Separate MachineFunctionSplitter logic for different profile types. In D152577 @xur has a post-submit comment regarding an awkward usage of MFS for AutoFDO - instead of just using -fsplit-machine-function, the user needs to add "-mllvm -mfs-psi-cutoff=0" to choose the right logic for AutoFDO. The compiler should choose the right default values for such case. This CL separates MFS logic for different profile types. Reviewed By: xur, wenlei Differential Revision: https://reviews.llvm.org/D155253
# 8df75969	10-Jul-2023	Han Shen <shenhan@google.com>	[CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO. The original MFS work D85368 shows good performance improvement with Instrumented FDO. However, AutoFDO or Flow-Sensitive AutoFDO (FSAFDO) does not show performance gain. This is mainly caused by a less accurate profile compared to the iFDO profile. For the past few months, we have been working to improve FSAFDO quality, like in D145171. Taking advantage of this improvement, MFS now shows performance improvements over FSAFDO profiles. That being said, 2 minor changes need to be made, 1) An FS-AutoFDO profile generation pass needs to be added right before MFS pass and an FSAFDO profile load pass is needed when FS-AutoFDO is enabled and the MFS flag is present. 2) MFS only applies to hot functions, because we believe (and experiment also shows) FS-AutoFDO is more accurate about functions that have plenty of samples than those with no or very few samples. With this improvement, we see a 1.2% performance improvement in clang benchmark, 0.9% QPS improvement in our internal search benchmark, and 3%-5% improvement in internal storage benchmark. This is #1 of the two patches that enables the improvement. Reviewed By: wenlei, snehasish, xur Differential Revision: https://reviews.llvm.org/D152399
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1
# 950487bd	26-Jan-2023	Hongtao Yu <hoy@fb.com>	[Pseudo Probe] Do not instrument EH blocks. This change avoids inserting probes to EH blocks. Pseudo probe can prevent block merging when probes in the blocks look different. This has a chained effect to passes incurring exponential IR growth (such as jump threading) and as a consequence the compilation may time out. Not inserting probes to EH blocks could mitigate the issue. Another benefit is that both IR size and binary size are smaller. Since EH blocks are usually cold, the change should have minimal impact to profile quality. Testing: Out of two internal large benchmarks, no perf impact seen. 1% size savings to both the `text` and the `pseudo_probe` section. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D142747
Revision tags: llvmorg-17-init, llvmorg-15.0.7
# 51b68573	16-Dec-2022	Fangrui Song <i@maskray.me>	[Transforms,CodeGen] std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
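The accessor difference behind this change can be illustrated with a small standalone sketch. It is not taken from the LLVM sources, and the helper names `checkedGet`, `uncheckedGet`, and `emptyValueThrows` are hypothetical. `value()` checks for emptiness and throws `std::bad_optional_access`, whereas `operator*` performs no check and leaves it to the caller to test the optional first.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: uses the checked accessor. value() throws
// std::bad_optional_access when the optional is empty.
int checkedGet(const std::optional<int> &O) { return O.value(); }

// Hypothetical helper: uses the unchecked accessor. The caller must
// guarantee that O holds a value; dereferencing an empty optional is
// undefined behavior.
int uncheckedGet(const std::optional<int> &O) {
  assert(O && "caller must check the optional first");
  return *O;
}

// Returns true if value() on an empty optional throws as documented.
bool emptyValueThrows() {
  std::optional<int> Empty;
  try {
    (void)checkedGet(Empty);
  } catch (const std::bad_optional_access &) {
    return true;
  }
  return false;
}
```

In code that already tests the optional before use, the unchecked form avoids the extra branch and the reference to the throwing helper, which appears to be the motivation stated in the commit message.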
# c589730a	05-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	[YAML] Convert Optional to std::optional
# 89fae41e	05-Dec-2022	Fangrui Song <i@maskray.me>	[IR] llvm::Optional => std::optional Many llvm/IR/* files have been migrated by other contributors. This migrates most remaining files.
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3
# e170d955	17-Aug-2022	Archit Saxena <archsaxe@fb.com>	Split EH code by default The current machine function splitter is reliant on profile data to do profile summary analysis to split blocks into cold section. This may sometimes limit the usage of machine function splitter especially in cases where we could do some form of static analysis to split out cold blocks if profile data is absent or profile data which may be faulty (Consider Sample PGO). Of all code that could statically be marked cold Exception handling blocks are one of them (In fact BFI framework also tends to mark them as cold), and the most in size contribution. In my experiments I found out Exception handling pads and all code reachable from there account for up to 6-8% of the .text section on modern production binaries. This patch introduces a flag to split out all Exception handling blocks and blocks only reachable from Exceptional Handling pad to cold section. This flag has shown to give a performance win of up to 0.1% in terms of average cycles and instructions executed on internal facebook search service. Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D131824
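The idea of "blocks only reachable from an exception handling pad" can be sketched on a toy control-flow graph. This is a minimal illustration under stated assumptions, not the pass's actual implementation: the `Block` struct and `findEHOnlyBlocks` function are hypothetical and do not use LLVM's `MachineBasicBlock` API. The sketch walks the graph from the entry block, refuses to step into EH pads, and treats everything the walk did not reach as a candidate for the cold section.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG node; this is not LLVM's MachineBasicBlock.
struct Block {
  bool IsEHPad = false;
  std::vector<std::size_t> Succs; // indices of successor blocks
};

// Marks every block that a traversal from Entry does not reach when the
// traversal refuses to enter EH pads. EH pads are therefore always marked,
// and so are blocks that are unreachable altogether in this toy model.
std::vector<bool> findEHOnlyBlocks(const std::vector<Block> &CFG,
                                   std::size_t Entry) {
  assert(Entry < CFG.size() && "entry index out of range");
  std::vector<bool> ReachedNormally(CFG.size(), false);
  std::vector<std::size_t> Worklist{Entry};
  ReachedNormally[Entry] = true;
  while (!Worklist.empty()) {
    std::size_t Cur = Worklist.back();
    Worklist.pop_back();
    for (std::size_t S : CFG[Cur].Succs) {
      // Skip EH pads (entered only by unwinding) and visited blocks.
      if (CFG[S].IsEHPad || ReachedNormally[S])
        continue;
      ReachedNormally[S] = true;
      Worklist.push_back(S);
    }
  }
  std::vector<bool> EHOnly(CFG.size());
  for (std::size_t I = 0; I < CFG.size(); ++I)
    EHOnly[I] = !ReachedNormally[I];
  return EHOnly;
}
```

For a four-block graph where block 0 branches to a normal block 1 and a landing pad 2, and block 2 falls through to a cleanup block 3, the function marks blocks 2 and 3 and leaves blocks 0 and 1 unmarked.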
Revision tags: llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 3bb1ce23	22-Jul-2022	ARCHIT SAXENA <archsaxe@fb.com>	Add a nop instruction if a section starts with landing pad for function splitter This change adds a nop instruction if section starts with landing pad. This change is like [D73739](https://reviews.llvm.org/D73739) which avoids zero offset landing pad in basic block sections. Detailed description: The current machine functions splitter can create sections which start with a landing pad themselves. This places landing pad at offset zero from LPStart.
```
.section .text.split.foo10,"ax",@progbits
foo10.cold:                      # %lpad
  .cfi_startproc
  .cfi_personality 3, __gxx_personality_v0
  .cfi_lsda 3, .Lexception5
  .cfi_def_cfa %rsp, 16
.Ltmp11:                         <--- This is a Landing pad and also LP Start as it is start of this section
  movq %rax, %rdi                <--- first instruction is at offset 0 from LPStart
  callq _Unwind_Resume@PLT
```
This will cause landing pad entries to become zero (.Ltmp11-foo10.cold)
```
.Lcst_begin4:
  .uleb128 .Ltmp9-.Lfunc_begin2    # >> Call Site 1 <<
  .uleb128 .Ltmp10-.Ltmp9          # Call between .Ltmp9 and .Ltmp10
  .uleb128 .Ltmp11-foo10.cold      <---This is zero # jumps to .Ltmp11
  .byte 3                          # On action: 2
  .uleb128 .Ltmp10-.Lfunc_begin2   # >> Call Site 2 <<
  .uleb128 .Lfunc_end9-.Ltmp10     # Call between .Ltmp10 and .Lfunc_end9
  .byte 0                          # has no landing pad
  .byte 0                          # On action: cleanup
  .p2align 2
```
The C++ ABI somehow assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at start of such sections so that such a case could be avoided.
Output:
```
.section .text.split.foo10,"ax",@progbits
foo10.cold:                      # %lpad
  .cfi_startproc
  .cfi_personality 3, __gxx_personality_v0
  .cfi_lsda 3, .Lexception5
  .cfi_def_cfa %rsp, 16
  nop                            <--- new instruction that is added
.Ltmp11:
  movq %rax, %rdi
  callq _Unwind_Resume@PLT
```
Reviewed By: modimo, snehasish, rahmanl Differential Revision: https://reviews.llvm.org/D130133
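The ambiguity described in this commit can be modeled in a few lines. The sketch below is a simplified illustration, not the unwinder's or LLVM's code: `CallSite`, `hasLandingPad`, and `encode` are hypothetical names, addresses are plain integers, and the one-byte size used for the nop in the usage note is an assumption for x86-64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one call-site record: the landing pad is stored as
// an offset from LPStart, and an offset of 0 is read as "no landing pad".
struct CallSite {
  std::uint64_t LandingPadOffset;
};

// Mirrors the convention from the commit message: zero means "none".
bool hasLandingPad(const CallSite &CS) { return CS.LandingPadOffset != 0; }

// Builds the record for a landing pad at address LPad in a section whose
// LPStart is SectionStart.
CallSite encode(std::uint64_t SectionStart, std::uint64_t LPad) {
  assert(LPad >= SectionStart && "landing pad precedes section start");
  return CallSite{LPad - SectionStart};
}
```

With a section starting at `0x1000`, `encode(0x1000, 0x1000)` yields an offset of zero, so `hasLandingPad` reports that there is no landing pad even though one exists. Placing a one-byte nop first moves the pad to `0x1001`, and `encode(0x1000, 0x1001)` yields a nonzero offset that is read correctly.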
# 611ffcf4	14-Jul-2022	Kazu Hirata <kazu@google.com>	[llvm] Use value instead of getValue (NFC)
# a7938c74	26-Jun-2022	Kazu Hirata <kazu@google.com>	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
# 3b7c3a65	25-Jun-2022	Kazu Hirata <kazu@google.com>	Revert "Don't use Optional::hasValue (NFC)" This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
# aa8feeef	25-Jun-2022	Kazu Hirata <kazu@google.com>	Don't use Optional::hasValue (NFC)
Revision tags: llvmorg-14.0.6
# e0e687a6	20-Jun-2022	Kazu Hirata <kazu@google.com>	[llvm] Don't use Optional::hasValue (NFC)
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 989f1c72	15-Mar-2022	serge-sans-paille <sguelton@redhat.com>	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b	10-Mar-2022	Nico Weber <thakis@chromium.org>	Revert "Cleanup codegen includes" This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169