History log of /llvm-project/llvm/lib/CodeGen/BranchFolding.cpp (Results 26 – 50 of 442)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 3dcbbddf 01-Feb-2023 Mikhail Goncharov <goncharov.mikhail@gmail.com>

Revert "Improve and enable folding of conditional branches with tail calls."

This reverts commit c05ddc9cbc12b1f2038380f57a16c4ca98c614b7.

Fails under asan:

https://lab.llvm.org/buildbot/#/builder

Revert "Improve and enable folding of conditional branches with tail calls."

This reverts commit c05ddc9cbc12b1f2038380f57a16c4ca98c614b7.

Fails under asan:

https://lab.llvm.org/buildbot/#/builders/168/builds/11637

Failed Tests (3):
LLVM :: CodeGen/X86/jump_sign.ll
LLVM :: CodeGen/X86/or-branch.ll
LLVM :: CodeGen/X86/tailcall-extract.ll

show more ...


# c05ddc9c 01-Feb-2023 Noah Goldstein <goldstein.w.n@gmail.com>

Improve and enable folding of conditional branches with tail calls.

Improve and enable folding of conditional branches with tail calls.

1. Make it so that conditional tail calls can be emitted even

Improve and enable folding of conditional branches with tail calls.

Improve and enable folding of conditional branches with tail calls.

1. Make it so that conditional tail calls can be emitted even when
there are multiple predecessors.

2. Don't guard the transformation behind -Os. The rationale for
guarding it was static-prediction can be affected by whether the
branch is forward of backward. This is no longer true for almost any
X86 cpus (anything newer than `SnB`) so is no longer a meaningful
concern.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D140931

show more ...


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# e72ca520 13-Jan-2023 Craig Topper <craig.topper@sifive.com>

[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC

Use isPhysical/isVirtual methods.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D141715


Revision tags: llvmorg-15.0.7
# db6a979a 02-Dec-2022 tentzen <tentzen@microsoft.com>

Revert "[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2"

This reverts commit 1a949c871ab4a6b6d792849d3e8c0fa6958d27f5.


# 1a949c87 02-Dec-2022 tentzen <tentzen@microsoft.com>

[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2

This patch is the Part-2 (BE LLVM) implementation of HW Exception handling.
Part-1 (FE Clang) was committed in 797ad701522988e21249528

[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2

This patch is the Part-2 (BE LLVM) implementation of HW Exception handling.
Part-1 (FE Clang) was committed in 797ad701522988e212495285dade8efac41a24d4.

This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).

Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules:
First, no exception can move in or out of _try region., i.e., no "potential faulty
instruction can be moved across _try boundary.
Second, the order of exceptions for instructions 'directly' under a _try must be preserved
(not applied to those in callees).
Finally, global states (local/global/heap variables) that can be read outside of _try region
must be updated in memory (not just in register) before the subsequent exception occurs.

The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding process.

Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.

This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:

A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others. If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:

Warning: jump bypasses variable with a non-trivial destructor

In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation (already in):
Please see commit 797ad701522988e212495285dade8efac41a24d4).

Part-2 : LLVM implementation described below.

For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).

For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.

The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Differential Revision: https://reviews.llvm.org/D102817/new/

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2
# 4cb722ac 22-Apr-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

BranchFolder: Require NoPHIs

The pass doesn't handle SSA and breaks any phis.


# 7762a3ce 27-Apr-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

Revert "BranchFolder: Assert on SSA functions"

This reverts commit 6ff91d17d66da46572e97f9a0b042182762cbe9e.


# 6ff91d17 25-Apr-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

BranchFolder: Assert on SSA functions

We probably should have the opposite of getRequiredProperties for this


Revision tags: llvmorg-14.0.1
# 989f1c72 15-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in

Cleanup codegen includes

This is a (fixed) recommit of https://reviews.llvm.org/D121169

after: 1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b 10-Mar-2022 Nico Weber <thakis@chromium.org>

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https:/

Revert "Cleanup codegen includes"

This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169

show more ...


# 7f230fee 07-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup codegen includes

after: 1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 3aed2822 04-Dec-2021 Kazu Hirata <kazu@google.com>

[CodeGen] Use range-based for loops (NFC)


# fd7d4064 29-Nov-2021 Kazu Hirata <kazu@google.com>

[llvm] Use range-based for loops (NFC)


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 48719e3b 18-Sep-2021 Kazu Hirata <kazu@google.com>

[CodeGen] Use make_early_inc_range (NFC)


# cfc74024 16-Sep-2021 Kazu Hirata <kazu@google.com>

[llvm] Use drop_begin (NFC)


Revision tags: llvmorg-13.0.0-rc3
# c9fca53a 10-Sep-2021 Kazu Hirata <kazu@google.com>

[CodeGen, Target] Use pred_empty and succ_empty (NFC)


Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3
# bd524955 17-Jun-2021 Hongtao Yu <hoy@fb.com>

[CSSPGO] Undoing the concept of dangling pseudo probe

As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.

I'm s

[CSSPGO] Undoing the concept of dangling pseudo probe

As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.

I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Ciner. Certain benchmark such as 602.gcc has a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104477

show more ...


Revision tags: llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 18123192 19-Apr-2021 Hongtao Yu <hoy@fb.com>

[CSSPGO] Flip SkipPseudoOp to true for MIR APIs.

Flipping the default value of SkipPseudoOp to true for those MIR APIs to favor maximum performance. Note that certain spots like branch folding and M

[CSSPGO] Flip SkipPseudoOp to true for MIR APIs.

Flipping the default value of SkipPseudoOp to true for those MIR APIs to favor maximum performance. Note that certain spots like branch folding and MIR if-conversion is are disabled for better counts quality. For these two optimizations, this is a no-diff change.

The counts quality with SPEC2017 before/after this change is unchanged.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D100332

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3
# 89855158 25-Feb-2021 Hongtao Yu <hoy@fb.com>

[CSSPGO] Unblocking optimizations by dangling pseudo probes.

This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unbl

[CSSPGO] Unblocking optimizations by dangling pseudo probes.

This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unblock those optimizations, the blocking pseudo probes are moved out of the original blocks and tagged dangling, instead of allowing pseudo probes to be literally removed. The reason is that when the original block is removed, we won't be able to sample it. Instead of assigning it a zero weight, moving all its pseudo probes into another block and marking them dangling should allow the counts inference a chance to assign them a more reasonable weight. We have not seen counts quality degradation from our experiments.

The optimizations being unblocked are:

1. Removing conditional probes for if-converted branches. Conditional probes are tagged dangling when their homing branch arms are folded so that they will not be over-counted.
2. Unblocking jump threading from removing empty blocks. Pseudo probe prevents jump threading from removing logically empty blocks that only has one unconditional jump instructions.
3. Unblocking SimplifyCFG and MIR tail duplicate to thread empty blocks and blocks with redundant branch checks.

Since dangling probes are logically deleted, they should not consume any samples in LTO postLink. This can be achieved by setting their distribution factors to zero when dangled.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D97481

show more ...


Revision tags: llvmorg-12.0.0-rc2
# d61b4cb9 12-Feb-2021 Kazu Hirata <kazu@google.com>

[CodeGen] Use range-based for loops (NFC)


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 2082b10d 16-Jan-2021 Kazu Hirata <kazu@google.com>

[llvm] Use *::empty (NFC)


Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 6913812a 20-Sep-2020 Fangrui Song <i@maskray.me>

Fix some clang-tidy bugprone-argument-comment issues


Revision tags: llvmorg-11.0.0-rc2
# 8a41a1f5 13-Aug-2020 Simon Pilgrim <llvm-dev@redking.me.uk>

BranchFolding.cpp - removes includes already included by BranchFolding.h. NFC.


Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 4b0aa572 16-May-2020 James Y Knight <jyknight@google.com>

Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.

Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while

Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.

Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while also
supporting outputs causes some trouble, as the physreg->vreg COPY
operations cannot be in the same block.

Modeling it as a non-terminator allows it to be handled the same way
as invoke is handled already.

Most of the changes here were created by auditing all the existing
users of MachineBasicBlock::isEHPad() and
MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.

Reviewed By: nickdesaulniers, void

Differential Revision: https://reviews.llvm.org/D79794

show more ...


# 78c69a00 01-Jul-2020 Yuanfang Chen <yuanfang.chen@sony.com>

[NFC] Clean up uses of MachineModuleInfoWrapperPass


12345678910>>...18