History log of /llvm-project/bolt/lib/Core/BinaryEmitter.cpp (Results 1 – 25 of 65)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# aa9cc721 07-Jan-2025 Peter Waller <peter.waller@arm.com>

Reapply "[BOLT] Add --pad-funcs-before=func:n (#117924)" (#121918)

- **Reapply "[BOLT] Add --pad-funcs-before=func:n (#117924)"**
- **[BOLT] Fix --pad-funcs{,-before} state misinteraction**

When --

Reapply "[BOLT] Add --pad-funcs-before=func:n (#117924)" (#121918)

- **Reapply "[BOLT] Add --pad-funcs-before=func:n (#117924)"**
- **[BOLT] Fix --pad-funcs{,-before} state misinteraction**

When --pad-funcs-before was introduced, it introduced a bug whereby the
first one to get parsed could influence the other.

Ensure that each has its own state and test that they don't interact in
this manner by testing how the `_subsequent` symbol moves when both
arguments are supplied with different padding values.

Fixed by having a function (and static state) for each of before/after.

show more ...


# be21bd9b 06-Jan-2025 Amir Ayupov <aaupov@fb.com>

Revert "[BOLT] Add --pad-funcs-before=func:n (#117924)"

14dcf8214f9c66172d17c1cfaec6aec0030748e0 introduced a subtle bug with
the static `FunctionPadding` map.

If either `opts::FunctionPadSpec` or

Revert "[BOLT] Add --pad-funcs-before=func:n (#117924)"

14dcf8214f9c66172d17c1cfaec6aec0030748e0 introduced a subtle bug with
the static `FunctionPadding` map.

If either `opts::FunctionPadSpec` or `opts::FunctionPadBeforeSpec` are set,
the map is going to be populated with the respective spec in the first
invocation of `BinaryEmitter::emitFunction`. The subsequent invocations
will pick up the padding from the map irrespective of whether
`opts::FunctionPadSpec` or `opts::FunctionPadBeforeSpec` is passed as a
parameter.

This breaks an internal test, hence reverting the patch.

show more ...


Revision tags: llvmorg-19.1.6
# b560b87b 13-Dec-2024 Maksim Panchenko <maks@fb.com>

[BOLT] Clean up jump table handling in non-reloc mode. NFCI (#119614)

This change affects non-relocation mode only. Prior to having
CheckLargeFunctions pass, we could have emitted code for function

[BOLT] Clean up jump table handling in non-reloc mode. NFCI (#119614)

This change affects non-relocation mode only. Prior to having
CheckLargeFunctions pass, we could have emitted code for functions that
was discarded at the end due to size limitations. Since we didn't know
at the time of emission if the code would be discarded or not, we had to
emit jump tables in separate sections and handle them separately.
However, now we always run CheckLargeFunctions and make sure all emitted
code is used. Thus, we can get rid of the special jump table handling.

show more ...


# 14dcf821 11-Dec-2024 Peter Waller <peter.waller@arm.com>

[BOLT] Add --pad-funcs-before=func:n (#117924)

This complements --pad-funcs, and by using both simultaneously, enables
moving a specific function through the address space without modifying
any co

[BOLT] Add --pad-funcs-before=func:n (#117924)

This complements --pad-funcs, and by using both simultaneously, enables
moving a specific function through the address space without modifying
any code
other than the targeted function (and references to it) by doing
(before+after=constant).

See also: proposed functionality to enable inserting random padding in

https://discourse.llvm.org/t/rfc-lld-feature-for-controlling-for-code-size-dependent-measurement-bias
and https://github.com/llvm/llvm-project/pull/117653

show more ...


Revision tags: llvmorg-19.1.5
# 92301180 22-Nov-2024 Maksim Panchenko <maks@fb.com>

[BOLT] Use compact EH format for fixed-address executables (#117274)

Use ULEB128 format for emitting LSDAs for fixed-address executables,
similar to what we use for PIEs/DSOs. Main difference is th

[BOLT] Use compact EH format for fixed-address executables (#117274)

Use ULEB128 format for emitting LSDAs for fixed-address executables,
similar to what we use for PIEs/DSOs. Main difference is that we don't
use landing pad trampolines when landing pads are not contained in a
single fragment. Instead, we fallback to emitting larger fixed-address
LSDAs, which is still better than adding trampoline instructions.

show more ...


# 105ecd8b 22-Nov-2024 Maksim Panchenko <maks@fb.com>

[BOLT] Avoid EH trampolines for PIEs/DSOs (#117106)

We used to emit EH trampolines for PIE/DSO whenever a function fragment
contained a landing pad outside of it. However, it is common to have all

[BOLT] Avoid EH trampolines for PIEs/DSOs (#117106)

We used to emit EH trampolines for PIE/DSO whenever a function fragment
contained a landing pad outside of it. However, it is common to have all
landing pads in a cold fragment even when their throwers are in a hot
one.

To reduce the number of trampolines, analyze landing pads for any given
function fragment, and if they all belong to the same (possibly
different) fragment, designate that fragment as a landing pad fragment
for the "thrower" fragment. Later, emit landing pad fragment symbol as
an LPStart for the thrower LSDA.

show more ...


# 3282be1f 20-Nov-2024 Maksim Panchenko <maks@fb.com>

[BOLT] Use ULEB128 encoding for PIE/DSO exception tables (#116911)

Use ULEB128 encoding for call sites in PIE/DSO binaries. The encoding
reduces the size of the tables compared to sdata4 and is the

[BOLT] Use ULEB128 encoding for PIE/DSO exception tables (#116911)

Use ULEB128 encoding for call sites in PIE/DSO binaries. The encoding
reduces the size of the tables compared to sdata4 and is the default
format used by Clang.

Note that for fixed-address executables we still use absolute addressing
to cover cases where landing pads can reside in different function
fragments.

For testing, we rely on runtime EH tests.

show more ...


# 066dd91a 20-Nov-2024 Maksim Panchenko <maks@fb.com>

[BOLT] Offset LPStart to avoid unnecessary instructions (#116713)

For C++ exception handling, when we write a call site table, we must
avoid emitting 0-value offsets for landing pads unless the cal

[BOLT] Offset LPStart to avoid unnecessary instructions (#116713)

For C++ exception handling, when we write a call site table, we must
avoid emitting 0-value offsets for landing pads unless the call site has
no landing pad. However, 0 can be a real offset from the start of the
FDE if the FDE corresponds to a function fragment that starts with a
landing pad. In such cases, we used to emit a trap instruction at the
start of the fragment to guarantee non-zero LP offset.

To avoid emitting unnecessary trap instructions, we can instead set
LPStart to an offset from the FDE. If we emit it as [FDEStart - 1], then
all real offsets from LPStart in FDE become non-negative.

show more ...


Revision tags: llvmorg-19.1.4
# 93a42445 17-Nov-2024 Maksim Panchenko <maks@fb.com>

[BOLT] Use new assembler directives for EH table emission (#116294)

When emitting C++ exception tables (LSDAs), BOLT used to estimate the
size of the tables beforehand. This implementation was nece

[BOLT] Use new assembler directives for EH table emission (#116294)

When emitting C++ exception tables (LSDAs), BOLT used to estimate the
size of the tables beforehand. This implementation was necessary as the
assembler/streamer lacked the emitULEB128IntValue() functionality.

As I plan to introduce [u|s]uleb128-encoded exception tables in BOLT,
now is a perfect time to switch to the new API and eliminate the need
to pre-compute the size of the tables.

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 344228eb 02-Jul-2024 Amir Ayupov <aaupov@fb.com>

[BOLT] Drop macro-fusion alignment (#97358)

9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for
optimal macro-fusion alignment in BOLT. Remove the support in BOLT as
performan

[BOLT] Drop macro-fusion alignment (#97358)

9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for
optimal macro-fusion alignment in BOLT. Remove the support in BOLT as
performance measurements with large binaries didn't show a significant
improvement.

Test Plan:
macro-fusion alignment was never upstreamed, so no upstream tests are
affected.

show more ...


Revision tags: llvmorg-18.1.8
# 7520d0c9 07-Jun-2024 Amir Ayupov <aaupov@fb.com>

[BOLT][NFC] Unset UseAssemblerInfoForParsing for emission (#94778)

Summary:
Use workaround for quadratic behavior inside
AttemptToFoldSymbolOffsetDifference called from BinaryEmitter::emitLSDA.

[BOLT][NFC] Unset UseAssemblerInfoForParsing for emission (#94778)

Summary:
Use workaround for quadratic behavior inside
AttemptToFoldSymbolOffsetDifference called from BinaryEmitter::emitLSDA.


https://github.com/llvm/llvm-project/commit/b06e736982a3568fe2bcea8688550f9e393b7450#commitcomment-142836456

show more ...


Revision tags: llvmorg-18.1.7
# c8fc234e 22-May-2024 shaw young <58664393+shawbyoung@users.noreply.github.com>

[BOLT][NFC] Eliminate uses of throwing std::map::at (#92950)

Remove calls to std::unordered_map::at, std::map::at, and
std::vector::at.


Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4
# 603fa4c6 15-Apr-2024 Nathan Sidwell <nathan@acm.org>

[BOLT][NFC] Be more obvious about selecting X86 (#88527)

Use `isX86()` rather than `!isAArch64() && !isRISCV()`, and similar.


Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 7c206c78 28-Feb-2024 Maksim Panchenko <maks@fb.com>

[BOLT] Refactor interface for instruction labels. NFCI (#83209)

To avoid accidentally setting the label twice for the same instruction,
which can lead to a "lost" label, introduce getOrSetInstLabel

[BOLT] Refactor interface for instruction labels. NFCI (#83209)

To avoid accidentally setting the label twice for the same instruction,
which can lead to a "lost" label, introduce getOrSetInstLabel()
function. Rename existing functions to getInstLabel()/setInstLabel() to
make it explicit that they operate on instruction labels. Add an
assertion in setInstLabel() that the instruction did not have a prior
label set.

show more ...


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# 52cf0711 12-Feb-2024 Amir Ayupov <aaupov@fb.com>

[BOLT][NFC] Log through JournalingStreams (#81524)

Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augme

[BOLT][NFC] Log through JournalingStreams (#81524)

Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC

show more ...


Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 9fec33aa 19-Jan-2024 Amir Ayupov <aaupov@fb.com>

Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)"

This reverts commit 82bc33ea3f1a539be50ed46919dc53fc6b685da9.

Accidentally pushed unrelated changes.


# 82bc33ea 19-Jan-2024 Amir Ayupov <aaupov@fb.com>

[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)

Fix the bug where merge-fdata unconditionally outputs boltedcollection
line, regardless of whether input files have it s

[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)

Fix the bug where merge-fdata unconditionally outputs boltedcollection
line, regardless of whether input files have it set.

Test Plan:
Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this
fix.

show more ...


# c43d0432 30-Nov-2023 ShatianWang <38512325+ShatianWang@users.noreply.github.com>

[BOLT] Create .text.warm for 3-way splitting (#73863)

This commit explicitly adds a warm code section, .text.warm, when
-split-functions -split-strategy=cdsplit is used. This replaces the
previous

[BOLT] Create .text.warm for 3-way splitting (#73863)

This commit explicitly adds a warm code section, .text.warm, when
-split-functions -split-strategy=cdsplit is used. This replaces the
previous approach of using .text.cold.0 as warm and .text.cold.1 as cold
in 3-way function splitting. NFC.

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5
# f633f325 14-Nov-2023 Maksim Panchenko <maks@fb.com>

[BOLT] Fix NOP instruction emission on x86 (#72186)

Use MCAsmBackend::writeNopData() interface to emit NOP instructions on
x86. There are multiple forms of NOP instruction on x86 with different
si

[BOLT] Fix NOP instruction emission on x86 (#72186)

Use MCAsmBackend::writeNopData() interface to emit NOP instructions on
x86. There are multiple forms of NOP instruction on x86 with different
sizes. Currently, LLVM's assembly/disassembly does not support all forms
correctly which can lead to a breakage of input code semantics, e.g. if
the program relies on NOP instructions for reserving a patch space.

Add "--keep-nops" option to preserve NOP instructions.

show more ...


# 0df15467 06-Nov-2023 Maksim Panchenko <maks@fb.com>

[BOLT] Use Label annotation instead of EHLabel pseudo. NFCI. (#70179)

When we need to attach EH label to an instruction, we can now use Label
annotation instead of EHLabel pseudo instruction.


# e28c393b 06-Nov-2023 maksfb <maks@fb.com>

[BOLT] Reduce the number of emitted symbols. NFCI. (#70175)

We emit a symbol before an instruction for a number of reasons, e.g. for
tracking LocSyms, debug line, or if the instruction has a label

[BOLT] Reduce the number of emitted symbols. NFCI. (#70175)

We emit a symbol before an instruction for a number of reasons, e.g. for
tracking LocSyms, debug line, or if the instruction has a label
annotation. Currently, we may emit multiple symbols per instruction.

Reuse the same label instead of creating and emitting new ones when
possible. I'm planning to refactor EH labels as well in a separate diff.

Change getLabel() to return a pointer instead of std::optional<> since
an empty label should be treated identically to no label.

show more ...


Revision tags: llvmorg-17.0.4, llvmorg-17.0.3
# b7944f7c 12-Oct-2023 Vladislav Khmelevsky <och95@yandex.ru>

[BOLT] Return proper minimal alignment from BF (#67707)

Currently minimal alignment of function is hardcoded to 2 bytes.
Add 2 more cases:
1. In case BF is data in code return the alignment of CI

[BOLT] Return proper minimal alignment from BF (#67707)

Currently minimal alignment of function is hardcoded to 2 bytes.
Add 2 more cases:
1. In case BF is data in code return the alignment of CI as minimal
alignment
2. For aarch64 and riscv platforms return the minimal value of 4 (added
test for aarch64)
Otherwise fallback to returning the 2 as it previously was.

show more ...


# ff5e2bab 06-Oct-2023 Job Noorman <jnoorman@igalia.com>

[BOLT] Improve handling of relocations targeting specific instructions (#66395)

On RISC-V, there are certain relocations that target a specific
instruction instead of a more abstract location like

[BOLT] Improve handling of relocations targeting specific instructions (#66395)

On RISC-V, there are certain relocations that target a specific
instruction instead of a more abstract location like a function or basic
block. Take the following example that loads a value from symbol `foo`:

```
nop
1: auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(1b)(t0)
```

This results in two relocation:
- auipc: `R_RISCV_PCREL_HI20` referencing `foo`;
- ld: `R_RISCV_PCREL_LO12_I` referencing to local label `1` which points
to the auipc instruction.

It is of utmost importance that the `R_RISCV_PCREL_LO12_I` keeps
referring to the auipc instruction; if not, the program will fail to
assemble. However, BOLT currently does not guarantee this.

BOLT currently assumes that all local symbols are jump targets and
always starts a new basic block at symbol locations. The example above
results in a CFG the looks like this:

```
.BB0:
nop
.BB1:
auipc t0, %pcrel_hi(foo)
ld t0, %pcrel_lo(.BB1)(t0)
```

While this currently works (i.e., the `R_RISCV_PCREL_LO12_I` relocation
points to the correct instruction), it has two downsides:
- Too many basic blocks are created (the example above is logically only
one yet two are created);
- If instructions are inserted in `.BB1` (e.g., by instrumentation),
things will break since the label will not point to the auipc anymore.

This patch proposes to fix this issue by teaching BOLT to track labels
that should always point to a specific instruction. This is implemented
as follows:
- Add a new annotation type (`kLabel`) that allows us to annotate
instructions with an `MCSymbol *`;
- Whenever we encounter a relocation type that is used to refer to a
specific instruction (`Relocation::isInstructionReference`), we
register it without a symbol;
- During disassembly, whenever we encounter an instruction with such a
relocation, create a symbol for its target and store it in an offset
to symbol map (to ensure multiple relocations referencing the same
instruction use the same label);
- After disassembly, iterate this map to attach labels to instructions
via the new annotation type;
- During emission, emit these labels right before the instruction.

I believe the use of annotations works quite well for this use case as
it allows us to reliably track instruction labels. If we were to store
them as offsets in basic blocks, it would be error prone to keep them
updated whenever instructions are inserted or removed.

I have chosen to add labels as first-class annotations (as opposed to a
generic one) because the documentation of `MCAnnotation` suggests that
generic annotations are to be used for optional metadata that can be
discarded without affecting correctness. As this is not the case for
labels, a first-class annotation seemed more appropriate.

show more ...


Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 28fd2ca1 27-Jul-2023 Denis Revunov <revunov.denis@huawei-partners.com>

[BOLT] Fix trap value for non-X86

The trap value used by BOLT was assumed to be single-byte instruction.
It made some functions unaligned on AArch64(e.g exceptions-instrumentation test)
and caused e

[BOLT] Fix trap value for non-X86

The trap value used by BOLT was assumed to be single-byte instruction.
It made some functions unaligned on AArch64(e.g exceptions-instrumentation test)
and caused emission failures. Fix that by changing fill value to StringRef.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D158191

show more ...


# 23c8d382 21-Aug-2023 Job Noorman <jnoorman@igalia.com>

[BOLT] Calculate input to output address map using BOLTLinker

BOLT uses MCAsmLayout to calculate the output values of basic blocks.
This means output values are calculated based on a pre-linking sta

[BOLT] Calculate input to output address map using BOLTLinker

BOLT uses MCAsmLayout to calculate the output values of basic blocks.
This means output values are calculated based on a pre-linking state and
any changes to symbol values during linking will cause incorrect values
to be used.

This issue was first addressed in D154604 by adding all basic block
symbols to the symbol table for the linker to resolve them. However, the
runtime overhead of handling this huge symbol table turned out to be
prohibitively large.

This patch solves the issue in a different way. First, a temporary
section containing [input address, output symbol] pairs is emitted to the
intermediary object file. The linker will resolve all these references
so we end up with a section of [input address, output address] pairs.
This section is then parsed and used to:
- Replace BinaryBasicBlock::OffsetTranslationTable
- Replace BinaryFunction::InputOffsetToAddressMap
- Update BinaryBasicBlock::OutputAddressRange

Note that the reason this is more performant than the previous attempt
is that these symbol references do not cause entries to be added to the
symbol table. Instead, section-relative references are used for the
relocations.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D155604

show more ...


123