#
79d695f0 |
| 12-Oct-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFCI] Speedup BAT::writeMaps
For a large binary with BAT section of size 38 MB with ~170k maps, reduces writeMaps time from 70s down to 1s.
The inefficiency was in the use of std::distance with std::map::iterator which doesn't provide random access. Use sorted vector for lookups.
Test Plan: NFC
Reviewers: maksfb, rafaelauler, dcci, ayermolo
Reviewed By: maksfb
Pull Request: https://github.com/llvm/llvm-project/pull/112061
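The complexity difference described in this commit can be sketched as follows. This is a minimal illustration, not the actual BOLT code: `MapTy`, `indexViaDistance`, `sortedKeys`, and `indexViaSortedVector` are hypothetical names, and the real `writeMaps` data structures may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <vector>

// Hypothetical stand-in for a collection of maps keyed by address.
using MapTy = std::map<uint64_t, int>;

// Slow path: std::map iterators are bidirectional, not random access, so
// std::distance has to step through the nodes one by one (O(n) per call).
size_t indexViaDistance(const MapTy &Maps, uint64_t Address) {
  auto It = Maps.find(Address);
  assert(It != Maps.end() && "address must be present");
  return static_cast<size_t>(std::distance(Maps.begin(), It));
}

// Copy the keys once; iterating a std::map yields them in ascending order,
// so the resulting vector is already sorted.
std::vector<uint64_t> sortedKeys(const MapTy &Maps) {
  std::vector<uint64_t> Keys;
  Keys.reserve(Maps.size());
  for (const auto &KV : Maps)
    Keys.push_back(KV.first);
  return Keys;
}

// Fast path: binary search over the sorted vector (O(log n) per call);
// the index is a constant-time iterator subtraction.
size_t indexViaSortedVector(const std::vector<uint64_t> &Keys,
                            uint64_t Address) {
  auto It = std::lower_bound(Keys.begin(), Keys.end(), Address);
  assert(It != Keys.end() && *It == Address && "address must be present");
  return static_cast<size_t>(It - Keys.begin());
}
```

With n lookups over n entries, the first approach is quadratic overall while the second is O(n log n), which is consistent with the order-of-magnitude speedup reported above.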
|
#
dc1da939 |
| 05-Jul-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][BAT] Add support for three-way split functions (#93760)
In three-way split functions, if only the .warm fragment is present, BAT
incorrectly overwrites the map for the .warm fragment with that of the
empty .cold fragment.
Test Plan: updated register-fragments-bolt-symbols.s
|
#
8901f718 |
| 09-Jun-2024 |
Kazu Hirata <kazu@google.com> |
Use StringRef::starts_with (NFC) (#94886)
|
#
d1d9545e |
| 24-May-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][BAT] Add entries for deleted basic blocks
Deleted basic blocks are required for correct mapping of branches modified by SCTC.
Increases BAT size, bytes:
- large binary: 8622496 -> 8703244,
- small binary (X86/bolt-address-translation.test): 928 -> 940.
Test Plan: updated bb-with-two-tail-calls.s
Reviewers: ayermolo, dcci, maksfb, rafaelauler
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/91906
|
#
50574638 |
| 22-May-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Make BAT methods const (#91823)
|
#
7c5c8b2f |
| 22-May-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Move BAT::fetchParentAddress to header (#93061)
Unbreak shared build after
https://github.com/llvm/llvm-project/pull/91683
|
#
8927ac86 |
| 15-Apr-2024 |
Kazu Hirata <kazu@google.com> |
[BOLT] Fix a warning
This patch fixes:
bolt/lib/Profile/BoltAddressTranslation.cpp:380:37: error: operator '<<' has lower precedence than '+'; '+' will be evaluated first [-Werror,-Wshift-op-parentheses]
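The precedence rule behind this warning can be illustrated with a small sketch. The helper names `shiftThenAdd` and `addThenShift` are hypothetical and do not come from the BOLT source; they only show the two explicit groupings that parentheses can express.

```cpp
#include <cassert>
#include <cstdint>

// In C++, '+' binds tighter than '<<', so an unparenthesized `1u << N + 1`
// is parsed as `1u << (N + 1)`. Writing the parentheses states the intent
// and silences -Wshift-op-parentheses.

// Shift first, then add: (1 << N) + 1.
inline uint32_t shiftThenAdd(uint32_t N) {
  assert(N < 32 && "shift amount must stay below the bit width");
  return (1u << N) + 1u;
}

// Add first, then shift: 1 << (N + 1). This is what the unparenthesized
// form means.
inline uint32_t addThenShift(uint32_t N) {
  assert(N + 1 < 32 && "shift amount must stay below the bit width");
  return 1u << (N + 1u);
}
```

For N = 3 the two groupings give 9 and 16 respectively, which is why the compiler asks for the grouping to be spelled out.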
|
#
b79b6f9c |
| 15-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Use offset deduplication for cold fragments
Apply deduplication for uniformity and BAT section size reduction.
Changes BAT section size to:
- large binary: 39541552 bytes (1.02x original),
- medium binary: 3828996 bytes (0.64x),
- small binary: 928 bytes (0.65x).
Test Plan: Updated bolt-address-translation.test
Reviewers: rafaelauler, dcci, ayermolo, JDevlieghere, maksfb
Reviewed By: maksfb
Pull Request: https://github.com/llvm/llvm-project/pull/87853
|
#
3997f0eb |
| 11-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Cover all call sites in writeBATYAML
Setting call site information was conditioned on the presence of branch information for a given block. However, a sampled profile may lack one or the other for a given basic block.
Iterate over branch profiles and call profiles independently to cover all recorded profile data.
Depends on https://github.com/llvm/llvm-project/pull/87569
Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s
Reviewers: ayermolo, dcci, maksfb, rafaelauler
Reviewed By: maksfb
Pull Request: https://github.com/llvm/llvm-project/pull/87743
|
#
e64eede0 |
| 06-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][BAT] Fix encoded NumBasicBlocks
Emit the recorded number of blocks, not the number of basic block hashes. There might be differences in corner cases (openssl BN_BLINDING_convert_ex function).
Test Plan: Updated openssl.test in https://github.com/rafaelauler/bolt-tests/pull/31
Reviewers: rafaelauler, ayermolo, maksfb, dcci
Reviewed By: ayermolo
Pull Request: https://github.com/llvm/llvm-project/pull/87830
|
#
02276239 |
| 06-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][BAT] Support multi-way split functions
BAT writeMaps encoded the assumption that functions are only split into two fragments (hot and cold). However, BOLT supports splitting into an arbitrary number of fragments. Relax that assumption and look up the primary (hot) fragment explicitly.
Depends on: https://github.com/llvm/llvm-project/pull/86219
Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s
Reviewers: ayermolo, rafaelauler, maksfb, dcci
Reviewed By: maksfb, dcci
Pull Request: https://github.com/llvm/llvm-project/pull/87123
|
#
2d3c827c |
| 05-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Use BAT for YAML profile call target information
Provide a mechanism to resolve call target information for calls from non-BAT functions to BAT functions (`YAMLProfileWriter::convert`). Make it generic for future use in BAT-to-BAT calls.
Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test
Reviewers: ayermolo, maksfb, rafaelauler, dcci
Reviewed By: maksfb
Pull Request: https://github.com/llvm/llvm-project/pull/86219
|
#
213eda15 |
| 25-Mar-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Add CallSiteInfo entries in YAMLBAT (#76896)
Attach call counters to YAML profile, covering inter-function control
flow.
Depends on: https://github.com/llvm/llvm-project/pull/86218
Test Plan:
Updated bolt/test/X86/bolt-address-translation-yaml.test
|
#
1b763f23 |
| 25-Mar-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Add secondary entry points to BAT
Provide secondary entry points for `EntryDiscriminator` call info field in YAML profile.
Increases BAT section size to:
- large binary: 39655300 bytes (1.03x the original),
- medium binary: 3834328 bytes (0.65x),
- small binary: 924 bytes (0.64x).
Depends on: https://github.com/llvm/llvm-project/pull/76911
Test Plan:
- Updated bolt-address-translation{,-yaml}.test
- Added openssl test: https://github.com/rafaelauler/bolt-tests/pull/30
Reviewers: dcci, rafaelauler, maksfb, ayermolo
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/86218
|
#
a91cd53d |
| 23-Mar-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Refactor BAT metadata data structures
Hide the implementations of `FuncHashes` and `BBHashMap` classes, getting rid of `at` accessors that could throw an exception.
Test Plan: NFC
Reviewers: ayermolo, maksfb, dcci, rafaelauler
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/86353
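The motivation for dropping `at` accessors can be sketched as below. This is only an illustration under stated assumptions: `FuncHashesSketch`, `add`, and `lookup` are hypothetical names, the class layout is invented for the example, and the real `FuncHashes` class in BOLT may expose a different interface. The sketch uses `std::optional`, so it assumes C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical wrapper that hides the underlying container from callers.
class FuncHashesSketch {
  std::unordered_map<uint64_t, uint64_t> Map; // function address -> hash

public:
  // Record a hash for a function address; duplicates are a caller bug.
  void add(uint64_t Address, uint64_t Hash) {
    bool Inserted = Map.emplace(Address, Hash).second;
    assert(Inserted && "duplicate function address");
    (void)Inserted; // avoid an unused-variable warning in NDEBUG builds
  }

  // Unlike std::unordered_map::at, which throws std::out_of_range for a
  // missing key, this returns std::nullopt and lets the caller decide.
  std::optional<uint64_t> lookup(uint64_t Address) const {
    auto It = Map.find(Address);
    if (It == Map.end())
      return std::nullopt;
    return It->second;
  }
};
```

Because the container is private, the lookup policy lives in one place and callers cannot reach for a throwing accessor by accident.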
|
#
ceba3a38 |
| 22-Mar-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Add number of basic blocks to BAT
YAML profile reader checks the number of basic blocks in regular, no-stale-matching mode. Add it to BAT.
This increases the size of BAT section to:
- large binary: 39583080 bytes (1.02x of the original),
- medium binary: 3816492 bytes (0.64x),
- small binary: 920 bytes (0.64x, no change due to alignment).
Test Plan: Updated bolt-address-translation-yaml.test
Reviewers: rafaelauler, ayermolo, maksfb, dcci
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/86045
|
#
b0e23639 |
| 21-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Add BB index to BAT
Add input basic block index to BAT metadata. This addresses the case where some basic blocks are eliminated, and output index is not equal to the input block index. These indices are used in non-stale-matching mode.
Increases BAT section size to:
- large binary: 39521512 bytes (1.02x original),
- medium binary: 3799988 bytes (0.64x),
- small binary: 920 bytes (0.64x).
Test Plan: Updated bolt-address-translation{,-yaml}.test
Pull Request: https://github.com/llvm/llvm-project/pull/86044
|
#
f66d631b |
| 22-Mar-2024 |
Amir Ayupov <aaupov@fb.com> |
Revert "[BOLT] Add BB index to BAT (#86044)"
This reverts commit 3b3de48fd84b8269d5f45ee0a9dc6b7448368424.
|
#
3b3de48f |
| 22-Mar-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Add BB index to BAT (#86044)
|
#
aa7e4ba3 |
| 21-Mar-2024 |
Kazu Hirata <kazu@google.com> |
[BOLT] Fix an unused variable warning
This patch fixes:
bolt/lib/Profile/BoltAddressTranslation.cpp:26:12: error: unused variable 'HotFuncAddress' [-Werror,-Wunused-variable]
|
#
ad00e7e5 |
| 20-Mar-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Write and parse BF/BB hashes in BAT
This increases BAT section size to:
- large binary: 34832976 bytes (0.90x original),
- medium binary: 3586800 bytes (0.60x original),
- small binary: 816 bytes (0.57x original).
Test Plan: Updated bolt/test/X86/bolt-address-translation.test
Reviewers: rafaelauler, dcci, ayermolo, maksfb
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/76907
|
#
d2c9a19d |
| 15-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Pass BF/BB hashes to BAT
Test Plan: NFC
Reviewers: dcci, rafaelauler, maksfb, ayermolo
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/76906
|
#
52cf0711 |
| 12-Feb-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are printed to the screen once the `--log-file` option is
used.
In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.
Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still prints errors directly to the screen.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
|
#
df7d2b2f |
| 25-Jan-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Deduplicate equal offsets in BAT (#76905)
Encode BRANCHENTRY bits as bitmask for deduplicated entries.
Reduces BAT section size:
- large binary: to 11834216 bytes (0.31x original),
- medium binary: to 1565584 bytes (0.26x original),
- small binary: to 336 bytes (0.23x original).
Test Plan: Updated bolt/test/X86/bolt-address-translation.test
|
#
8f1d94aa |
| 18-Jan-2024 |
Amir Ayupov <aaupov@fb.com> |
[BOLT] Use continuous output addresses in delta encoding in BAT
Make output function addresses delta-encoded with respect to the last offset in the previous function. This reduces the deltas in function start addresses.
Test Plan: Reduces BAT section size to:
- large binary: 12218860 bytes (0.32x original),
- medium binary: 1606580 bytes (0.27x original),
- small binary: 404 bytes (0.28x original).
Reviewers: rafaelauler
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/76904
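Why a closer baseline shrinks the encoding can be seen in a minimal sketch of delta encoding with variable-length integers. It assumes ULEB128 as the integer encoding and uses hypothetical helpers (`encodeULEB128`, `encodeDeltas`, `decodeDeltas`); the actual BAT section layout carries more fields and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append Value as ULEB128: 7 payload bits per byte, high bit set on every
// byte except the last.
void encodeULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}

// Encode ascending addresses as deltas from the previous value. Prev is the
// baseline for the first address: 0 for an absolute start, or the last
// address of the preceding group for a continuous encoding.
std::vector<uint8_t> encodeDeltas(const std::vector<uint64_t> &Addresses,
                                  uint64_t Prev = 0) {
  std::vector<uint8_t> Out;
  for (uint64_t A : Addresses) {
    assert(A >= Prev && "addresses must be ascending");
    encodeULEB128(A - Prev, Out);
    Prev = A;
  }
  return Out;
}

// Inverse of encodeDeltas; assumes well-formed input and the same baseline.
std::vector<uint64_t> decodeDeltas(const std::vector<uint8_t> &Bytes,
                                   uint64_t Prev = 0) {
  std::vector<uint64_t> Addresses;
  size_t I = 0;
  while (I < Bytes.size()) {
    uint64_t Delta = 0;
    unsigned Shift = 0;
    uint8_t Byte;
    do {
      Byte = Bytes[I++];
      Delta |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
      Shift += 7;
    } while ((Byte & 0x80) && I < Bytes.size());
    Prev += Delta;
    Addresses.push_back(Prev);
  }
  return Addresses;
}
```

A small delta fits in one byte while a full address needs several, so starting each function from the end of the previous one rather than from zero saves bytes on every function start.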
|