Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
cf83a7fd |
| 22-Nov-2024 |
Lei Wang <wlei@fb.com> |
[SHT_LLVM_BB_ADDR_MAP] Add an option to skip emitting bb entries (#114447)
Sometimes we want to use a `PgoAnalysisMap` feature that doesn't require
the BB entries info, e.g. only the `FuncEntryCoun
[SHT_LLVM_BB_ADDR_MAP] Add an option to skip emitting bb entries (#114447)
Sometimes we want to use a `PgoAnalysisMap` feature that doesn't require
the BB entries info, e.g. only the `FuncEntryCount`, but the BB entries
is emitted by default, so I'm adding an option to skip the info for this
case to save the binary size(can save ~90% size of the section). For
implementation, it extends a new field(`OmitBBEntries`) in
`BBAddrMap::Features` for this and it's controlled by a switch
`--basic-block-address-map-skip-bb-entries`.
Note that this naturally supports backwards compatibility as the field
is zero for the old version, matches the decoding in the new version
llvm.
show more ...
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
df6d2faa |
| 29-Apr-2024 |
Amir Ayupov <aaupov@fb.com> |
[Object] Provide operator< for ELFSymbolRef (#89861)
Normally, operator< accepting DataRefImpl is used when comparing
SymbolRef/ELFSymbolRef. However, it uses std::memcmp which interprets
DataRefI
[Object] Provide operator< for ELFSymbolRef (#89861)
Normally, operator< accepting DataRefImpl is used when comparing
SymbolRef/ELFSymbolRef. However, it uses std::memcmp which interprets
DataRefImpl union as char string so that the result depends on host
endianness.
For ELFSymbolRef a specialized operator< can be used instead to produce
consistent ordering regardless of endianness by comparing the symbol
table index and symbol index fields separately.
show more ...
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2 |
|
#
acec6419 |
| 02-Feb-2024 |
Rahman Lavaee <rahmanl@google.com> |
[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128)
Today `-split-machine-functions` and `-fbasic-block-sections={a
[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128)
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.
First let us review the current encoding which emits the address of each
function and its number of basic blocks, followed by basic block entries
for each basic block.
| | |
|--|--|
| Address of the function | Function Address |
| Number of basic blocks in this function | NumBlocks |
| BB entry 1
| BB entry 2
| ...
| BB entry #NumBlocks
To make this work for basic block sections, we treat each basic block
section similar to a function, except that basic block sections of the
same function must be encapsulated in the same structure so we can map
all of them to their single function.
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section section as before: the base address of the section, its
number of blocks, and BB entries for its basic block. The first section
in the BB address map is always the function entry section.
| | |
|--|--|
| Number of sections for this function | NumBBRanges |
| Section 1 begin address | BaseAddress[1] |
| Number of basic blocks in section 1 | NumBlocks[1] |
| BB entries for Section 1
|..................|
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges |
NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new TargetOption field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compiling
unit, or for none (depending on `TargetOptions::BBAddrMap`).
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.
We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handing to their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
- New tests added:
- Two tests basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
'-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.
show more ...
|
Revision tags: llvmorg-18.1.0-rc1 |
|
#
23faa81d |
| 24-Jan-2024 |
Micah Weston <micahsweston@gmail.com> |
[SHT_LLVM_BB_ADDR_MAP] Avoids side-effects in addition since order is unspecified. (#79168)
Turns out the problem with
https://github.com/llvm/llvm-project/issues/60013 is due to the fact
that ord
[SHT_LLVM_BB_ADDR_MAP] Avoids side-effects in addition since order is unspecified. (#79168)
Turns out the problem with
https://github.com/llvm/llvm-project/issues/60013 is due to the fact
that order of operation is unspecified in C++:
https://en.cppreference.com/w/cpp/language/eval_order. A small example
of where this manifests with MSVC can be seen here
https://ooo.godbolt.org/z/bxqKeqzqn.
This patch does the following:
* Removes the addition operations where we sequence more than one
side-effect based expression.
* Removes test guards to now run on Windows
show more ...
|
Revision tags: llvmorg-19-init |
|
#
6ba2c2bb |
| 16-Jan-2024 |
Rahman Lavaee <rahmanl@google.com> |
[SHT_LLVM_BB_ADDR_MAP,NFC] Add SCOPED_TRACE for convenient mapping of failures to test cases. (#78335)
Although parameterized gtests are preferred for this, our tests are not
very straightforward.
[SHT_LLVM_BB_ADDR_MAP,NFC] Add SCOPED_TRACE for convenient mapping of failures to test cases. (#78335)
Although parameterized gtests are preferred for this, our tests are not
very straightforward. So I decided to add SCOPED_TRACE for different
test cases and the lambda checks as well.
Typical test failure message now looks like:
```
...llvm-project/llvm/unittests/Object/ELFObjectFileTest.cpp:737
Expected equality of these values:
*BBAddrMaps
Which is: { 32-byte object <11-11 01-00 00-00 00-00 C0-A9 FB-3E E4-55 00-00 D0-A9 FB-3E E4-55 00-00 D0-A9 FB-3E E4-55 00-00>, 32-byte object <22-22 02-00 00-00 00-00 F0-8E FB-3E E4-55 00-00 00-8F FB-3E E4-55 00-00 00-8F FB-3E E4-55 00-00> }
ExpectedResult
Which is: { 32-byte object <33-33 03-00 00-00 00-00 50-A7 FB-3E E4-55 00-00 60-A7 FB-3E E4-55 00-00 60-A7 FB-3E E4-55 00-00> }
Google Test trace:
...llvm-project/llvm/unittests/Object/ELFObjectFileTest.cpp:726: for TextSectionIndex: 1 and object yaml:
--- !ELF
FileHeader:
Class: ELFCLASS64
Data: ELFDATA2LSB
Type: ET_EXEC
Sections:
- Name: .llvm_bb_addr_map_1
Type: SHT_LLVM_BB_ADDR_MAP
Link: 1
Entries:
- Version: 2
Address: 0x11111
BBEntries:
- ID: 1
AddressOffset: 0x0
Size: 0x1
Metadata: 0x2
- Name: .llvm_bb_addr_map_2
Type: SHT_LLVM_BB_ADDR_MAP
Link: 1
Entries:
- Version: 2
Address: 0x22222
BBEntries:
- ID: 2
AddressOffset: 0x0
Size: 0x2
Metadata: 0x4
- Name: .llvm_bb_addr_map_3
Type: SHT_LLVM_BB_ADDR_MAP
Link: 2
Entries:
- Version: 1
Address: 0x33333
BBEntries:
- ID: 0
AddressOffset: 0x0
Size: 0x3
Metadata: 0x6
- Name: .llvm_bb_addr_map_4
Type: SHT_LLVM_BB_ADDR_MAP_V0
# Link: 0 (by default, can be overriden)
Entries:
- Version: 0
Address: 0x44444
BBEntries:
- ID: 0
AddressOffset: 0x0
Size: 0x4
Metadata: 0x18
...llvm-project/llvm/unittests/Object/ELFObjectFileTest.cpp:757: normal sections
```
show more ...
|
#
2873060f |
| 06-Jan-2024 |
Micah Weston <micahsweston@gmail.com> |
[SHT_LLVM_BB_ADDR_MAP] Fixes two bugs in decoding of PGOAnalyses in BBAddrMap. (#77139)
We had specified that `readBBAddrMap` will always keep PGOAnalyses and
BBAddrMaps the same length on success.
[SHT_LLVM_BB_ADDR_MAP] Fixes two bugs in decoding of PGOAnalyses in BBAddrMap. (#77139)
We had specified that `readBBAddrMap` will always keep PGOAnalyses and
BBAddrMaps the same length on success.
https://github.com/llvm/llvm-project/blob/365fbbfbcfefb8766f7716109b9c3767b58e6058/llvm/include/llvm/Object/ELFObjectFile.h#L116-L117
It turns out that this is not currently the case when no analyses exist
in a function. No test had caught it.
We also should not append PGOBBEntries when there is no BBFreq or
BrProb.
This patch adds:
* tests that PGOAnalyses and BBAddrMaps are same length even when no
analyses are enabled
* fixes decode so that PGOAnalyses and BBAddrMaps are same length
* updates test to not emit unnecessary PGOBBEntries
* fixes decode to not emit PGOBBEntries when unnecessary
show more ...
|
#
0cb98167 |
| 05-Jan-2024 |
Joseph Huber <huberjn@outlook.com> |
[ELF][Obvious] Last time fixing this test hopefully
|
#
c2f5e435 |
| 05-Jan-2024 |
Joseph Huber <huberjn@outlook.com> |
[ELF] Again attempt to fix test on BE architectures
Summary: This formats something according to the style, and again attempts to fix this failing on the BE PPC test. Sorry for the spam, these commi
[ELF] Again attempt to fix test on BE architectures
Summary: This formats something according to the style, and again attempts to fix this failing on the BE PPC test. Sorry for the spam, these commits are the only way I can check it because the failure isn't local.
show more ...
|
#
dfe9bb4d |
| 05-Jan-2024 |
Joseph Huber <huberjn@outlook.com> |
[ELF] Attempt to fix test on big endian architectures
Summary: This test fails because AMDGPU has a check for little-endianness before returning the architecture. This test attempts to force the typ
[ELF] Attempt to fix test on big endian architectures
Summary: This test fails because AMDGPU has a check for little-endianness before returning the architecture. This test attempts to force the type to be considered little-endian for the purpose of this test.
show more ...
|
#
3b337bbc |
| 05-Jan-2024 |
Joseph Huber <huberjn@outlook.com> |
[ELF] Attempt to set the OS when using 'makeTriple()' (#76992)
Summary: This patch fixes up the `makeTriple()` interface to emit append the operating system information when it is readily avaialble
[ELF] Attempt to set the OS when using 'makeTriple()' (#76992)
Summary: This patch fixes up the `makeTriple()` interface to emit append the operating system information when it is readily avaialble from the ELF. The main motivation for this is so the GPU architectures can be easily identified correctly when given and ELF. E.g. we want `amdgpu-amd-amdhsa` as the output and not `amdgpu--`.
This required adding support for the CUDA OS/ABI, which is easily found to be `0x33` when using `readelf`.
show more ...
|
#
105adf2c |
| 12-Dec-2023 |
Micah Weston <micahsweston@gmail.com> |
[SHT_LLVM_BB_ADDR_MAP] Implements PGOAnalysisMap in Object and ObjectYAML with tests.
Reviewed in PR (#71750). A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch
and Block
Analy
[SHT_LLVM_BB_ADDR_MAP] Implements PGOAnalysisMap in Object and ObjectYAML with tests.
Reviewed in PR (#71750). A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch
and Block
Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).
This PR adds the PGOAnalysisMap data structure and implements encoding and
decoding through Object and ObjectYAML along with associated tests. When
emitted into the bb-addr-map section, each function is followed by the associated
pgo-analysis-map for that function. The emitting of each analysis in the map is
controlled by a bit in the bb-addr-map feature byte. All existing bb-addr-map
code can ignore the pgo-analysis-map if the caller does not request the data.
show more ...
|
Revision tags: llvmorg-17.0.6 |
|
#
fab690d6 |
| 17-Nov-2023 |
Rahman Lavaee <rahmanl@google.com> |
[NFC][SHT_LLVM_BB_ADDR_MAP] Define and use constructor and accessors for BBAddrMap fields. (#72689)
The fields are still kept as public for now since our tooling accesses
them. Will change them to
[NFC][SHT_LLVM_BB_ADDR_MAP] Define and use constructor and accessors for BBAddrMap fields. (#72689)
The fields are still kept as public for now since our tooling accesses
them. Will change them to private visibility in a later patch.
show more ...
|
Revision tags: llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
9c3c6f6a |
| 24-May-2023 |
Rahman Lavaee <rahmanl@google.com> |
[Propeller] Add HasIndirectBranch to BBEntry::Metadata.
This information helps to avoid considering cloning for blocks with indirect branches.
Reviewed By: jhenderson
Differential Revision: https:
[Propeller] Add HasIndirectBranch to BBEntry::Metadata.
This information helps to avoid considering cloning for blocks with indirect branches.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D150611
show more ...
|
Revision tags: llvmorg-16.0.4 |
|
#
5ac48ef5 |
| 11-May-2023 |
Rahman Lavaee <rahmanl@google.com> |
[Propeller] Use a bit-field struct for the metdata fields of BBEntry.
This patch encapsulates the encoding and decoding logic of basic block metadata into the Metadata struct, and also reduces the d
[Propeller] Use a bit-field struct for the metdata fields of BBEntry.
This patch encapsulates the encoding and decoding logic of basic block metadata into the Metadata struct, and also reduces the decoded size of `SHT_LLVM_BB_ADDR_MAP` section.
The patch would've looked more readable if we could use designated initializer, but that is a c++20 feature.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D148360
show more ...
|
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
|
#
9c645a99 |
| 25-Feb-2023 |
Aiden Grossman <agrossman154@yahoo.com> |
[ELF] Move getSectionAndRelocations to ELF.cpp from ELFDumper.cpp
This refactoring will allow for this utility function to be used in other places in the codebase outside of the llvm-readobj tool.
[ELF] Move getSectionAndRelocations to ELF.cpp from ELFDumper.cpp
This refactoring will allow for this utility function to be used in other places in the codebase outside of the llvm-readobj tool.
Reviewed By: jhenderson, rahmanl
Differential Revision: https://reviews.llvm.org/D144783
show more ...
|
Revision tags: llvmorg-16.0.0-rc3 |
|
#
d768bf99 |
| 10-Feb-2023 |
Archibald Elliott <archibald.elliott@arm.com> |
[NFC][TargetParser] Replace uses of llvm/Support/Host.h
The forwarding header is left in place because of its use in `polly/lib/External/isl/interface/extract_interface.cc`, but I have added a GCC w
[NFC][TargetParser] Replace uses of llvm/Support/Host.h
The forwarding header is left in place because of its use in `polly/lib/External/isl/interface/extract_interface.cc`, but I have added a GCC warning about the fact it is deprecated, because it is used in `isl` from where it is included by Polly.
show more ...
|
Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1 |
|
#
7fc87159 |
| 25-Jan-2023 |
Paul Robinson <paul.robinson@sony.com> |
[unittests] Use GTEST_SKIP() instead of return when appropriate
Basically NFC: A TEST/TEST_F/etc that bails out early (usually because setup failed or some other runtime condition wasn't met) genera
[unittests] Use GTEST_SKIP() instead of return when appropriate
Basically NFC: A TEST/TEST_F/etc that bails out early (usually because setup failed or some other runtime condition wasn't met) generally should use GTEST_SKIP() to report its status correctly, unless it takes steps to report another status (e.g., FAIL()).
I did see a handful of tests show up as SKIPPED after this change, which is not unexpected. The status seemed appropriate in all the new cases.
show more ...
|
Revision tags: llvmorg-17-init, llvmorg-15.0.7 |
|
#
3d6841b2 |
| 07-Dec-2022 |
Rahman Lavaee <rahmanl@google.com> |
[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to as
[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
####Background Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows. - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly. - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to. - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point. - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks). - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped. - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else. ####Solution We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
####Impact on Size of the `LLVM_BB_ADDR_MAP` Section Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
show more ...
|
#
8a0e0c22 |
| 15-Jan-2023 |
Sergei Barannikov <barannikov88@gmail.com> |
[NFC] Use `llvm::enumerate` in llvm/unittests/Object
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D141788
|
#
310f7652 |
| 26-Dec-2022 |
Andrei Safronov <andrei.safronov@espressif.com> |
[Xtensa 2/10] Add Xtensa ELF definitions
Add file with Xtensa ELF relocations. Add Xtensa support to ELF.h, ELFObject.h and ELFYAML.cpp. Add simple test of Xtensa ELF representation in YAML.
Differ
[Xtensa 2/10] Add Xtensa ELF definitions
Add file with Xtensa ELF relocations. Add Xtensa support to ELF.h, ELFObject.h and ELFYAML.cpp. Add simple test of Xtensa ELF representation in YAML.
Differential Revision: https://reviews.llvm.org/D64827
show more ...
|
#
96b6ee1b |
| 13-Dec-2022 |
Rahman Lavaee <rahmanl@google.com> |
Revert "[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number."
This reverts commit 6015a045d768feab3bae9ad9c0c81e118df8b04a.
Differential Revision: https://reviews.llvm.org/D1
Revert "[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number."
This reverts commit 6015a045d768feab3bae9ad9c0c81e118df8b04a.
Differential Revision: https://reviews.llvm.org/D139952
show more ...
|
#
6015a045 |
| 07-Dec-2022 |
Rahman Lavaee <rahmanl@google.com> |
[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to as
[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
####Background Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows. - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly. - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to. - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point. - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks). - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped. - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else. ####Solution We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
####Impact on Size of the `LLVM_BB_ADDR_MAP` Section Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
show more ...
|
#
c302fb5c |
| 04-Dec-2022 |
Fangrui Song <i@maskray.me> |
[Object] llvm::Optional => std::optional
|
#
b6a01caa |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[llvm/unittests] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the am
[llvm/unittests] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
0aa6df65 |
| 28-Jun-2022 |
Rahman Lavaee <rahmanl@google.com> |
[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks.
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This mean
[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks.
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.
Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol). Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary). The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121346
show more ...
|