History log of /llvm-project/llvm/lib/Transforms/IPO/FunctionImport.cpp (Results 1 – 25 of 320)
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 4312075e 06-Jan-2025 Mircea Trofin <mtrofin@google.com>

[nfc][thinlto] remove unnecessary return from `renameModuleForThinLTO` (#121851)

Same goes for `FunctionImportGlobalProcessing::run`.

The return value was used, but it was always `false`.


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5
# 6faf17b7 03-Dec-2024 Mingming Liu <mingmingl@google.com>

[ThinLTO]Supports declaration import for global variables in distributed ThinLTO (#117616)

When the `-import-declaration` option is enabled, declaration import is
supported for functions. https://github.com/llvm/llvm-project/pull/88024
has the context for this option.

This patch supports declaration import for global variables in
distributed ThinLTO. The motivating use case is to propagate `dso_local`
attribute of global variables across modules, to optimize global
variable access when a binary is built with
`-fno-direct-access-external-data`.
* With `-fdirect-access-external-data`, non thread-local global
variables will [have `dso_local`
attributes](https://github.com/llvm/llvm-project/blob/fe3c23b439b9a2d00442d9bc6a4ca86f73066a3d/clang/lib/CodeGen/CodeGenModule.cpp#L1730-L1746).
This optimizes the global variable access as shown by
https://gcc.godbolt.org/z/vMzWcKdh3


# 991154d0 27-Nov-2024 Krzysztof Pszeniczny <kpszeniczny@google.com>

[LTO] Use .at instead of .lookup to avoid copies. (NFC) (#117888)

`DenseMap::lookup` returns by value (because it default-creates the
returned value if the key isn't present in the map), which means that we
do a lot of copying here. Since we assert that something is present in
the returned value two lines below this call, it's safe to use `.at`
here instead.

Copying and then destroying dense maps here is responsible for 60% of
the time spent in LTO indexing in a large internal build.
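The difference between a by-value lookup and a by-reference lookup can be sketched outside of LLVM. This is only an illustration, not the patched code: `std::unordered_map` stands in for `DenseMap`, and `Payload`, `PayloadCopies`, `lookupByValue` and `lookupByRef` are hypothetical names introduced here to make the copies countable.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical global counter: number of Payload copies made so far.
static int PayloadCopies = 0;

// Hypothetical payload type that records every copy.
struct Payload {
  int Value = 0;
  Payload() = default;
  Payload(const Payload &Other) : Value(Other.Value) { ++PayloadCopies; }
  Payload &operator=(const Payload &Other) {
    Value = Other.Value;
    ++PayloadCopies;
    return *this;
  }
};

using Table = std::unordered_map<std::string, Payload>;

// Mimics a by-value lookup: returns a copy of the mapped value, or a
// default-constructed value when the key is absent.
Payload lookupByValue(const Table &T, const std::string &Key) {
  auto It = T.find(Key);
  if (It == T.end())
    return Payload();
  return It->second; // one copy per successful lookup
}

// Mimics an asserting lookup: the key must exist, and no copy is made.
const Payload &lookupByRef(const Table &T, const std::string &Key) {
  auto It = T.find(Key);
  assert(It != T.end() && "key must be present");
  return It->second;
}
```

When the mapped value is itself a large container, each call to the by-value form pays for a full copy and a later destruction, which is the cost the commit message describes.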


Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2
# 2edd897a 07-Oct-2024 Nuri Amari <nuri.amari99@gmail.com>

Make WriteIndexesThinBackend multi threaded (#109847)

We've noticed that for large builds executing thin-link can take on the
order of 10s of minutes. We are only using a single thread to write the
sharded indices and import files for each input bitcode file. While we
need to ensure the index file produced lists modules in a deterministic
order, that doesn't prevent us from executing the rest of the work in
parallel.

In this change we use a thread pool to execute as much of the backend's
work as possible in parallel. In local testing on a machine with 80
cores, this change makes a thin-link for ~100,000 input files run in ~2
minutes. Without this change it takes upwards of 10 minutes.

---------

Co-authored-by: Nuri Amari <nuriamari@fb.com>


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# 885ac299 12-Sep-2024 Mircea Trofin <mtrofin@google.com>

[nfc][ctx_prof] Change some internal "set" types

- the set used for targets under a callsite is simpler to use if iterators
are stable (it gets manipulated during updates)
- the set used to fetch the transitive closure of GUIDs under a node can
be left as a choice to the user.


# 3dad29b6 11-Sep-2024 Kazu Hirata <kazu@google.com>

[LTO] Remove unused includes (NFC) (#108110)

clangd reports these as unused headers. My manual inspection agrees
with the findings.


Revision tags: llvmorg-19.1.0-rc4
# 5c0d61e3 01-Sep-2024 Kazu Hirata <kazu@google.com>

[LTO] Reduce memory usage for import lists (#106772)

This patch reduces the memory usage for import lists by employing
memory-efficient data structures.

With this patch, an import list for a given destination module is
basically DenseSet<uint32_t> with each element indexing into the
deduplication table containing tuples of:

{SourceModule, GUID, Definition/Declaration}

In one of our large applications, the peak memory usage goes down by
9.2% from 6.120GB to 5.555GB during the LTO indexing step.

This patch addresses several sources of space inefficiency associated
with std::unordered_map:

- std::unordered_map<GUID, ImportKind> takes up 16 bytes because of
padding even though ImportKind only carries one bit of information.

- std::unordered_map uses pointers to elements, both in the hash table
proper and for collision chains.

- We allocate an instance of std::unordered_map for each
{Destination Module, Source Module} pair for which we have at least
one import. Most import lists have less than 10 imports, so the
metadata like the size of std::unordered_map and the pointer to the
hash table costs a lot relative to the actual contents.
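The layout described above (a per-destination set of 32-bit IDs indexing into a shared deduplication table) could look roughly like the following. This is a sketch under assumptions, not the LLVM implementation: it uses standard containers instead of `DenseSet`, and `ImportIDTable`, `ImportEntry`, `ImportList` and `intern` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <unordered_set>
#include <vector>

enum class ImportKind { Definition, Declaration };

// One deduplicated entry: {SourceModule, GUID, Definition/Declaration}.
using ImportEntry = std::tuple<std::string, uint64_t, ImportKind>;

// Hypothetical deduplication table shared by all destination modules.
class ImportIDTable {
  std::map<ImportEntry, uint32_t> Ids; // entry -> compact ID
  std::vector<ImportEntry> Entries;    // compact ID -> entry

public:
  // Returns the ID of E, assigning the next free ID on first sight.
  uint32_t intern(const ImportEntry &E) {
    auto Res = Ids.emplace(E, static_cast<uint32_t>(Entries.size()));
    if (Res.second)
      Entries.push_back(E);
    return Res.first->second;
  }

  const ImportEntry &get(uint32_t Id) const { return Entries.at(Id); }
  std::size_t size() const { return Entries.size(); }
};

// A per-destination import list is then just a set of 32-bit IDs.
using ImportList = std::unordered_set<uint32_t>;
```

With this shape, the per-entry cost in each import list is a 4-byte ID, and the tuple itself is stored once no matter how many destination modules import it.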


# eb9c49c9 28-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Make getImportType a proper function (NFC) (#106450)

I'm planning to reduce the memory footprint of ThinLTO indexing by
changing ImportMapTy. A look-up of the import type will involve data
private to ImportMapTy, so it must be done by a member function of
ImportMapTy. This patch turns getImportType into a member function so
that a subsequent "real" change will just have to update the
implementation of the function in place.


# 4f15039c 28-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Introduce new type alias ImportListsTy (NFC) (#106420)

The background is as follows. I'm planning to reduce the memory
footprint of ThinLTO indexing by changing ImportMapTy, the data
structure used for an import list. Once this patch lands, I'm
planning to change the type slightly. The new type alias allows us to
update the type without touching many places.


# 29bb523b 27-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Introduce a helper lambda in gatherImportedSummariesForModule (NFC) (#106251)

This patch forward ports the heterogeneous std::map::operator[]() from
C++26 so that we can look up the map without allocating an instance of
std::string when the key-value pair exists in the map.

The background is as follows. I'm planning to reduce the memory
footprint of ThinLTO indexing by changing ImportMapTy, the data
structure used for an import list. The new list will be a hash set of
tuples (SourceModule, GUID, ImportType) represented in a space
efficient manner. That means that as we iterate over the hash set, we
encounter SourceModule as many times as GUID. We don't want to create
a temporary instance of std::string every time we look up
ModuleToSummariesForIndex like:

auto &SummariesForIndex =
ModuleToSummariesForIndex[std::string(ILI.first)];

This patch removes the need to create the temporaries by enabling
heterogeneous lookup with std::map<K, V, std::less<>> and forward
porting std::map::operator[]() from C++26.
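A minimal sketch of the idea, assuming a map with a transparent comparator: the key is looked up as a `std::string_view`, and a `std::string` is only constructed when the entry has to be created. The helper name `lookupOrCreate` and the `int` mapped type are placeholders for illustration; the actual patch uses a lambda and maps to summary data.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator, so find() accepts any key type
// that can be compared against std::string, such as std::string_view.
using SummaryMap = std::map<std::string, int, std::less<>>;

// Hypothetical helper: behaves like operator[], but only materializes a
// std::string when the key is not in the map yet.
int &lookupOrCreate(SummaryMap &M, std::string_view Key) {
  auto It = M.find(Key); // heterogeneous lookup, no std::string is built
  if (It == M.end())
    It = M.try_emplace(std::string(Key)).first; // value-initialized to 0
  return It->second;
}
```

Repeated lookups of a module that is already in the map then avoid the temporary string entirely, which matters when the same source module is seen once per imported GUID.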


# 0359b9a2 27-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Introduce a helper function collectImportStatistics (NFC) (#106179)

This patch introduces a helper function collectImportStatistics. The
new function computes statistics of imports for
ComputeCrossModuleImport and dumpImportListForModule with no
functional change.

The background is as follows. I'm planning to reduce the memory
footprint of ThinLTO indexing by changing ImportMapTy, the data
structure used for an import list. The new list will be a hash set of
tuples (SourceModule, GUID, ImportType) represented in a space
efficient manner. That means that obtaining statistics like the
number of definitions per source module requires us to go through the
entire import list (for a given destination module).

Introducing a helper function now makes the callers more independent
of the underlying data structures used in ImportMapTy.


# 4e30cf7b 26-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Introduce getSourceModules (NFC) (#105955)

This patch introduces getSourceModules to compute the list of source
modules in ascending alphabetical order. The new function is
intended to hide implementation details of ImportMapTy while
simplifying FunctionImporter::importFunctions a little bit.


# dbd7ce0c 24-Aug-2024 Kazu Hirata <kazu@google.com>

[IR] Inroduce ModuleToSummariesForIndexTy (NFC) (#105906)

This patch introduces type alias ModuleToSummariesForIndexTy.

I'm planning to change the type slightly to allow heterogeneous lookup
(that is, std::map<K, V, std::less<>>) in a subsequent patch. The
problem is that changing the type affects many places. Using a type
alias reduces the impact.


# 35639079 23-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Turn ImportMapTy into a proper class (NFC) (#105748)

This patch turns type alias ImportMapTy into a proper class to provide
a more intuitive interface like:

ImportList.addDefinition(...)

as opposed to:

FunctionImporter::addDefinition(ImportList, ...)

Also, this patch requires all non-const accesses to go through
addDefinition, maybeAddDeclaration, and addGUID while providing const
accesses via:

const ImportMapTyImpl &getImportMap() const { return ImportMap; }

I realize ImportMapTy may not be the best name as a class (maybe OK as
a type alias). I am not renaming ImportMapTy in this patch at least
because there are 47 mentions of ImportMapTy under llvm/.


# ca48b015 22-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Use a helper function to add a definition (NFC) (#105721)

I missed this one when I introduced helper functions in:

commit 3082a381f57ef2885c270f41f2955e08c79634c5
Author: Kazu Hirata <kazu@google.com>
Date: Thu Aug 22 12:06:47 2024 -0700


# 3082a381 22-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Introduce helper functions to add GUIDs to ImportList (NFC) (#105555)

The new helper functions make the intent clearer while hiding
implementation details, including how we handle previously added
entries. Note that:

- If we are adding a GUID as a GlobalValueSummary::Definition, then we
override a previously added GlobalValueSummary::Declaration entry
for the same GUID.

- If we are adding a GUID as a GlobalValueSummary::Declaration, then a
previously added GlobalValueSummary::Definition entry for the same
GUID takes precedence, and no change is made.
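The precedence rules in the two bullets above can be sketched with a small stand-in class. This is a simplified illustration keyed by GUID only, not the real `ImportMapTy`; the class name `ImportList` and its storage are assumptions, while the method names follow the helpers named in these commit messages.

```cpp
#include <cstdint>
#include <unordered_map>

enum class ImportKind { Definition, Declaration };

// Hypothetical, simplified import list keyed by GUID only.
class ImportList {
  std::unordered_map<uint64_t, ImportKind> Map;

public:
  // A definition always wins: it overrides an earlier declaration entry.
  void addDefinition(uint64_t GUID) { Map[GUID] = ImportKind::Definition; }

  // A declaration is recorded only if the GUID has no entry yet, so an
  // existing definition is left untouched.
  void maybeAddDeclaration(uint64_t GUID) {
    Map.emplace(GUID, ImportKind::Declaration); // no-op if already present
  }

  // Read-only access for callers.
  const std::unordered_map<uint64_t, ImportKind> &getImportMap() const {
    return Map;
  }
};
```

Because both rules live inside the helpers, the final state for a GUID does not depend on whether the definition or the declaration was added first.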


# fdbc4089 21-Aug-2024 Kazu Hirata <kazu@google.com>

[LTO] Compare std::optional<ImportKind> directly with ImportKind (NFC) (#105561)

Note that:

Opt == Val if and only if (Opt && *Opt == Val)

where:

std::optional<T> Opt;
T Val;
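As a small illustration of that equivalence, the two functions below return the same result for every input. The enum and function names are placeholders introduced for this sketch.

```cpp
#include <optional>

enum class ImportKind { Definition, Declaration };

// Spelled-out form: the optional must be engaged and hold the value.
bool isDefinitionVerbose(std::optional<ImportKind> Opt) {
  return Opt && *Opt == ImportKind::Definition;
}

// Direct comparison: an empty optional never compares equal to a value.
bool isDefinition(std::optional<ImportKind> Opt) {
  return Opt == ImportKind::Definition;
}
```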


Revision tags: llvmorg-19.1.0-rc3
# 6807ca8e 13-Aug-2024 Mircea Trofin <mtrofin@google.com>

[nfc][ctx_prof] Use one flag for the "use" scenario (#103377)

No need to have two flags, one for the thinlink and one for compilation.


# 51a3bc12 09-Aug-2024 Mingming Liu <mingmingl@google.com>

[ThinLTO]Clean up 'import-assume-unique-local' flag. (#102424)

While manual compiles can specify full file paths and build automation
tools use full, unique paths in practice, it's not clear whether it's
generally good practice to enforce full paths (that is, to fail a build
if relative paths are used).

The `NumDefs == 1` condition [1] should hold for many internal-linkage
vtables as long as full paths are indeed used, and that is enough to
salvage the marginal performance gain from importing local-linkage
vtables through indirect references.
https://github.com/llvm/llvm-project/pull/100448#discussion_r1692068402
has more details.

[1]
https://github.com/llvm/llvm-project/pull/100448/files#diff-e7cb370fee46f0f773f2b5429dfab36b75126d3909ae98ee87ff3d0e3f75c6e9R215


Revision tags: llvmorg-19.1.0-rc2
# c99bd3ce 29-Jul-2024 Mircea Trofin <mtrofin@google.com>

[ctx_prof] Extend `WorkloadImportsManager` to use the contextual profile (#98682)

Keeping the json-based input as it's useful for diagnostics or for driving the import by other means than contextual composition.

The support for the contextual profile is just another modality for constructing the import list (`WorkloadImportsManager::Workloads`).

Everything else - i.e. the actual importing logic - is already independent from how that list was obtained.


Revision tags: llvmorg-19.1.0-rc1
# ba8883c4 25-Jul-2024 Mingming Liu <mingmingl@google.com>

Fix buildbot failure by fixing the base pointer type (#100508)

This should fix buildbot failures like
https://lab.llvm.org/buildbot/#/builders/169/builds/1448


# ac1a1e57 25-Jul-2024 Mingming Liu <mingmingl@google.com>

[ThinLTO][TypeProf] Import local-linkage global var for mod1:func_foo-> mod2:local-var edge (#100448)

VTable value profiling can create reference edges from `mod1:func_foo`
to `mod2:local-vtable`. Indirect call profiling can create reference
edges from `mod1:func_foo` to `mod2:local_func_bar`.

Given a ref chain `mod1:func_foo -> mod2:local-var`, `local-var` doesn't
get imported by default.

The compiler requires that the module of `local-var` be the same as the
module of the function that references it (`mod1:func_foo`). This prevents
miscompilation when both `mod1` and `mod2` have a `local-var` of the same
name and the cpp files are compiled without full paths.

This patch allows the import when one of the following conditions
holds:
1) The new option `import-assume-unique-local` is set. Compiler users can
set this option when they can guarantee that all files are compiled with
full paths.
2) There is only one instance of the value summary.
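Read as a predicate, the rule and its two relaxations might be sketched as follows. Everything here is hypothetical: the real check operates on summary data inside the ThinLTO import logic, and `LocalRefQuery` and `mayImportLocalRef` are names made up for this illustration.

```cpp
// Hypothetical, simplified inputs for the decision.
struct LocalRefQuery {
  bool DefinedInReferrerModule; // local symbol lives in the referencing module
  bool AssumeUniqueLocal;       // user guarantees full, unique source paths
  unsigned NumDefs;             // number of summaries found for the GUID
};

// Returns true when a reference to a local-linkage symbol may be followed.
bool mayImportLocalRef(const LocalRefQuery &Q) {
  if (Q.DefinedInReferrerModule)
    return true; // the pre-existing rule
  // The two conditions added by this patch.
  return Q.AssumeUniqueLocal || Q.NumDefs == 1;
}
```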

Test:
* A/B testing this option alone gives a statistically consistent 0.16%
reduction in CPU cycles on one search workload (no throughput increase).
* Testing it together with the existing more-efficient ICP raises the
throughput gain by a small margin (0.05%~0.1%).
* No regressions observed.


Revision tags: llvmorg-20-init
# 50fea994 09-Jul-2024 Mingming Liu <mingmingl@google.com>

Reland "[ThinLTO][Bitcode] Generate import type in bitcode" (#97253)

https://github.com/llvm/llvm-project/pull/87600 was reverted in order to
revert
https://github.com/llvm/llvm-project/commit/6262763341fcd71a2b0708cf7485f9abd1d26ba8.
https://github.com/llvm/llvm-project/pull/95482 is now the fix-forward for
https://github.com/llvm/llvm-project/commit/6262763341fcd71a2b0708cf7485f9abd1d26ba8.
This patch is a reland of
https://github.com/llvm/llvm-project/pull/87600.

**Changes on top of original patch**
In `llvm/include/llvm/IR/ModuleSummaryIndex.h`, make the type of
`GVSummaryPtrSet` an `unordered_set` which is more memory efficient when
the number of elements is smaller than 128 [1]

**Original commit message**

For distributed ThinLTO, the LTO indexing step generates combined
summary for each module, and postlink pipeline reads the combined
summary which stores the information for link-time optimization.

This patch populates the 'import type' of a summary in bitcode, and
updates bitcode reader to parse the bit correctly.

[1]
https://github.com/llvm/llvm-project/blob/393eff4e02e7ab3d234d246a8d6912c8e745e6f9/llvm/lib/Support/SmallPtrSet.cpp#L43


# af784a5c 03-Jul-2024 Mingming Liu <mingmingl@google.com>

[ThinLTO] Use a set rather than a map to track exported ValueInfos. (#97360)

https://github.com/llvm/llvm-project/pull/95482 is a reland of
https://github.com/llvm/llvm-project/pull/88024.
https://github.com/llvm/llvm-project/pull/95482 keeps indexing memory
usage reasonable by using unordered_map and doesn't make other changes
to originally reviewed code.

While discussing possible ways to minimize indexing memory usage, Teresa
asked whether `ExportSetTy` needs to be a map or whether a set is
sufficient. This PR implements the idea: it uses a set rather than a map
to track exported ValueInfos.

Currently, `ExportLists` has two use cases, and neither needs to track a
ValueInfo's import/export status. So using a set is sufficient and
correct.
1) In both in-process and distributed ThinLTO, it's used to decide if a
function or global variable is visible [1] from another module after importing
creates additional cross-module references.
* If a cross-module call edge is seen today, the callee must be visible
to another module without keeping track of its export status already.
For instance, this [2] is how callees of direct calls get exported.
2) For in-process ThinLTO [3], it's used to compute lto cache key.
* The cache key computation already hashes [4] 'ImportList' , and 'ExportList' is
determined by 'ImportList'. So it's fine to not track 'import type' for export list.

[1] https://github.com/llvm/llvm-project/blob/66cd8ec4c08252ebc73c82e4883a8da247ed146b/llvm/lib/LTO/LTO.cpp#L1815-L1819
[2] https://github.com/llvm/llvm-project/blob/66cd8ec4c08252ebc73c82e4883a8da247ed146b/llvm/lib/LTO/LTO.cpp#L1783-L1794
[3] https://github.com/llvm/llvm-project/blob/66cd8ec4c08252ebc73c82e4883a8da247ed146b/llvm/lib/LTO/LTO.cpp#L1494-L1496
[4] https://github.com/llvm/llvm-project/blob/b76100e220591fab2bf0a4917b216439f7aa4b09/llvm/lib/LTO/LTO.cpp#L194-L222


# 8d9db947 20-Jun-2024 Mingming Liu <mingmingl@google.com>

Reland "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#95482)

Make `FunctionsToImportTy` an `unordered_map` rather than `DenseMap`.
Credit goes to jvoung@ for the 'DenseMap -> unordered_map' change. This
is a reland of https://github.com/llvm/llvm-project/pull/92718

* `DenseMap` allocates space for a large number of key/value pairs and
wastes space when the number of elements is small.
* While the initial bucket count is zero [1], it quickly allocates buckets for 64 elements [2]
even when the number of elements is small (for example, 3 or 4 elements). The programmer's
manual [3] also mentions that it could waste space.
* Experiments show `FunctionsToImportTy.size()` is smaller than 4 for
multiple binaries with high indexing RAM usage. The growth factor of
`unordered_map` is at most 2 in libc++ [4] for insert operations.

With this change, `ComputeCrossModuleImport` ram increase is smaller
than 0.5G on a couple of binaries with high indexing ram usage. A wider
range of (pre-release) tests pass.

[1] https://github.com/llvm/llvm-project/blob/ad79a14c9e5ec4a369eed4adf567c22cc029863f/llvm/include/llvm/ADT/DenseMap.h#L431-L432
[2] https://github.com/llvm/llvm-project/blob/ad79a14c9e5ec4a369eed4adf567c22cc029863f/llvm/include/llvm/ADT/DenseMap.h#L849
[3] https://llvm.org/docs/ProgrammersManual.html#llvm-adt-densemap-h
[4] https://github.com/llvm/llvm-project/blob/ad79a14c9e5ec4a369eed4adf567c22cc029863f/libcxx/include/__hash_table#L1525-L1526

**Original commit message**
The goal is to populate `declaration` import status if a new flag
`-import-declaration` is on.

* For in-process ThinLTO, the `declaration` status is visible to backend
`function-import` pass, so `FunctionImporter::importFunctions` should
read the import status and be no-op for declaration summaries.
Basically, the postlink pipeline is updated to keep its current behavior
(import definitions), but not updated to handle `declaration` summaries.
Two use cases ([better call-graph
sort](https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5)
or [cross-module
auto-init](https://github.com/llvm/llvm-project/pull/87597#discussion_r1556067195))
would use this bit differently.

* For distributed ThinLTO, the `declaration` status is not serialized to
bitcode. As discussed, https://github.com/llvm/llvm-project/pull/87600
will do this.

