ModuleSummaryAnalysis.cpp - OpenGrok history log for /llvm-project/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9513f2fd	15-Nov-2024	Teresa Johnson <tejohnson@google.com>	[MemProf] Print full context hash when reporting hinted bytes (#114465) Improve the information printed when -memprof-report-hinted-sizes is enabled. Now print the full context hash computed from the original profile, similar to what we do when reporting matching statistics. This will make it easier to correlate with the profile. Note that the full context hash must be computed at profile match time and saved in the metadata and summary, because we may trim the context during matching when it isn't needed for distinguishing hotness. Similarly, due to the context trimming, we may have more than one full context id and total size pair per MIB in the metadata and summary, which now get a list of these pairs. Remove the old aggregate size from the metadata and summary support. One other change from the prior support is that we no longer write the size information into the combined index for the LTO backends, which don't use this information, which reduces unnecessary bloat in distributed index files.
# 236fda55	06-Nov-2024	Kazu Hirata <kazu@google.com>	[Analysis] Remove unused includes (NFC) (#114936) Identified with misc-include-cleaner.
Revision tags: llvmorg-19.1.3
# 5995e4b9	18-Oct-2024	Teresa Johnson <tejohnson@google.com>	[MemProf] Disable memprof ICP support by default (#112940) A failure showed up after this was committed, rather than revert simply disable this new support to simplify investigation and further testing.
# 6264288d	18-Oct-2024	Teresa Johnson <tejohnson@google.com>	[MemProf] Fix the option to disable memprof ICP (#112917) The -enable-memprof-indirect-call-support meant to guard the recently added memprof ICP support was not used in enough places. Specifically, it was not checked in mayHaveMemprofSummary, which is called from the ThinLTO backend applyImports. This led to failures when checking the callsite records, as we incorrectly expected records for indirect calls. Fix the option to be checked in all necessary locations, and add testing.
Revision tags: llvmorg-19.1.2
# 1de71652	11-Oct-2024	Teresa Johnson <tejohnson@google.com>	[MemProf] Support cloning for indirect calls with ThinLTO (#110625) This patch enables support for cloning in indirect callsites. This is done by synthesizing callsite records for each virtual call target from the profile metadata. In the thin link all the synthesized records for a particular indirect callsite initially share the same context node, but support is added to partition the callsites and outgoing edges based on the callee function, creating a separate node for each target. In the LTO backend, when cloning is needed we first perform indirect call promotion, then change the target of the new direct call to the desired clone. Note this is ThinLTO-specific, since for regular LTO indirect call promotion should have already occurred.
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# 51d3829d	07-Sep-2024	Kazu Hirata <kazu@google.com>	[ThinLTO] Shrink FunctionSummary by 8 bytes (#107706) During the ThinLTO indexing step for one of our large applications, we create 4 million instances of FunctionSummary. Changing: std::vector<EdgeTy> CallGraphEdgeList; to: SmallVector<EdgeTy, 0> CallGraphEdgeList; in FunctionSummary reduces the size of each instance by 8 bytes. The rest of the patch makes the same change to other places so that the types stay compatible across function boundaries.
# d4ddf06b	06-Sep-2024	Mingming Liu <mingmingl@google.com>	[NFCI]Remove EntryCount from FunctionSummary and clean up surrounding synthetic count passes. (#107471) The primary motivation is to remove `EntryCount` from `FunctionSummary`. This frees 8 bytes out of `sizeof(FunctionSummary)` (136 bytes as of https://github.com/llvm/llvm-project/commit/64498c54831bed9cf069e0923b9b73678c6451d8). While I'm at it, this PR clean up {SummaryBasedOptimizations, SyntheticCountsPropagation} since they were not used and there are no plans to further invest on them. With this patch, bitcode writer writes a placeholder 0 at the byte offset of `EntryCount` and bitcode reader can parse the function entry count at the correct byte offset. Added a TODO to stop writing `EntryCount` and bump bitcode version
# 0ffa377c	06-Sep-2024	Kazu Hirata <kazu@google.com>	[ThinLTO] Shrink GlobalValueSummary by 8 bytes (#107342) During the ThinLTO indexing step for one of our large applications, we create 7.5 million instances of GlobalValueSummary. Changing: std::vector<ValueInfo> RefEdgeList; to: SmallVector<ValueInfo, 0> RefEdgeList; in GlobalValueSummary reduces the size of each instance by 8 bytes. The rest of the patch makes the same change to other places so that the types stay compatible across function boundaries.
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 9f8205d9	11-Jul-2024	Teresa Johnson <tejohnson@google.com>	[MemProf] Track and report profiled sizes through cloning (#98382) If requested, via the -memprof-report-hinted-sizes option, track the total profiled size of each MIB through the thin link, then report on the corresponding allocation coldness after all cloning is complete. To save size, a different bitcode record type is used for the allocation info when the option is specified, and the sizes are kept separate from the MIBs in the index.
# b0ae923a	22-Jun-2024	Kazu Hirata <kazu@google.com>	[ProfileData] Add a variant of getValueProfDataFromInst (#95993) This patch adds a variant of getValueProfDataFromInst that returns std::vector<InstrProfValueData> instead of std::unique<InstrProfValueData[]>. The new return type carries the length with it, so we can drop out parameter ActualNumValueData. Also, the caller can directly feed the return value into a range-based for loop as shown in the patch. I'm planning to migrate other callers of getValueProfDataFromInst to the new variant in follow-up patches.
# 2c2f4905	18-Jun-2024	Kazu Hirata <kazu@google.com>	[Analysis] Clean up getPromotionCandidatesForInstruction (NFC) (#95624) Callers of getPromotionCandidatesForInstruction pass NumVals as an out parameter for the number of value-count pairs of the value profiling data, but nobody uses the out parameter. This patch removes the parameter and updates the callers. Note that the number of value-count pairs is still available via getPromotionCandidatesForInstruction(...).size().
Revision tags: llvmorg-18.1.8
# d3342e5b	12-Jun-2024	Abhina Sree <69635948+abhina-sree@users.noreply.github.com>	[SystemZ][z/OS] Continue marking text files with OF_Text (#95111) Text files should be opened with OF_Text to have the correct encoding.
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6
# fa9b1be4	16-May-2024	Mingming Liu <mingmingl@google.com>	[ThinLTO]Mark referencers of local ifunc not eligible for import (#92431) If an ifunc has local linkage, do not add it into ref edges and mark its referencer (a function or global variable) not eligible for import. An ifunc doesn't have summary and ThinLTO cannot promote it. Importing the referencer may cause linkage errors. To reference a similar fix, https://reviews.llvm.org/D158961 marks callers of local ifunc not eligible for import to fix https://github.com/llvm/llvm-project/issues/58740
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4
# 58c5f50f	11-Apr-2024	Leonard Chan <leonardchan@google.com>	Reapply "[llvm] Teach GlobalDCE about dso_local_equivalent" Also reapply "[llvm] Teach whole program devirtualization about relative vtables" This reverts commit 1c604a9780fcfe92a99d539913553f0835b81de3 and 474f5efebed24547e76d022f0c5ffcc9db97ce6f.
# c0b77e0a	15-Apr-2024	Leonard Chan <leonardchan@google.com>	Revert "Reapply "[llvm] Teach whole program devirtualization about relative vtables"" This reverts commit 09c3bfe9b3eb47a2af0c10531b25f90cfb5fa9f4.
# 09c3bfe9	11-Apr-2024	Leonard Chan <leonardchan@google.com>	Reapply "[llvm] Teach whole program devirtualization about relative vtables" This reverts commit 474f5efebed24547e76d022f0c5ffcc9db97ce6f.
# dda73336	11-Apr-2024	Mingming Liu <mingmingl@google.com>	[ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597) The motivating use case is to support import the function declaration across modules to construct call graph edges for indirect calls [1] when importing the function definition costs too much compile time (e.g., the function is too large has no `noinline` attribute). 1. Currently, when the compiled IR module doesn't have a function definition but its postlink combined summary contains the function summary or a global alias summary with this function as aliasee, the function definition will be imported from source module by IRMover. The implementation is in FunctionImporter::importFunctions [2] 2. In order for FunctionImporter to import a declaration of a function, both function summary and alias summary need to carry the def / decl state. Specifically, all existing summary fields doesn't differ across import modules, but the def / decl state of is decided by `<ImportModule, Function>`. This change encodes the def/decl state in `GlobalValueSummary::GVFlags`. In the subsequent changes 1. The indexing step `computeImportForModule` [3] will compute the set of definitions and the set of declarations for each module, and passing on the information to bitcode writer. 2. Bitcode writer will look up the def/decl state and sets the state when it writes out the flag value. This is demonstrated in https://github.com/llvm/llvm-project/pull/87600 3. Function importer will read the def/decl state when reading the combined summary to figure out two sets of global values, and IRMover will be updated to import the declaration (aka linkGlobalValuePrototype [4]) into the destination module.
- The next change is https://github.com/llvm/llvm-project/pull/87600 [1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5 [2] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Transforms/IPO/FunctionImport.cpp#L1608-L1764 [3] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Transforms/IPO/FunctionImport.cpp#L856 [4] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Linker/IRMover.cpp#L605
Revision tags: llvmorg-18.1.3
# 1e15371d	01-Apr-2024	Mingming Liu <mingmingl@google.com>	[ThinLTO][TypeProf] Implement vtable def import (#79381) Add annotated vtable GUID as referenced variables in per function summary, and update bitcode writer to create value-ids for these referenced vtables. - This is the part3 of type profiling work, and described in the "Virtual Table Definition Import" [1] section of the RFC. [1] https://github.com/llvm/llvm-project/pull/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
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 88fbc4d3	06-Dec-2023	Teresa Johnson <tejohnson@google.com>	[ThinLTO] Add tail call flag to call edges in summary (#74043) This adds support for a HasTailCall flag on function call edges in the ThinLTO summary. It is intended for use in aiding discovery of missing frames from tail calls in profiled call stacks for MemProf of profiled binaries that did not disable tail call elimination. A follow on change will add the use of this new flag during MemProf context disambiguation. The new flag is encoded in the bitcode along with either the hotness flag from the profile, or the relative block frequency under the -write-relbf-to-summary flag when there is no profile data. Because we now will always have some additional call edge information, I have removed the non-profile function summary record format, and we simply encode the tail call flag along with a hotness type of none when there is no profile information or relative block frequency. The change of record format and name caused most of the test case changes. I have added explicit testing of generation of the new tail call flag into the bitcode and IR assembly format as part of the changes to llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also added round trip testing through assembly and bitcode to llvm/test/Assembler/thinlto-summary.ll.
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3
# 5181156b	05-Oct-2023	Matthias Braun <matze@braunis.de>	Use BlockFrequency type in more places (NFC) (#68266) The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it more consistently in various APIs and disable implicit conversion to make usage more consistent and explicit. - Use `BlockFrequency Freq` parameter for `setBlockFreq`, `getProfileCountFromFreq` and `setBlockFreqAndScale` functions. - Return `BlockFrequency` in `getEntryFreq()` functions. - While on it change some `const BlockFrequency& Freq` parameters to plain `BlockFreqency Freq`. - Mark `BlockFrequency(uint64_t)` constructor as explicit. - Add missing `BlockFrequency::operator!=`. - Remove `uint64_t BlockFreqency::getMaxFrequency()`. - Add `BlockFrequency BlockFrequency::max()` function.
Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4
# 6276927b	29-Aug-2023	Fangrui Song <i@maskray.me>	[ThinLTO] Mark callers of local ifunc not eligible for import Fix https://github.com/llvm/llvm-project/issues/58740 The `target_clones` attribute results in ifunc on eligible targets (Linux glibc/Android or FreeBSD). If the function has internal linkage, we will get an internal linkage ifunc. ``` __attribute__((target_clones("popcnt", "default"))) static int foo(int n) { return __builtin_popcount(n); } int use(int n) { return foo(n); } @foo.ifunc = internal ifunc i32 (i32), ptr @foo.resolver define internal nonnull ptr @foo.resolver() comdat { ; local linkage comdat is another issue that should be fixed ... select i1 %.not, ptr @foo.default.1, ptr @foo.popcnt.0 ... } define internal i32 @foo.default.1(i32 noundef %n) ``` ifuncs are not included in module summaries, so LTO doesn't know the local linkage `foo.default.1` referenced by `foo.resolver` should be promoted. If a caller of `foo` (e.g. `use`) is imported, the local linkage `foo.resolver` will be cloned as a definition (IRLinker::shouldLink), leading to linker errors. ``` ld.lld: error: undefined hidden symbol: foo.default.1.llvm.8017227050314953235 >>> referenced by bar.c >>> lto.tmp:(foo.ifunc) ``` As a simple fix, just mark `use` as not eligible for import. Non-local linkage ifuncs do not have the problem, because they are not imported, and not cloned when a caller is imported. --- https://reviews.llvm.org/D82745 contains a more involved fix, though the original bug it intended to fix (https://github.com/llvm/llvm-project/issues/45833) now works. Note: importing ifunc is tricky.
If we import an ifunc, we need to make sure the resolver and the implementation are in the translation unit, as required by https://sourceware.org/glibc/wiki/GNU_IFUNC > Requirement (a): Resolver must be defined in the same translation unit as the implementations. This is infeasible if the implementation is changed to available_externally. In addition, the imported ifunc may be referenced by two translation units. This doesn't work with PowerPC32 -msecure-plt (https://maskray.me/blog/2021-01-18-gnu-indirect-function). At the very least, every referencing translation unit needs one extra IRELATIVE dynamic relocation. At least for the local linkage ifunc case, it doesn't have much use outside of `target_clones`, as a global pointer is usually a better replacement. I think ifuncs just have too many pitfalls to design more IR features around it to optimize them. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D158961
Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 1b162fab	25-Jul-2023	Fangrui Song <i@maskray.me>	[Support] Change SetVector's default template parameter to SmallVector<*, 0> Similar to D156016 for MapVector. This brings back commit fae7b98c221b5b28797f7b56b656b6b819d99f27 with a fix to llvm/unittests/Support/ThreadPool.cpp's `_WIN32` code path.
Revision tags: llvmorg-18-init
# 3d83912c	25-Jul-2023	Simon Pilgrim <llvm-dev@redking.me.uk>	Revert rGfae7b98c221b5b28797f7b56b656b6b819d99f27 "[Support] Change SetVector's default template parameter to SmallVector<*, 0>" This is failing on Windows MSVC builds: llvm\unittests\Support\ThreadPool.cpp(380): error C2440: 'return': cannot convert from 'Vector' to 'std::vector<llvm::BitVector,std::allocator<llvm::BitVector>>' with [ Vector=llvm::SmallVector<llvm::BitVector,0> ]
# fae7b98c	25-Jul-2023	Fangrui Song <i@maskray.me>	[Support] Change SetVector's default template parameter to SmallVector<*, 0> Similar to D156016 for MapVector.
# fb2a971c	25-Jul-2023	Fangrui Song <i@maskray.me>	[Support] Change MapVector's default template parameter to SmallVector<*, 0> SmallVector<*, 0> is often a better replacement for std::vector : both the object size and the code size are smaller. (SmallMapVector uses SmallVector as well, but it is not common.) clang size decreases by 0.0226%. instructions:u decreases 0.037% when compiling a sqlite3 amalgram. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D156016