Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
6c3c90b5 |
| 07-Jan-2025 |
Lei Wang <wlei@fb.com> |
[CSSPGO]Add a flag to limit unsymbolized context depth (#121531)
Add a new flag (`--csprof-max-unsymbolized-context-depth`) that limits
only the unsymbolized context depth. Currently, `--csprof-max-context-depth`
applies to both symbolized and unsymbolized profile contexts, and there
are scenarios where it may not be flexible enough. For example, to limit
the context but still keep all the inlinings from the leaf frame, we
could set `--csprof-max-unsymbolized-context-depth` to a value >= 1.
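As a rough illustration of what a context depth limit means, the sketch below trims a calling context to its leaf-most frames. It is not taken from the patch: the function name, the list representation of a context, and the convention that a negative depth means "no limit" are all assumptions made for the example.

```python
def trim_context(frames, max_depth):
    """Keep only the leaf-most `max_depth` frames of a calling context.

    `frames` is ordered from the outermost caller to the leaf frame.
    A negative `max_depth` is treated as "no limit" (an assumption of
    this sketch, not a documented llvm-profgen behavior).
    """
    if max_depth < 0 or max_depth >= len(frames):
        return list(frames)
    # Drop the outermost callers and keep the frames closest to the leaf.
    return list(frames[len(frames) - max_depth:])


context = ["main", "foo", "bar", "baz"]
trimmed = trim_context(context, 2)  # keeps the two frames nearest the leaf
```

Applying a separate limit to unsymbolized contexts only, as the commit describes, would let the inlined frames recovered during symbolization survive even when the call-stack part of the context is cut short.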
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
23609a38 |
| 02-Aug-2024 |
Tim Creech <timothy.m.creech@intel.com> |
[llvm-profgen] Revert #99826 and #99026 (#100147)
Revert #99826 and #99026 to allow for additional input.
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
01d78364 |
| 22-Jul-2024 |
Tim Creech <timothy.m.creech@intel.com> |
[llvm-profgen] Add --sample-period to estimate absolute counts (#99826)
Without `--sample-period`, no assumptions are made about perf profile
sample frequencies. This is useful for comparing relative hotness of
different program locations within the same profile.
With `--sample-period`, LBR- and IP-based profile hit counts are
adjusted to estimate the absolute total event count for each program
location. This makes it reasonable to compare hit counts between
different profiles, e.g., between two LBR-based execution frequency
profiles with different sampling periods or between LBR-based execution
frequency profiles and IP-based branch mispredict profiles.
This functionality is in support of HWPGO[^1], which aims to enable
feedback from a wider range of hardware events.
[^1]: https://llvm.org/devmtg/2024-04/slides/TechnicalTalks/Xiao-EnablingHW-BasedPGO.pdf
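The idea of turning sample hits into estimated absolute event counts can be sketched as below. The commit message does not spell out the exact adjustment, so this assumes the simplest model, in which each sample stands for `sample_period` events; the function name and the location keys are hypothetical.

```python
def estimate_absolute_counts(hit_counts, sample_period):
    """Scale raw sample hits to an estimated absolute event count.

    Assumes each recorded sample represents `sample_period` events,
    which is the usual meaning of a fixed sampling period.
    """
    if sample_period <= 0:
        raise ValueError("sample_period must be positive")
    return {loc: hits * sample_period for loc, hits in hit_counts.items()}


# Two profiles collected with different periods become comparable
# once both are expressed as estimated event counts.
profile_a = estimate_absolute_counts({"foo:12": 30, "bar:7": 5}, 1000003)
profile_b = estimate_absolute_counts({"foo:12": 300, "bar:7": 50}, 100003)
```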
|
#
0caf0c93 |
| 21-Jul-2024 |
Tim Creech <timothy.m.creech@intel.com> |
[llvm-profgen] Support creating profiles of arbitrary events (#99026)
This change introduces two options which may be used to create profiles
of arbitrary PMU events.
1. `--leading-ip-only` provides a simple sample-IP-based profile mode.
This is not useful for building a profile of execution frequency, but it
is useful for building new types of profiles.
For example, to build a profile of unpredictable branches:
    perf record -b -e branch-misses:upp -o perf.data ...
    llvm-profgen --perfdata perf.data --leading-ip-only ...
2. `--perf-event=event` enables the creation of a profile concerned with
a specific event or set of events. The names given should match the
"event" field as emitted by perf-script(1).
This option has two spellings: `--perf-event` and `--perf-events`. The
plural spelling accepts a comma-separated list. The singular spelling
appends a single event name to the set of events which will be used.
This is meant to accommodate event names containing commas.
Combined, these options allow generating multiple kinds of profiles from
a single `perf record` collection. For example, to generate both
execution frequency and branch mispredict profiles:
    perf record -c 1000003 -b -e br_inst_retired.near_taken:upp,br_misp_retired.all_branches:upp ...
    llvm-profgen --output execution.prof --perf-event=br_inst_retired.near_taken:upp ...
    llvm-profgen --leading-ip-only --output unpredictable.prof --perf-event=br_misp_retired.all_branches:upp ...
These additions are in support of more general HWPGO[^1], allowing
feedback from a wider range of hardware events.
[^1]: https://llvm.org/devmtg/2024-04/slides/TechnicalTalks/Xiao-EnablingHW-BasedPGO.pdf
---------
Co-authored-by: Tim Creech <tcreech@tcreech.com>
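The difference between the singular and plural spellings described above can be sketched as follows. This is only an illustration of the stated semantics, not the tool's option-parsing code; the function name is hypothetical and the comma-containing event name is an illustrative raw-event string.

```python
def collect_events(perf_event=(), perf_events=()):
    """Build the set of selected event names.

    `perf_event`  : values given via the singular spelling; each value is
                    added verbatim, so it may itself contain commas.
    `perf_events` : values given via the plural spelling; each value is a
                    comma-separated list and is split on commas.
    """
    selected = set()
    for name in perf_event:
        selected.add(name)
    for csv in perf_events:
        selected.update(part for part in csv.split(",") if part)
    return selected


events = collect_events(
    perf_event=["cpu/event=0xc4,umask=0x20/upp"],  # illustrative name with a comma
    perf_events=["br_inst_retired.near_taken:upp,br_misp_retired.all_branches:upp"],
)
```

Keeping a spelling that never splits its argument is what makes names containing commas expressible at all.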
|
Revision tags: llvmorg-18.1.8 |
|
#
d4a01549 |
| 13-Jun-2024 |
Jay Foad <jay.foad@amd.com> |
[llvm-project] Fix typo "seperate" (#95373)
|
#
2fa6eaf9 |
| 13-Jun-2024 |
xur-llvm <59886942+xur-llvm@users.noreply.github.com> |
[llvm-profgen] Add support for Linux kenrel profile (#92831)
Add support for handling Linux kernel perf files. The functionality is
behind the `-kernel` option. Currently only the main kernel (in vmlinux)
is handled; kernel modules are not.
---------
Co-authored-by: Han Shen <shenhan@google.com>
|
Revision tags: llvmorg-18.1.7 |
|
#
8f5a2325 |
| 24-May-2024 |
Haohai Wen <haohai.wen@intel.com> |
[llvm-profgen] Trim tail CR+LF for LBR record line (#93210)
On Windows, the perf script generated by SEP contains CR+LF at the end of
each LBR record line. The trailing '\r' is treated as an LBR record when
running llvm-profgen on Linux, which then generates a warning.
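The failure mode and the fix can be sketched in a few lines. The function names are hypothetical and the `source/target` record shape is simplified; the point is only that splitting on spaces without trimming leaves the line terminator behind as a bogus token.

```python
def split_lbr_records_naive(line):
    """Split on spaces without trimming; a trailing CR/LF becomes a token."""
    return [tok for tok in line.split(" ") if tok]


def split_lbr_records(line):
    """Trim trailing CR and LF first, then split into LBR records."""
    return [tok for tok in line.rstrip("\r\n").split(" ") if tok]


raw = "0x0c74505f/0x368b99a0 0x0d7b1c3f/0x0d7b02b0 \r\n"
bad = split_lbr_records_naive(raw)   # last token is the line terminator
good = split_lbr_records(raw)        # only the two real records remain
```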
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
3f7f446d |
| 11-Apr-2024 |
Haohai Wen <haohai.wen@intel.com> |
[llvm-profgen] Remove temporary perf script files (#86668)
The temporary perf script files converted from perf data can occupy a
lot of space for large projects. This patch removes them when
llvm-profgen exits normally or receives a signal.
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2 |
|
#
8c03f400 |
| 15-Mar-2024 |
Haohai Wen <haohai.wen@intel.com> |
[llvm-profgen] Support COFF binary (#83972)
Intel VTune/SEP supports collecting LBR on Windows and generates a
perf-script file in the same format as Linux perf script. This patch
teaches llvm-profgen to disassemble COFF binaries so that we can do
sampling-based PGO on Windows.
|
Revision tags: llvmorg-18.1.1 |
|
#
8466ab98 |
| 29-Feb-2024 |
Matthias Braun <matze@braunis.de> |
llvm-profgen: Fix race condition (#83489)
Fix a race condition when multiple instances of `llvm-profgen` read from
the same inputs.
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
586ecdf2 |
| 12-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
#
345fd0c1 |
| 10-Apr-2023 |
Hongtao Yu <hoy@fb.com> |
[FS-AFDO] Generate pseudo-probe-based profiles with FS-discriminators.
This change enables generating pseudo-probe-based FS-AFDO profiles. It builds straightforwardly on the previous change {D147651} by injecting FS-discriminators at various profile generation spots.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D147957
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
da2f5d0a |
| 14-Dec-2022 |
Fangrui Song <i@maskray.me> |
[tools] llvm::Optional => std::optional
|
#
b4482f7c |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[tools] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
#
e748db0f |
| 01-Dec-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Support: Convert Program APIs to std::optional
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3 |
|
#
46765248 |
| 14-Oct-2022 |
wlei <wlei@fb.com> |
[llvm-profgen] Fix inconsistent loading address issues
This fixes two issues related to the loading address:
1) When multiple MMAP events occur with different loading addresses, only the first MMAP was used as the base address, so all perf addresses after it used the wrong base address.
2) For pseudo-probe profiles, the address is always based on the preferred loading address. If the base address is not equal to the preferred loading address, the pseudo probe address query will be wrong.
Solution: Instead of lazily converting addresses to offsets, all addresses are now converted at parsing time so that they are based on the preferred loading address. No "offset" is used in the profile generator any more.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D126827
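The conversion described in the solution amounts to rebasing each runtime address from the base of the mapping it belongs to onto the binary's preferred loading address. A minimal sketch, with hypothetical names and made-up addresses:

```python
def to_preferred_address(runtime_addr, mmap_base, preferred_base):
    """Rebase a runtime address onto the preferred loading address.

    `mmap_base` is the load address reported by the MMAP event that
    covers `runtime_addr`; `preferred_base` is the address the binary
    asks to be loaded at.
    """
    return runtime_addr - mmap_base + preferred_base


PREFERRED_BASE = 0x400000  # made-up value for the example

# The same code location observed under two different mappings ends up
# at the same canonical address once each sample uses its own base.
first = to_preferred_address(0x7F0000201000, 0x7F0000200000, PREFERRED_BASE)
second = to_preferred_address(0x7F1111101000, 0x7F1111100000, PREFERRED_BASE)
```

Using the first mapping's base for the second sample instead would yield an address far outside the binary, which is the first issue listed above.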
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0 |
|
#
89f14332 |
| 03-Sep-2022 |
Kazu Hirata <kazu@google.com> |
Use llvm::lower_bound (NFC)
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2 |
|
#
1b212d10 |
| 08-Aug-2022 |
wlei <wlei@fb.com> |
[llvm-profgen] Fix perf script parsing issues
Fix two perf script parsing issues:
1) Redirect the error message to a new file (an error message mixed into the perf script could corrupt the MMAP event line and cause a parsing failure).
2) Change the MMAP parsing error to a warning, since the perf script can still be parsed using the preferred address as the base address.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D131449
|
Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
#
b62e3a73 |
| 16-Jun-2022 |
Corentin Jabot <corentinjabot@gmail.com> |
Replace to_hexString by touhexstr [NFC]
LLVM had two methods to convert a number to a hex string; this removes one of them.
Differential Revision: https://reviews.llvm.org/D127958
|
Revision tags: llvmorg-14.0.5 |
|
#
d86a206f |
| 05-Jun-2022 |
Fangrui Song <i@maskray.me> |
Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options
|
#
557efc9a |
| 04-Jun-2022 |
Fangrui Song <i@maskray.me> |
[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded.
Also remove cl::init(false) while touching the lines.
|
Revision tags: llvmorg-14.0.4 |
|
#
9f732af5 |
| 12-May-2022 |
Hongtao Yu <hoy@fb.com> |
[llvm-profgen] Filter out oversized LBR ranges.
As a follow-up to {D123271}, LBR ranges that are too big should also be considered invalid.
For example, the last two pairs in the following trace form a range [0x0d7b02b0, 0x368ba706] that covers a ton of functions in the binary. Such an oversized range should also be ignored.
0x0c74505f/0x368b99a0 **0x368ba706**/0x0c745040 0x0d7b1c3f/**0x0d7b02b0**
Add a defensive check that filters out such ranges, based on the rule that a valid range should not cross an unconditional branch (call, return, or unconditional jmp).
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D125448
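The defensive check can be sketched as below: a linear range is rejected when an unconditional branch lies strictly before its end, since execution could not have fallen through it. This is only an illustration of the rule stated above; the function name, the sorted-list representation, and the branch address 0x0d7b0300 are assumptions made for the example.

```python
import bisect


def range_is_valid(start, end, unconditional_branches):
    """Check whether [start, end] can be a fall-through execution range.

    `unconditional_branches` is a sorted list of addresses of calls,
    returns, and unconditional jumps. The range end is the source of the
    next LBR branch, so a branch located exactly at `end` is allowed.
    """
    if start > end:
        return False
    i = bisect.bisect_left(unconditional_branches, start)
    # Valid only if no unconditional branch falls in [start, end).
    return i == len(unconditional_branches) or unconditional_branches[i] >= end


branches = [0x0D7B0300, 0x368BA706]  # 0x0d7b0300 is a made-up call site
oversized = range_is_valid(0x0D7B02B0, 0x368BA706, branches)
plausible = range_is_valid(0x0D7B02B0, 0x0D7B0300, branches)
```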
|
Revision tags: llvmorg-14.0.3 |
|
#
e36786d1 |
| 28-Apr-2022 |
Hongtao Yu <hoy@fb.com> |
[CSSPGO] Rename ProfileIsCSNested and ProfileIsCSFlat
To be more clear and definitive, I'm renaming `ProfileIsCSFlat` back to `ProfileIsCS` which stands for full context-sensitive flat profiles. `ProfileIsCSNested` is now renamed to `ProfileIsPreInlined` and is extended to be applicable for CS flat profiles too. More specifically, `ProfileIsPreInlined` is for any kind of profiles (flat or nested) that contain 'ShouldBeInlined' contexts. The flag is encoded in the profile summary section for extbinary profiles and is computed on-the-fly for text profiles.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D122602
|
Revision tags: llvmorg-14.0.2 |
|
#
bfcb2c11 |
| 24-Apr-2022 |
wlei <wlei@fb.com> |
[llvm-profgen] Decouple artificial branch from LBR parser and fix external address related issues
This patch fixes two issues affecting CS and non-CS profiles: 1) for external-call-internal, the head samples of the internal function should be recorded; 2) avoid ignoring LBRs after meeting the interrupt branch for CS profiles.
The LBR parser is shared between CS and non-CS, and we found that handling artificial branches inside it is error-prone. Since artificial branches are mainly used for CS profile unwinding, this patch simplifies the LBR parser by decoupling the artificial branch code from it: the concept of an artificial branch is removed and split into two transitional branches (internal-to-external and external-to-internal). All processing of external branches is then left to the unwinder.
Specifically for the unwinder, recall that we introduced the external frame in https://reviews.llvm.org/D115550. We can treat an external address as a regular address and reuse the current unwind functions (unwindCall, unwindReturn). In the normal case, the external frame will match an external LBR and will be filtered out by `unwindLinear` without losing any context.
The data also shows that interrupt or standalone LBR patterns (the unpaired case) do exist; we handle them by clearing the call stack and continuing to unwind. Here we leverage the check in `unwindLinear`: because a standalone LBR, whatever its type, has no counterpart to pair with, it will eventually produce a wrong linear range such as [external, internal] or [internal, external]. The state is then set to invalid there.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D118177
|
#
17f6cba3 |
| 15-Apr-2022 |
Wenlei He <aktoon@gmail.com> |
[llvm-profgen] Add process filter for perf reader
For profile generation, we need to filter raw perf samples for the binary of interest. Sometimes the binary name alone isn't enough, as several binaries with the same name can be running on the system. This change adds a process ID filter to allow users to further disambiguate the input raw samples.
Differential Revision: https://reviews.llvm.org/D123869
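The effect of combining a binary-name filter with a process ID filter can be sketched as follows. The `Sample` record and the function name are hypothetical and do not reflect how the perf reader represents samples.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    pid: int      # process that produced the sample
    binary: str   # name of the binary the sample was attributed to


def filter_samples(samples, binary_name, pids=None):
    """Keep samples for `binary_name`, optionally restricted to `pids`.

    With `pids=None` only the name is checked, which cannot tell apart
    two processes running binaries that share a name.
    """
    return [
        s for s in samples
        if s.binary == binary_name and (pids is None or s.pid in pids)
    ]


samples = [Sample(101, "server"), Sample(202, "server"), Sample(303, "client")]
by_name = filter_samples(samples, "server")           # both "server" processes
by_pid = filter_samples(samples, "server", {202})     # only the one of interest
```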
|