History log of /llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp (Results 1 – 25 of 88)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 6c3c90b5 07-Jan-2025 Lei Wang <wlei@fb.com>

[CSSPGO]Add a flag to limit unsymbolized context depth (#121531)

Adding a new flag(`--csprof-max-unsymbolized-context-depth`) to only
limit unsymbolized context depth. Currently,`--csprof-max-conte

[CSSPGO]Add a flag to limit unsymbolized context depth (#121531)

Adding a new flag(`--csprof-max-unsymbolized-context-depth`) to only
limit unsymbolized context depth. Currently,`--csprof-max-context-depth`
applies to both symbolized and unsymbolized profile context, there are
scenarios where `--csprof-max-context-depth` may not be flexible enough,
e.g. if we want to limit the context but still keep all the inlinings
from the leaf frame, we could set the value
csprof-max-unsymbolized-context-depth >= 1.

show more ...


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2
# 23609a38 02-Aug-2024 Tim Creech <timothy.m.creech@intel.com>

[llvm-profgen] Revert #99826 and #99026 (#100147)

Revert #99826 and #99026 to allow for additional input.


Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init
# 01d78364 22-Jul-2024 Tim Creech <timothy.m.creech@intel.com>

[llvm-profgen] Add --sample-period to estimate absolute counts (#99826)

Without `--sample-period`, no assumptions are made about perf profile
sample frequencies. This is useful for comparing relati

[llvm-profgen] Add --sample-period to estimate absolute counts (#99826)

Without `--sample-period`, no assumptions are made about perf profile
sample frequencies. This is useful for comparing relative hotness of
different program locations within the same profile.

With `--sample-period`, LBR- and IP-based profile hit counts are
adjusted to estimate the absolute total event count for each program
location. This makes it reasonable to compare hit counts between
different profiles, e.g., between two LBR-based execution frequency
profiles with different sampling periods or between LBR-based execution
frequency profiles and IP-based branch mispredict profiles.

This functionality is in support of HWPGO[^1], which aims to enable
feedback from a wider range of hardware events.

[^1]:
https://llvm.org/devmtg/2024-04/slides/TechnicalTalks/Xiao-EnablingHW-BasedPGO.pdf

show more ...


# 0caf0c93 21-Jul-2024 Tim Creech <timothy.m.creech@intel.com>

[llvm-profgen] Support creating profiles of arbitrary events (#99026)

This change introduces two options which may be used to create profiles
of arbitrary PMU events.

1. `--leading-ip-only` prov

[llvm-profgen] Support creating profiles of arbitrary events (#99026)

This change introduces two options which may be used to create profiles
of arbitrary PMU events.

1. `--leading-ip-only` provides a simple sample-IP-based profile mode.
This is not useful for building a profile of execution frequency, but it
is useful for building new types of profiles.

For example, to build a profile of unpredictable branches:

perf record -b -e branch-misses:upp -o perf.data ... llvm-profgen
--perfdata perf.data --leading-ip-only ...

2. `--perf-event=event` enables the creation of a profile concerned with
a specific event or set of events. The names given should match the
"event" field as emitted by perf-script(1).

This option has two spellings: `--perf-event` and `--perf-events`. The
plural spelling accepts a comma-separated list. The singular spelling
appends a single event name to the set of events which will be used.
This is meant to accommodate event names containing commas.

Combined, these options allow generating multiple kinds of profiles from
a single `perf record` collection. For example, to generate both
execution frequency and branch mispredict profiles:

perf record -c 1000003 -b -e
br_inst_retired.near_taken:upp,br_misp_retired.all_branches:upp ...
llvm-profgen --output execution.prof
--perf-event=br_inst_retired.near_taken:upp ...
llvm-profgen --leading-ip-only --output unpredictable.prof
--perf-event=br_misp_retired.all_branches:upp ...

These additions are in support of more general HWPGO[^1], allowing
feedback from a wider range of hardware events.

[^1]:
https://llvm.org/devmtg/2024-04/slides/TechnicalTalks/Xiao-EnablingHW-BasedPGO.pdf

---------

Co-authored-by: Tim Creech <tcreech@tcreech.com>

show more ...


Revision tags: llvmorg-18.1.8
# d4a01549 13-Jun-2024 Jay Foad <jay.foad@amd.com>

[llvm-project] Fix typo "seperate" (#95373)


# 2fa6eaf9 13-Jun-2024 xur-llvm <59886942+xur-llvm@users.noreply.github.com>

[llvm-profgen] Add support for Linux kenrel profile (#92831)

Add the support to handle Linux kernel perf files. The functionality is
under option -kernel. Note that currently only main kernel (in v

[llvm-profgen] Add support for Linux kenrel profile (#92831)

Add the support to handle Linux kernel perf files. The functionality is
under option -kernel. Note that currently only main kernel (in vmlinux)
is handled: kernel modules are not handled.

---------

Co-authored-by: Han Shen <shenhan@google.com>

show more ...


Revision tags: llvmorg-18.1.7
# 8f5a2325 24-May-2024 Haohai Wen <haohai.wen@intel.com>

[llvm-profgen] Trim tail CR+LF for LBR record line (#93210)

On Windows, perfscript generated by sep contains CR+LF at the end of
LBR records line. This '\r' will be treated as a LBR record when run

[llvm-profgen] Trim tail CR+LF for LBR record line (#93210)

On Windows, perfscript generated by sep contains CR+LF at the end of
LBR records line. This '\r' will be treated as a LBR record when running
llvm-profgen on Linux and then generate warning.

show more ...


Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4
# 3f7f446d 11-Apr-2024 Haohai Wen <haohai.wen@intel.com>

[llvm-profgen] Remove temporary perf script files (#86668)

The temporary perf script files converted from perf data will occupy
lots
of space for large project. This patch removes them when llvm-p

[llvm-profgen] Remove temporary perf script files (#86668)

The temporary perf script files converted from perf data will occupy
lots
of space for large project. This patch removes them when llvm-profgen
exits normally or receives signals.

show more ...


Revision tags: llvmorg-18.1.3, llvmorg-18.1.2
# 8c03f400 15-Mar-2024 Haohai Wen <haohai.wen@intel.com>

[llvm-profgen] Support COFF binary (#83972)

Intel Vtune/SEP has supported collecting LBR on Windows and generating
perf-script file which is same format as Linux perf script. This patch
teaches ll

[llvm-profgen] Support COFF binary (#83972)

Intel Vtune/SEP has supported collecting LBR on Windows and generating
perf-script file which is same format as Linux perf script. This patch
teaches llvm-profgen to disassemble COFF binary so that we can do
Sampling based PGO on Windows.

show more ...


Revision tags: llvmorg-18.1.1
# 8466ab98 29-Feb-2024 Matthias Braun <matze@braunis.de>

llvm-profgen: Fix race condition (#83489)

Fix race condition when multiple instances of `llvm-progen` read from
the same inputs.


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 586ecdf2 12-Dec-2023 Kazu Hirata <kazu@google.com>

[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)

This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::

[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)

This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# 345fd0c1 10-Apr-2023 Hongtao Yu <hoy@fb.com>

[FS-AFDO] Generate pseudo-probe-based profiles with FS-discriminators.

This change enables generating pseudo-probe-based FS-AFDO profiles. The change is straightforward based-on previous change {D14

[FS-AFDO] Generate pseudo-probe-based profiles with FS-discriminators.

This change enables generating pseudo-probe-based FS-AFDO profiles. The change is straightforward based-on previous change {D147651} by just injecting FS-discriminators into various profile generation spot.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D147957

show more ...


Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# da2f5d0a 14-Dec-2022 Fangrui Song <i@maskray.me>

[tools] llvm::Optional => std::optional


# b4482f7c 03-Dec-2022 Kazu Hirata <kazu@google.com>

[tools] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of m

[tools] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


# e748db0f 01-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

Support: Convert Program APIs to std::optional


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# 46765248 14-Oct-2022 wlei <wlei@fb.com>

[llvm-profgen] Fix inconsistent loading address issues

This is to fix two issues related with loading address:

1) When multiple MMAPs occur and their loading address are different, before it only u

[llvm-profgen] Fix inconsistent loading address issues

This is to fix two issues related with loading address:

1) When multiple MMAPs occur and their loading address are different, before it only used the first MMap as base address, all perf address after it used the wrong base address.

2) For pseudo probe profile, the address is always based on preferred loading address. If the base address is not equal to the preferred loading address, the pseudo probe address query will be wrong.

Solution: Instead of converting the address to offset lazily, right now all the address after parsing are converted on the fly based on preferred loading address in the parsing time. There is no "offset" used in profile generator any more.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D126827

show more ...


Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0
# 89f14332 03-Sep-2022 Kazu Hirata <kazu@google.com>

Use llvm::lower_bound (NFC)


Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2
# 1b212d10 08-Aug-2022 wlei <wlei@fb.com>

[llvm-profgen] Fix perf script parsing issues

Fix two perf script parsing issues:

1) Redirect the error message to a new file. (the error message mixed in the perfscript could screw up the MMAP eve

[llvm-profgen] Fix perf script parsing issues

Fix two perf script parsing issues:

1) Redirect the error message to a new file. (the error message mixed in the perfscript could screw up the MMAP event line and cause a parsing failure)

2) Changed the MMap parsing error message to warning since the perfscript can still be parsed using the preferred address as base address.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D131449

show more ...


Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# b62e3a73 16-Jun-2022 Corentin Jabot <corentinjabot@gmail.com>

Replace to_hexString by touhexstr [NFC]

LLVM had 2 methods to convert a number to an hexa string,
this remove one of them.

Differential Revision: https://reviews.llvm.org/D127958


Revision tags: llvmorg-14.0.5
# d86a206f 05-Jun-2022 Fangrui Song <i@maskray.me>

Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options


# 557efc9a 04-Jun-2022 Fangrui Song <i@maskray.me>

[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC

Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the err

[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC

Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.

show more ...


Revision tags: llvmorg-14.0.4
# 9f732af5 12-May-2022 Hongtao Yu <hoy@fb.com>

[llvm-profgen] Filter out oversized LBR ranges.

As a follow up to {D123271}, LBR ranges that are too big should also be considered as invalid.

For example, the last two pairs in the following trace

[llvm-profgen] Filter out oversized LBR ranges.

As a follow up to {D123271}, LBR ranges that are too big should also be considered as invalid.

For example, the last two pairs in the following trace form a range [0x0d7b02b0, 0x368ba706] that covers a ton of functions in the binary. Such oversized range should also be ignored.

0x0c74505f/0x368b99a0 **0x368ba706**/0x0c745040 0x0d7b1c3f/**0x0d7b02b0**

Add a defensive check to filter out those ranges based that the valid range should not cross the unconditional branch(Call, return, unconditional jmp).

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D125448

show more ...


Revision tags: llvmorg-14.0.3
# e36786d1 28-Apr-2022 Hongtao Yu <hoy@fb.com>

[CSSPGO] Rename ProfileIsCSNested and ProfileIsCSFlat

To be more clear and definitive, I'm renaming `ProfileIsCSFlat` back to `ProfileIsCS` which stands for full context-sensitive flat profiles. `P

[CSSPGO] Rename ProfileIsCSNested and ProfileIsCSFlat

To be more clear and definitive, I'm renaming `ProfileIsCSFlat` back to `ProfileIsCS` which stands for full context-sensitive flat profiles. `ProfileIsCSNested` is now renamed to `ProfileIsPreInlined` and is extended to be applicable for CS flat profiles too. More specifically, `ProfileIsPreInlined` is for any kind of profiles (flat or nested) that contain 'ShouldBeInlined' contexts. The flag is encoded in the profile summary section for extbinary profiles and is computed on-the-fly for text profiles.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D122602

show more ...


Revision tags: llvmorg-14.0.2
# bfcb2c11 24-Apr-2022 wlei <wlei@fb.com>

[llvm-profgen] Decouple artificial branch from LBR parser and fix external address related issues

This patch is fixing two issues for both CS and non-CS.
1) For external-call-internal, the head samp

[llvm-profgen] Decouple artificial branch from LBR parser and fix external address related issues

This patch is fixing two issues for both CS and non-CS.
1) For external-call-internal, the head samples of the the internal function should be recorded.
2) avoid ignoring LBR after meeting the interrupt branch for CS profile

LBR parser is shared between CS and non-CS, we found it's error-prone while dealing with artificial branch inside LBR parser. Since artificial branch is mainly used for CS profile unwinding, this patch tries to simplify LBR parser by decoupling artificial branch code from it, the concept of artificial branch is removed and split into two transitional branches(internal-to-external, external-to-internal). Then we leave all the processing of external branch to unwinder.

Specifically for unwinder, remembering that we introduce external frame in https://reviews.llvm.org/D115550. We can just take external address as a regular address and reuse current unwind function(unwindCall, unwindReturn). For a normal case, the external frame will match an external LBR, and it will be filtered out by `unwindLinear` without losing any context.

The data also shows that the interrupt or standalone LBR pattern(unpaired case) does exist, we choose to handle it by clearing the call stack and keeping unwinding. Here we leverage checking in `unwindLinear`, because a standalone LBR, no matter its type, since it doesn’t have other part to pair, it will eventually cause a wrong linear range, like [external, internal], [internal, external]. Then set the state to invalid there.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D118177

show more ...


# 17f6cba3 15-Apr-2022 Wenlei He <aktoon@gmail.com>

[llvm-profgen] Add process filter for perf reader

For profile generation, we need to filter raw perf samples for binary of interest. Sometimes binary name along isn't enough as we can have binary of

[llvm-profgen] Add process filter for perf reader

For profile generation, we need to filter raw perf samples for binary of interest. Sometimes binary name along isn't enough as we can have binary of the same name running in the system. This change adds a process id filter to allow users to further disambiguiate the input raw samples.

Differential Revision: https://reviews.llvm.org/D123869

show more ...


1234