Revision tags: llvmorg-21-init |
|
#
8e702735 |
| 24-Jan-2025 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; however, to ensure some type safety, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
4d12a143 |
| 06-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[Instrumentation] Remove unused includes (NFC) (#115117)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3 |
|
#
b4fcaa13 |
| 22-Oct-2024 |
Michael O'Farrell <micpof@gmail.com> |
[PGO][SampledInstr] Correct off by 1s and allow 100% sampling (#113350)
This corrects a couple off by ones related to the sampling of
**instrumented** counters, and enables setting 100% rates for burst
sampling (burst duration = period).
Off by ones:
Prior to this change it was impossible to set a period of 65535 because
this was converted to fast sampling, which rolls over at USHRT_MAX + 1
(65536). Similarly, the burst durations would collect burst duration + 1
counts because they used a ULE comparison.
100% sampling:
Although this is not useful for a productionized use case, it does allow
for more deterministic testing with the sampling checks in place. After
all the off by ones are fixed, allowing for 100% sampling is a matter of
letting burst duration = period.
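The two off-by-ones can be illustrated with a small model. This is only a sketch under stated assumptions: `recordedEvents` and `wrapPeriod16` are hypothetical names, and the loop models the sampling check in plain C++ rather than reproducing the IR that the pass emits.

```cpp
#include <cstdint>

// Hypothetical model (not the pass's generated code): counts how many of
// `Events` events are recorded when every period of `Period` events starts
// with a burst of `Burst` events. `Inclusive` selects an ULE-style check
// (Pos <= Burst) instead of an ULT-style check (Pos < Burst).
constexpr unsigned recordedEvents(unsigned Events, unsigned Period,
                                  unsigned Burst, bool Inclusive) {
  unsigned Recorded = 0;
  unsigned Pos = 0; // position inside the current period
  for (unsigned I = 0; I < Events; ++I) {
    bool InBurst = Inclusive ? (Pos <= Burst) : (Pos < Burst);
    if (InBurst)
      ++Recorded;
    if (++Pos == Period)
      Pos = 0; // start the next period
  }
  return Recorded;
}

// Counts the increments a 16-bit counter needs to wrap back to zero.
// The result is USHRT_MAX + 1, which is why a wrapping 16-bit counter
// cannot express a period of exactly 65535.
constexpr unsigned wrapPeriod16() {
  std::uint16_t C = 0;
  unsigned Steps = 0;
  do {
    ++C;
    ++Steps;
  } while (C != 0);
  return Steps;
}
```

With a period of 10 and a burst of 3, the inclusive check records 4 events per period while the strict check records 3. Setting the burst equal to the period with the strict check records every event, which corresponds to the 100% sampling case.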
|
#
6924fc03 |
| 16-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[LLVM] Add `Intrinsic::getDeclarationIfExists` (#112428)
Add `Intrinsic::getDeclarationIfExists` to lookup an existing
declaration of an intrinsic in a `Module`.
|
Revision tags: llvmorg-19.1.2 |
|
#
d4efc3e0 |
| 14-Oct-2024 |
Yuta Saito <kateinoigakukun@gmail.com> |
[Coverage][WebAssembly] Add initial support for WebAssembly/WASI (#111332)
Currently, WebAssembly/WASI target does not provide direct support for
code coverage.
This patch set fixes several issues to unlock the feature. The main
changes are:
1. Port `compiler-rt/lib/profile` to WebAssembly/WASI.
2. Adjust profile metadata sections for Wasm object file format.
- [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections
instead of data segments.
- [lld] Align the interval space of custom sections at link time.
- [llvm-cov] Copy misaligned custom section data if the start address is
not aligned.
- [llvm-cov] Read `__llvm_prf_names` from data segments
3. [clang] Link with profile runtime libraries if requested
See each commit message for more details and rationale.
This is part of the effort to add code coverage support in Wasm target
of Swift toolchain.
|
#
6c331e50 |
| 03-Oct-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
[MC/DC] Rework tvbitmap.update to get rid of the inlined function (#110792)
Per the discussion in #102542, it is safe to insert BBs under
`lowerIntrinsics()` since #69535 has made it tolerant of modifying BBs.
So, I can get rid of using the inlined function `rmw_or`, introduced in
#96040.
|
Revision tags: llvmorg-19.1.1 |
|
#
e03f4271 |
| 19-Sep-2024 |
Jay Foad <jay.foad@amd.com> |
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
|
Revision tags: llvmorg-19.1.0 |
|
#
2ae968a0 |
| 16-Sep-2024 |
Antonio Frighetto <10052132+antoniofrighetto@users.noreply.github.com> |
[Instrumentation] Move out to Utils (NFC) (#108532)
Utility functions have been moved out to Utils. Minor opportunity to
drop the header where not needed.
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
fde2d23e |
| 22-Aug-2024 |
Ethan Luis McDonough <ethanluismcdonough@gmail.com> |
[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587) (#102691)
This pull request is a revised version of #76587. This pull request
fixes some build issues that were present in the previous version of
this change.
> This pull request is the first part of an ongoing effort to extend
PGO instrumentation to GPU device code. This PR makes the following
changes:
>
> - Adds blank registration functions to device RTL
> - Gives PGO globals protected visibility when targeting a supported
GPU
> - Handles any addrspace casts for PGO calls
> - Implements PGO global extraction in GPU plugins (currently only
dumps info)
>
> These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.
|
Revision tags: llvmorg-19.1.0-rc3 |
|
#
f5b81aa6 |
| 16-Aug-2024 |
gulfemsavrun <gulfem@google.com> |
[InstrProf] Support conditional counter updates (#102542)
This patch adds support for conditional counter updates in single byte
counters mode to reduce the write contention by first checking whether
the counter is set before overwriting it.
---------
Co-authored-by: Juan Manuel Martinez Caamaño <jmartinezcaamao@gmail.com>
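A minimal sketch of the check-before-write idea follows. It assumes, for illustration only, that a single-byte counter starts at 0xFF and is set to 0x00 once its block has run; `markCovered` and `storesFor` are hypothetical helpers, not functions from the pass.

```cpp
#include <cstdint>

// Assumed encoding for this sketch: 0xFF means "not executed yet",
// 0x00 means "executed at least once".
constexpr std::uint8_t NotExecuted = 0xFF;
constexpr std::uint8_t Executed = 0x00;

// Conditional update: read the counter first and store only when the
// value would actually change. Returns true if a store was performed.
constexpr bool markCovered(std::uint8_t &Counter) {
  if (Counter == Executed)
    return false; // already set, so this execution only reads
  Counter = Executed;
  return true;
}

// Counts how many stores happen when the same block runs `Hits` times.
constexpr unsigned storesFor(unsigned Hits) {
  std::uint8_t Counter = NotExecuted;
  unsigned Stores = 0;
  for (unsigned I = 0; I < Hits; ++I)
    if (markCovered(Counter))
      ++Stores;
  return Stores;
}
```

In this model a block that runs 1000 times performs one store and 999 reads, whereas an unconditional update would perform 1000 stores to the same byte.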
|
Revision tags: llvmorg-19.1.0-rc2 |
|
#
d2f77eb8 |
| 31-Jul-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
[MC/DC][Coverage] Introduce "Bitmap Bias" for continuous mode (#96126)
`counter_bias` is incompatible with Bitmap. The distance between Counters
and Bitmap differs between the in-memory sections and the profraw image.
A reference to `__llvm_profile_bitmap_bias` is generated only if both
`-fcoverage-mcdc` and `-runtime-counter-relocation` are specified. Before
this patch, the implementation rejected this combination of options with:
```
Runtime counter relocation is presently not supported for MC/DC bitmaps
```
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
b1ca2a95 |
| 22-Jul-2024 |
xur-llvm <59886942+xur-llvm@users.noreply.github.com> |
[PGO] Sampled instrumentation in PGO to speed up instrumentation binary (#69535)
In comparison to non-instrumented binaries, PGO instrumentation binaries
can be significantly slower. For highly threaded programs, this slowdown
can reach 10x due to data races or false sharing within counters.
This patch incorporates sampling into the PGO instrumentation process to
enhance the speed of instrumentation binaries. The fundamental concept
is similar to the one proposed in https://reviews.llvm.org/D63949.
Three sampling modes are introduced:
1. Simple Sampling: When '-sampled-instr-burst-duration' is set to 1.
2. Fast Burst Sampling: When not using simple sampling, and
'-sampled-instr-period' is set to 65535. This is the default mode of
sampling.
3. Full Burst Sampling: When neither simple nor fast burst sampling is
used.
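The three-way selection described in the list above can be sketched as a small classifier. The enum and function names are hypothetical, and the default `FastPeriod` of 65535 is taken from this commit message only; it is a parameter here because the value tied to fast burst sampling is an option setting, not a fixed constant of the sketch.

```cpp
#include <cstdint>

enum class SamplingMode { Simple, FastBurst, FullBurst };

// Hypothetical helper mirroring the description in the list:
//  - burst duration of 1         -> simple sampling
//  - otherwise, the fast period  -> fast burst sampling
//  - anything else               -> full burst sampling
constexpr SamplingMode classify(std::uint32_t BurstDuration,
                                std::uint32_t Period,
                                std::uint32_t FastPeriod = 65535) {
  if (BurstDuration == 1)
    return SamplingMode::Simple;
  if (Period == FastPeriod)
    return SamplingMode::FastBurst;
  return SamplingMode::FullBurst;
}
```

Note that the burst-duration check comes first, so a burst duration of 1 selects simple sampling regardless of the period.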
Utilizing this sampled instrumentation significantly improves the
binary's execution speed. Measurements show up to 5x speedup with default
settings. Fast burst sampling now results in only around 20% to 30%
slowdown (compared to 8 to 10x slowdown without sampling).
Our tests show that profile quality remains good with sampling,
with edge counts typically showing more than 90% overlap.
For applications whose behavior changes due to binary speed,
sampling instrumentation can enhance performance.
Observations have shown some apps experiencing up to
a ~2% improvement in PGO.
A potential drawback of this patch is the increased binary size
and compilation time. The Sampling method in this patch does
not improve single threaded program instrumentation binary
speed.
|
#
cfc22605 |
| 20-Jul-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
InstrProf: Mark BiasLI as invariant. (#95588)
Bias doesn't change after startup.
The test is enhanced for optimized sequences and atomic ops.
|
#
2c8b912f |
| 28-Jun-2024 |
Ethan Luis McDonough <ethanluismcdonough@gmail.com> |
Revert "[PGO][OpenMP] Instrumentation for GPU devices (#76587)"
This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.
|
#
5fd2af38 |
| 28-Jun-2024 |
Ethan Luis McDonough <ethanluismcdonough@gmail.com> |
[PGO][OpenMP] Instrumentation for GPU devices (#76587)
This pull request is the first part of an ongoing effort to extend PGO
instrumentation to GPU device code. This PR makes the following changes:
- Adds blank registration functions to device RTL
- Gives PGO globals protected visibility when targeting a supported GPU
- Handles any addrspace casts for PGO calls
- Implements PGO global extraction in GPU plugins (currently only dumps
info)
These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.
|
#
b347a720 |
| 26-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
[MC/DC][Coverage] Make tvbitmapupdate capable of atomic write (#96042)
This also introduces "Test and conditional Read-Modify-Write". The flow
to `atomicrmw or` is marked as `unlikely`.
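The "test and conditional read-modify-write" pattern can be sketched in portable C++ as follows. This is an illustration under assumptions, not the IR the pass emits: `testAndSetBits` is a hypothetical name, and relaxed memory ordering is chosen here only to keep the sketch simple.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Test first, then do the atomic OR only when some requested bit is still
// clear. Returns true if the read-modify-write was actually issued.
inline bool testAndSetBits(std::atomic<std::uint8_t> &Byte,
                           std::uint8_t Mask) {
  assert(Mask != 0 && "nothing to set");
  std::uint8_t Old = Byte.load(std::memory_order_relaxed);
  if ((Old & Mask) == Mask)
    return false; // expected common case: bits already set, no write
  // Uncommon case: perform the atomic OR (the analogue of `atomicrmw or`).
  Byte.fetch_or(Mask, std::memory_order_relaxed);
  return true;
}
```

Because bits are only ever set and never cleared, a stale read can at worst cause a redundant OR, which leaves the bitmap unchanged.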
|
#
a0e1b4a2 |
| 22-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
[MC/DC][Coverage] Split out Read-modfy-Write to rmw_or(ptr,i8) (#96040)
`rmw_or` is defined as "private alwaysinline". At the moment, it contains
only a simple "Read, Or, and Write", which is the same as the
current implementation.
|
#
a5128542 |
| 19-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
InstProfiling: Give the name to profc_bias. NFC. (#95587)
|
#
139f896c |
| 18-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
InstrProfiling: Split creating Bias offset to getOrCreateBiasVar(Name). NFC. (#95692)
|
#
85a7bba7 |
| 16-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Cleanup MC/DC intrinsics for #82448 (#95496)
3rd arg of `tvbitmap.update` was made unused. Remove 3rd arg.
Sweep `condbitmap.update`, since it is no longer used.
|
Revision tags: llvmorg-18.1.8 |
|
#
71f8b441 |
| 13-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Reapply: [MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)
By storing possible test vectors instead of combinations of conditions, the restriction is dramatically relaxed.
This introduces two options to `cc1`:
* `-fmcdc-max-conditions=32767`
* `-fmcdc-max-test-vectors=2147483646`
This change makes coverage mapping, profraw, and profdata incompatible with Clang-18.
- Bitmap semantics changed. It is incompatible with previous format.
- `BitmapIdx` in `Decision` points to the end of the bitmap.
- Bitmap is packed per function.
- `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.
RFC: https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
-- Change(s) since llvmorg-19-init-14288-g7ead2d8c7e91
- Update compiler-rt/test/profile/ContinuousSyncMode/image-with-mcdc.c
|
#
b422fa6b |
| 14-Jun-2024 |
Hans Wennborg <hans@chromium.org> |
Revert "[MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)"
This broke the lit tests on Mac: https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/1096/
> By storing possible test vectors instead of combinations of conditions,
> the restriction is dramatically relaxed.
>
> This introduces two options to `cc1`:
>
> * `-fmcdc-max-conditions=32767`
> * `-fmcdc-max-test-vectors=2147483646`
>
> This change makes coverage mapping, profraw, and profdata incompatible
> with Clang-18.
>
> - Bitmap semantics changed. It is incompatible with previous format.
> - `BitmapIdx` in `Decision` points to the end of the bitmap.
> - Bitmap is packed per function.
> - `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.
>
> RFC:
> https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
This reverts commit 7ead2d8c7e9114b3f23666209a1654939987cb30.
|
#
7ead2d8c |
| 13-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
[MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)
By storing possible test vectors instead of combinations of conditions,
the restriction is dramatically relaxed.
This introduces two options to `cc1`:
* `-fmcdc-max-conditions=32767`
* `-fmcdc-max-test-vectors=2147483646`
This change makes coverage mapping, profraw, and profdata incompatible
with Clang-18.
- Bitmap semantics changed. It is incompatible with previous format.
- `BitmapIdx` in `Decision` points to the end of the bitmap.
- Bitmap is packed per function.
- `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`.
RFC:
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
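To see why counting test vectors can be much cheaper than counting condition combinations, consider a short-circuiting chain `c1 && c2 && ... && cN`. The sketch below is only an illustration of that gap, not the algorithm from the RFC; `combinations` and `andChainTestVectors` are hypothetical names.

```cpp
#include <cstdint>

// Number of raw true/false combinations of N conditions (N < 64).
constexpr std::uint64_t combinations(unsigned N) {
  return std::uint64_t{1} << N;
}

// Brute-force count of distinct executed paths through
// `c1 && c2 && ... && cN` (N <= 32). With short-circuit evaluation a path
// is fully determined by the index of the first false condition, or by N
// when every condition is true.
constexpr unsigned andChainTestVectors(unsigned N) {
  bool Seen[33] = {}; // Seen[K]: some input stops evaluating at index K
  unsigned Count = 0;
  for (std::uint64_t Bits = 0; Bits < combinations(N); ++Bits) {
    unsigned Stop = 0;
    while (Stop < N && ((Bits >> Stop) & 1))
      ++Stop; // condition Stop is true, keep evaluating
    if (!Seen[Stop]) {
      Seen[Stop] = true;
      ++Count;
    }
  }
  return Count;
}
```

For six conditions this gives 64 combinations but only 7 distinct test vectors. Other expression shapes produce different counts, so the chain is the favorable end of the range rather than a general bound.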
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
1351d178 |
| 01-Apr-2024 |
Mingming Liu <mingmingl@google.com> |
[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825)
(The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691)
* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
* This is controlled by `-enable-vtable-value-profiling` and off by default.
* When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
* Implement profile reader and writer support
* Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols.
* Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, the reader decompresses the string to show symbols of a profiled site. When used in the compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab.
* Indexed profile writer collects the list of vtable names, and stores that to index profiles.
* Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.
RFC:
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
87e7140f |
| 04-Mar-2024 |
Mingming Liu <mingmingl@google.com> |
[nfc][InstrProfiling]For comdat setting helper function, move comment closer to the code (#83757)
|