#
88fbc4d3 |
| 06-Dec-2023 |
Teresa Johnson <tejohnson@google.com> |
[ThinLTO] Add tail call flag to call edges in summary (#74043)
This adds support for a HasTailCall flag on function call edges in the
ThinLTO summary. It is intended for use in aiding discovery of
[ThinLTO] Add tail call flag to call edges in summary (#74043)
This adds support for a HasTailCall flag on function call edges in the
ThinLTO summary. It is intended for use in aiding discovery of missing
frames from tail calls in profiled call stacks for MemProf of profiled
binaries that did not disable tail call elimination. A follow on change
will add the use of this new flag during MemProf context disambiguation.
The new flag is encoded in the bitcode along with either the hotness
flag from the profile, or the relative block frequency under the
-write-relbf-to-summary flag when there is no profile data.
Because we now will always have some additional call edge information, I
have removed the non-profile function summary record format, and we
simply encode the tail call flag along with a hotness type of none when
there is no profile information or relative block frequency. The change
of record format and name caused most of the test case changes.
I have added explicit testing of generation of the new tail call flag
into the bitcode and IR assembly format as part of the changes to
llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also
added round trip testing through assembly and bitcode to
llvm/test/Assembler/thinlto-summary.ll.
show more ...
|
#
a8874cf5 |
| 05-Dec-2023 |
hev <wangrui@loongson.cn> |
[llvm][IR] Add per-global code model attribute (#72077)
This adds a per-global code model attribute, which can override the
target's code model to access global variables.
Suggested-by: Arthur E
[llvm][IR] Add per-global code model attribute (#72077)
This adds a per-global code model attribute, which can override the
target's code model to access global variables.
Suggested-by: Arthur Eubanks <aeubanks@google.com>
Link: https://discourse.llvm.org/t/how-to-best-implement-code-model-overriding-for-certain-values/71816
Link: https://discourse.llvm.org/t/rfc-add-per-global-code-model-attribute/74944
show more ...
|
Revision tags: llvmorg-17.0.6 |
|
#
d9962c40 |
| 24-Nov-2023 |
Craig Topper <craig.topper@sifive.com> |
[IR] Add disjoint flag for Or instructions. (#72583)
This flag indicates that every bit is known to be zero in at least one
of the inputs. This allows the Or to be treated as an Add since there is
[IR] Add disjoint flag for Or instructions. (#72583)
This flag indicates that every bit is known to be zero in at least one
of the inputs. This allows the Or to be treated as an Add since there is
no possibility of a carry from any bit.
If the flag is present and this property does not hold, the result is
poison.
This makes it easier to reverse the InstCombine transform that turns Add
into Or.
This is inspired by a comment here
https://github.com/llvm/llvm-project/pull/71955#discussion_r1391614578
Discourse thread
https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036
show more ...
|
#
f99a0200 |
| 17-Nov-2023 |
Stephen Tozer <Stephen.Tozer@Sony.com> |
Reapply "[DebugInfo] Make DIArgList inherit from Metadata and always unique"
This reverts commit 0fd5dc94380d5fe666dc6c603b4bb782cef743e7.
The original commit removed DIArgLists from being in an MD
Reapply "[DebugInfo] Make DIArgList inherit from Metadata and always unique"
This reverts commit 0fd5dc94380d5fe666dc6c603b4bb782cef743e7.
The original commit removed DIArgLists from being in an MDNode map, but did not insert a new `delete` in the LLVMContextImpl destructor. This reapply adds that call to delete, preventing a memory leak.
show more ...
|
#
0fd5dc94 |
| 17-Nov-2023 |
Stephen Tozer <stephen.tozer@sony.com> |
Revert "[DebugInfo] Make DIArgList inherit from Metadata and always unique" (#72682)
Reverts llvm/llvm-project#72147
Reverted due to buildbot failure:
https://lab.llvm.org/buildbot/#/builders/5/
Revert "[DebugInfo] Make DIArgList inherit from Metadata and always unique" (#72682)
Reverts llvm/llvm-project#72147
Reverted due to buildbot failure:
https://lab.llvm.org/buildbot/#/builders/5/builds/38410
show more ...
|
#
e77af7e1 |
| 17-Nov-2023 |
Stephen Tozer <stephen.tozer@sony.com> |
[DebugInfo] Make DIArgList inherit from Metadata and always unique (#72147)
This patch changes the `DIArgList` class's inheritance from `MDNode` to
`Metadata, ReplaceableMetadataImpl`, and ensures
[DebugInfo] Make DIArgList inherit from Metadata and always unique (#72147)
This patch changes the `DIArgList` class's inheritance from `MDNode` to
`Metadata, ReplaceableMetadataImpl`, and ensures that it is always
unique, i.e. a distinct DIArgList should never be produced.
This should not result in any changes to IR or bitcode parsing and
printing, as the format for DIArgList is unchanged, and the order in which it
appears should also be identical. As a minor note, this patch also fixes
a gap in the verifier, where the ValueAsMetadata operands to a DIArgList
would not be visited.
show more ...
|
Revision tags: llvmorg-17.0.5 |
|
#
b7b5907b |
| 09-Nov-2023 |
Chuanqi Xu <yedeng.yd@linux.alibaba.com> |
[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)
Close https://github.com/llvm/llvm-project/issues/56980.
This patch tries to introduce a light-weight optimization attri
[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)
Close https://github.com/llvm/llvm-project/issues/56980.
This patch tries to introduce a light-weight optimization attribute for
coroutines which are guaranteed to only be destroyed after it reached
the final suspend.
The rationale behind the patch is simple. See the example:
```C++
A foo() {
dtor d;
co_await something();
dtor d1;
co_await something();
dtor d2;
co_return 43;
}
```
Generally the generated .destroy function may be:
```C++
void foo.destroy(foo.Frame *frame) {
switch(frame->suspend_index()) {
case 1:
frame->d.~dtor();
break;
case 2:
frame->d.~dtor();
frame->d1.~dtor();
break;
case 3:
frame->d.~dtor();
frame->d1.~dtor();
frame->d2.~dtor();
break;
default: // coroutine completed or haven't started
break;
}
frame->promise.~promise_type();
delete frame;
}
```
Since the compiler need to be ready for all the cases that the coroutine
may be destroyed in a valid state.
However, from the user's perspective, we can understand that certain
coroutine types may only be destroyed after it reached to the final
suspend point. And we need a method to teach the compiler about this.
Then this is the patch. After the compiler recognized that the
coroutines can only be destroyed after complete, it can optimize the
above example to:
```C++
void foo.destroy(foo.Frame *frame) {
frame->promise.~promise_type();
delete frame;
}
```
I spent a lot of time experimenting and experiencing this in the
downstream. The numbers are really good. In a real-world coroutine-heavy
workload, the size of the build dir (including .o files) reduces 14%.
And the size of final libraries (excluding the .o files) reduces 8% in
Debug mode and 1% in Release mode.
show more ...
|
#
7b9d73c2 |
| 07-Nov-2023 |
Paulo Matos <pmatos@igalia.com> |
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
6b8ed787 |
| 16-Aug-2023 |
Nikita Popov <npopov@redhat.com> |
[IR] Add writable attribute
This adds a writable attribute, which in conjunction with dereferenceable(N) states that a spurious store of N bytes is introduced on function entry. This implies that th
[IR] Add writable attribute
This adds a writable attribute, which in conjunction with dereferenceable(N) states that a spurious store of N bytes is introduced on function entry. This implies that this many bytes are writable without trapping or introducing data races. See https://llvm.org/docs/Atomics.html#optimization-outside-atomic for why the second point is important.
This attribute can be added to sret arguments. I believe Rust will also be able to use it for by-value (moved) arguments. Rust likely won't be able to use it for &mut arguments (tree borrows does not appear to allow spurious stores).
In this patch the new attribute is only used by LICM scalar promotion. However, the actual motivation for this is to fix a correctness issue in call slot optimization, which needs this attribute to avoid optimization regressions.
Followup to the discussion on D157499.
Differential Revision: https://reviews.llvm.org/D158081
show more ...
|
#
ed3f06b9 |
| 30-Oct-2023 |
Nikita Popov <npopov@redhat.com> |
[IR] Add zext nneg flag (#67982)
Add an nneg flag to the zext instruction, which specifies that the
argument is non-negative. Otherwise, the result is a poison value.
The primary use-case for th
[IR] Add zext nneg flag (#67982)
Add an nneg flag to the zext instruction, which specifies that the
argument is non-negative. Otherwise, the result is a poison value.
The primary use-case for the flag is to preserve information when sext
gets replaced with zext due to range-based canonicalization. The nneg
flag allows us to convert the zext back into an sext later. This is
useful for some optimizations (e.g. a signed icmp can fold with sext but
not zext), as well as some targets (e.g. RISCV prefers sext over zext).
Discourse thread: https://discourse.llvm.org/t/rfc-add-zext-nneg-flag/73914
This patch is based on https://reviews.llvm.org/D156444 by
@Panagiotis156, with some implementation simplifications and additional
tests.
---------
Co-authored-by: Panagiotis K <karouzakispan@gmail.com>
show more ...
|
#
df3478e4 |
| 18-Oct-2023 |
Stephen Tozer <stephen.tozer@sony.com> |
[LLVM] Add new attribute `optdebug` to optimize for debugging (#66632)
This patch adds a new fn attribute, `optdebug`, that specifies that
optimizations should make decisions that prioritize debug
[LLVM] Add new attribute `optdebug` to optimize for debugging (#66632)
This patch adds a new fn attribute, `optdebug`, that specifies that
optimizations should make decisions that prioritize debug info quality,
potentially at the cost of runtime performance.
This patch does not add any functional changes triggered by this
attribute, only the attribute itself. A subsequent patch will use this
flag to disable the post-RA scheduler.
show more ...
|
#
4346aaf0 |
| 30-Sep-2023 |
Youngsuk Kim <joseph942010@gmail.com> |
[llvm] Remove uses of Type::getPointerTo() (NFC)
* Remove if its sole use is to support an unnecessary ptr-to-ptr bitcast (remove the bitcast as well) * Replace with use of other APIs.
NFC opaque
[llvm] Remove uses of Type::getPointerTo() (NFC)
* Remove if its sole use is to support an unnecessary ptr-to-ptr bitcast (remove the bitcast as well) * Replace with use of other APIs.
NFC opaque pointer cleanup effort.
show more ...
|
Revision tags: llvmorg-17.0.0-rc2 |
|
#
bbe8cd13 |
| 31-Jul-2023 |
Teresa Johnson <tejohnson@google.com> |
[LTO] Remove module id from summary index
The module paths string table mapped to both an id sequentially assigned during LTO linking, and the module hash. The former is leftover from before the mod
[LTO] Remove module id from summary index
The module paths string table mapped to both an id sequentially assigned during LTO linking, and the module hash. The former is leftover from before the module hash was added for caching and subsequently replaced use of the module id when renaming promoted symbols (to avoid affects due to link order changes). The sequentially assigned module id was not removed, however, as it was still a convenience when serializing to/from bitcode and assembly.
This patch removes the module id from this table, since it isn't strictly needed and can lead to confusion on when it is appropriate to use (e.g. see fix in D156525). It also takes a (likely not significant) amount of overhead. Where an integer module id is needed (e.g. bitcode writing), one is assigned on the fly.
There are a couple of test changes since the paths are now sorted alphanumerically when assigning ids on the fly during assembly writing, in order to ensure deterministic behavior.
Differential Revision: https://reviews.llvm.org/D156730
show more ...
|
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
3eae1bf4 |
| 14-Jul-2023 |
Nikita Popov <npopov@redhat.com> |
[llvm] Remove uses of getNonOpaquePointerElementType() (NFC)
|
#
a1ca3af3 |
| 05-Jul-2023 |
Matthew Voss <matthew.voss@sony.com> |
[llvm] A Unified LTO Bitcode Frontend
Here's a high level summary of the changes in this patch. For more information on rational, see the RFC. (https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode
[llvm] A Unified LTO Bitcode Frontend
Here's a high level summary of the changes in this patch. For more information on rational, see the RFC. (https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774).
- Add config parameter to LTO backend, specifying which LTO mode is desired when using unified LTO. - Add unified LTO flag to the summary index for efficiency. Unified LTO modules can be detected without parsing the module. - Make sure that the ModuleID is generated by incorporating more types of symbols.
Differential Revision: https://reviews.llvm.org/D123803
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
3adc6e03 |
| 19-Apr-2023 |
Teresa Johnson <tejohnson@google.com> |
[ThinLTO] Remove BlockCount for non partial sample profile builds
As pointed out in https://discourse.llvm.org/t/undeterministic-thin-index-file/69985, the block count added to distributed ThinLTO i
[ThinLTO] Remove BlockCount for non partial sample profile builds
As pointed out in https://discourse.llvm.org/t/undeterministic-thin-index-file/69985, the block count added to distributed ThinLTO index files breaks incremental builds on ThinLTO - if any linked file has a different number of BBs, then the accumulated sum placed in the index files will change, causing all ThinLTO backend compiles to be redone.
The block count is only used for scaling of partial sample profiles, and was added in D80403 for D79831.
This patch simply removes this field from the index files of non partial sample profile compiles, which is NFC on the output of the compiler.
We subsequently need to see if this can be removed for partial sample profiles without signficant performance loss, or redesigned in a way that does not destroy caching.
Differential Revision: https://reviews.llvm.org/D148746
show more ...
|
Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
|
#
bbfb13a5 |
| 06-Mar-2023 |
Nikita Popov <npopov@redhat.com> |
[ConstExpr] Remove select constant expression
This removes the select constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. Uses of this expression
[ConstExpr] Remove select constant expression
This removes the select constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. Uses of this expressions have already been removed in advance, so this just removes related infrastructure and updates tests.
Differential Revision: https://reviews.llvm.org/D145382
show more ...
|
#
01dacc41 |
| 27-Feb-2023 |
Arthur Eubanks <aeubanks@google.com> |
[Bitcode] Remove typed pointer abbreviation
Since typed pointers are deprecated.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144901
|
Revision tags: llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
5da67449 |
| 06-Dec-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Add nofpclass parameter attribute
This carries a bitmask indicating forbidden floating-point value kinds in the argument or return value. This will enable interprocedural -ffinite-math-only opti
IR: Add nofpclass parameter attribute
This carries a bitmask indicating forbidden floating-point value kinds in the argument or return value. This will enable interprocedural -ffinite-math-only optimizations. This is primarily to cover the no-nans and no-infinities cases, but also covers the other floating point classes for free. Textually, this provides a number of names corresponding to bits in FPClassTest, e.g.
call nofpclass(nan inf) @must_be_finite() call nofpclass(snan) @cannot_be_snan()
This is more expressive than the existing nnan and ninf fast math flags. As an added bonus, you can represent fun things like nanf:
declare nofpclass(inf zero sub norm) float @only_nans()
Compared to nnan/ninf: - Can be applied to individual call operands as well as the return value - Can distinguish signaling and quiet nans - Distinguishes the sign of infinities - Can be safely propagated since it doesn't imply anything about other operands. - Does not apply to FP instructions; it's not a flag
This is one step closer to being able to retire "no-nans-fp-math" and "no-infs-fp-math". The one remaining situation where we have no way to represent no-nans/infs is for loads (if we wanted to solve this we could introduce !nofpclass metadata, following along with noundef/!noundef).
This is to help simplify the GPU builtin math library distribution. Currently the library code has explicit finite math only checks, read from global constants the compiler driver needs to set based on the compiler flags during linking. We end up having to internalize the library into each translation unit in case different linked modules have different math flags. By propagating known-not-nan and known-not-infinity information, we can automatically prune the edge case handling in most functions if the function is only reached from fast math uses.
show more ...
|
#
62c7f035 |
| 07-Feb-2023 |
Archibald Elliott <archibald.elliott@arm.com> |
[NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5 |
|
#
778cf543 |
| 03-Nov-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.
AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use ta
IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.
AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use target intrinsics for these, but those do no carry the ordering and syncscope. Add these to atomicrmw so we can carry these and benefit from the regular legalization processes.
show more ...
|
#
e6b02214 |
| 20-Dec-2022 |
Joshua Cranmer <joshua.cranmer@intel.com> |
[IR] Add a target extension type to LLVM.
Target-extension types represent types that need to be preserved through optimization, but otherwise are not introspectable by target-independent optimizati
[IR] Add a target extension type to LLVM.
Target-extension types represent types that need to be preserved through optimization, but otherwise are not introspectable by target-independent optimizations. This patch doesn't add any uses of these types by an existing backend, it only provides basic infrastructure such that these types would work correctly.
Reviewed By: nikic, barannikov88
Differential Revision: https://reviews.llvm.org/D135202
show more ...
|
#
f7dffc28 |
| 10-Dec-2022 |
Kazu Hirata <kazu@google.com> |
Don't include None.h (NFC)
I've converted all known uses of None to std::nullopt, so we no longer need to include None.h.
This is part of an effort to migrate from llvm::Optional to std::optional:
Don't include None.h (NFC)
I've converted all known uses of None to std::nullopt, so we no longer need to include None.h.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
#
c9cb4fc7 |
| 01-Dec-2022 |
Jonas Hahnfeld <jonas.hahnfeld@cern.ch> |
[DebugInfo] Store optional DIFile::Source as pointer
getCanonicalMDString() also returns a nullptr for empty strings, which tripped over the getSource() method. Solve the ambiguity of no source vers
[DebugInfo] Store optional DIFile::Source as pointer
getCanonicalMDString() also returns a nullptr for empty strings, which tripped over the getSource() method. Solve the ambiguity of no source versus an optional containing a nullptr by simply storing a pointer.
Differential Revision: https://reviews.llvm.org/D138658
show more ...
|
#
49e75ebd |
| 07-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
[Bitcode(Reader|Writer)] Convert Optional to std::optional
|