#
29482426 |
| 08-Mar-2021 |
Alina Sbirlea <asbirlea@google.com> |
Revert "[LICM] Make promotion faster"
Revert 3d8f842712d49b0767832b6e3f65df2d3f19af4e Revision triggers a miscompile sinking a store incorrectly outside a threading loop. Detected by tsan. Reverting while investigating.
Differential Revision: https://reviews.llvm.org/D89264
|
#
98994271 |
| 04-Mar-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AliasSetTracker] Remove implicit conversion AliasResult to integer.
Preparation to make AliasResult scoped enumeration.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D97973
|
Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
#
3d8f8427 |
| 11-Oct-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[LICM] Make promotion faster
Even when MemorySSA-based LICM is used, an AST is still populated for scalar promotion. As the AST has quadratic complexity, a lot of time is spent in this step despite the existing access count limit. This patch optimizes the identification of promotable stores.
The idea here is pretty simple: We're only interested in must-alias mod sets of loop invariant pointers. As such, only populate the AST with loop-invariant loads and stores (anything else is definitely not promotable) and then discard any sets which alias with any of the remaining, definitely non-promotable accesses.
If we promoted something, check whether this has made some other accesses loop invariant and thus possible promotion candidates.
This is much faster in practice, because we need to perform AA queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable) instead of O(NumTotal^2), and NumPromotable tends to be small. Additionally, promotable accesses have loop invariant pointers, for which AA is cheaper.
This has a significant positive compile-time impact. We save ~1.8% geomean on CTMark at O3, with 6% on lencod in particular and 25% on individual files.
Conceptually, this change is NFC, but may not be so in practice, because the AST is only an approximation, and can produce different results depending on the order in which accesses are added. However, there is at least no impact on the number of promotions (licm.NumPromoted) in test-suite O3 configuration with this change.
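The query-count argument above can be made concrete with a toy cost model. This is only a sketch, not code from the patch: the function names are hypothetical, and it counts one alias query per compared pair, ignoring constant factors and the access count limit.

```cpp
#include <cassert>
#include <cstdint>

// Toy cost model, not LLVM code. All names here are hypothetical.

// Old scheme: every access in the loop may be compared against every
// other access, so the bound is NumTotal^2.
inline std::uint64_t queriesAllPairs(std::uint64_t numTotal) {
  return numTotal * numTotal;
}

// New scheme: promotable accesses are compared among themselves, and each
// is then checked against the remaining non-promotable accesses, giving
// NumPromotable^2 + NumPromotable * NumNonPromotable.
inline std::uint64_t queriesPromotableOnly(std::uint64_t numTotal,
                                           std::uint64_t numPromotable) {
  assert(numPromotable <= numTotal && "promotable accesses are a subset");
  const std::uint64_t numNonPromotable = numTotal - numPromotable;
  return numPromotable * numPromotable + numPromotable * numNonPromotable;
}
```

Under this model, a loop with 250 accesses of which 5 are promotable drops from 62500 pairwise queries to 1250, which is the kind of gap the commit message describes when NumPromotable is small.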
Differential Revision: https://reviews.llvm.org/D89264
|
#
896d0e1a |
| 23-Feb-2021 |
Kazu Hirata <kazu@google.com> |
[Analysis] Use range-based for loops (NFC)
|
#
28d31320 |
| 06-Feb-2021 |
Kazu Hirata <kazu@google.com> |
[Analysis] Use range-based for loops (NFC)
|
#
121cac01 |
| 19-Jan-2021 |
Jeroen Dobbelaere <jeroen.dobbelaere@synopsys.com> |
[noalias.decl] Look through llvm.experimental.noalias.scope.decl
Just like llvm.assume, there are a lot of cases where we can just ignore llvm.experimental.noalias.scope.decl.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93042
|
#
f76e83bf |
| 30-Dec-2020 |
Kazu Hirata <kazu@google.com> |
[Analysis] Use llvm::append_range (NFC)
|
#
4df8efce |
| 17-Nov-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[AA] Split up LocationSize::unknown()
Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses.
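The difference between the two meanings can be illustrated with a small toy model. This is a sketch under stated assumptions, not the real `LocationSize` class: the enum `UnknownExtent` and the helper `mayTouch` are hypothetical names, and the model ignores the bounds of the underlying object.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; the real LocationSize in LLVM has a different
// representation. The names below are hypothetical.
enum class UnknownExtent {
  AfterPointer,         // unknown size, but only bytes at offsets >= 0
  BeforeOrAfterPointer, // unknown size, any byte of the underlying object
};

// Returns true if a byte at `offset` (relative to the base pointer) may be
// touched by an access with the given unknown extent.
inline bool mayTouch(UnknownExtent extent, std::int64_t offset) {
  switch (extent) {
  case UnknownExtent::AfterPointer:
    return offset >= 0;
  case UnknownExtent::BeforeOrAfterPointer:
    return true;
  }
  return false; // not reached for valid enum values
}
```

In this model, a byte at offset -4 is ruled out under the after-pointer reading but not under the before-or-after reading, which is exactly the ambiguity the old `LocationSize::unknown()` left open.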
The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649
|
#
f3c44569 |
| 18-Nov-2020 |
Hongtao Yu <hoy@fb.com> |
[CSSPGO] IR intrinsic for pseudo-probe block instrumentation
This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.
A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persistent. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. Since these operations are very persistent and optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple of issues:
1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality.
2. The counter atomics may not be fully cleaned up from the code stream eventually.
3. Extra work is needed for re-targeting.
We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that come with an atomic operation but does not block desired optimizations as much as possible. More specifically, the semantics associated with the new intrinsic enforce a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flag given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but it does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality.
Let's now look at an example. Given the following LLVM IR:
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %bb1, label %bb2
bb1:
  br label %bb3
bb2:
  br label %bb3
bb3:
  ret void
}
```
The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
  br i1 %cmp, label %bb1, label %bb2
bb1:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
  br label %bb3
bb2:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
  br label %bb3
bb3:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
  ret void
}
```
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D86490
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
f4ea0f98 |
| 15-Sep-2020 |
Arthur Eubanks <aeubanks@google.com> |
[NewPM] Port -print-alias-sets to NPM
Really it should be named print<alias-sets>, but for the sake of changing fewer tests, added a TODO to rename after NPM switch and test cleanup.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D87713
|
#
6d40f35c |
| 15-Sep-2020 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
AliasSetTracker.cpp - remove unnecessary includes. NFCI.
These are all directly included in AliasSetTracker.h
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
#
466f8843 |
| 18-Feb-2020 |
Jim Lin <tclin914@gmail.com> |
[NFC] Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}
|
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
05da2fe5 |
| 13-Nov-2019 |
Reid Kleckner <rnk@google.com> |
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation.
I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
  recompiles  touches  affected_files  header
  342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
  314730      234      1345            llvm/include/llvm/InitializePasses.h
  307036      118      2602            llvm/include/llvm/ADT/APInt.h
  213049      59       3611            llvm/include/llvm/Support/MathExtras.h
  170422      47       3626            llvm/include/llvm/Support/Compiler.h
  162225      45       3605            llvm/include/llvm/ADT/Optional.h
  158319      63       2513            llvm/include/llvm/ADT/Triple.h
  140322      39       3598            llvm/include/llvm/ADT/StringRef.h
  137647      59       2333            llvm/include/llvm/Support/Error.h
  131619      73       1803            llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5 |
|
#
18f5204d |
| 12-Sep-2019 |
Alina Sbirlea <asbirlea@google.com> |
[LICM/AST] Check if the AliasAny set is removed from the tracker.
Summary: Resolves PR38513. Credit to @bjope for debugging this.
Reviewers: hfinkel, uabelho, bjope
Subscribers: sanjoy.google, bjope, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67417
llvm-svn: 371752
|
Revision tags: llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2 |
|
#
6cba96ed |
| 06-Feb-2019 |
Alina Sbirlea <asbirlea@google.com> |
[LICM/MSSA] Add promotion to scalars by building an AliasSetTracker with MemorySSA.
Summary: Experimentally we found that promotion to scalars carries fewer benefits than sinking and hoisting in LICM. When using MemorySSA, we build an AliasSetTracker on demand in order to reuse the current infrastructure. We only build it if fewer than AccessCapForMSSAPromotion accesses exist in the loop, a cap that is by default set to 250. This value ensures there are no runtime regressions, and there are small compile time gains for pathological cases. A much lower value (20) was found to yield a single regression in the llvm-test-suite and much higher benefits for compile times. Conservatively we set the current cap to a high value, but we will explore lowering it when MemorySSA is enabled by default.
Reviewers: sanjoy, chandlerc
Subscribers: nemanjai, jlebar, Prazek, george.burgess.iv, jfb, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D56625
llvm-svn: 353339
|
#
910c6bef |
| 06-Feb-2019 |
Alina Sbirlea <asbirlea@google.com> |
[AliasSetTracker] Pass MustAlias to addPointer more often.
Summary: Pass the alias info to addPointer when available. Will save an alias() call for must sets when adding a known Must or May alias. [Part of a series of cleanup patches]
Reviewers: reames, mkazantsev
Subscribers: sanjoy, jlebar, llvm-commits
Differential Revision: https://reviews.llvm.org/D56613
llvm-svn: 353335
|
#
00ae46ba |
| 06-Feb-2019 |
Philip Reames <listmail@philipreames.com> |
[AliasSetTracker] Minor style tweak to avoid a variable w/two distinct live ranges [NFC]
llvm-svn: 353267
|
#
8e1d6577 |
| 28-Jan-2019 |
Alina Sbirlea <asbirlea@google.com> |
[AliasSetTracker] Cleanup more comments. [NFCI]
llvm-svn: 352416
|
#
d8c829bc |
| 28-Jan-2019 |
Alina Sbirlea <asbirlea@google.com> |
[AliasSetTracker] Cleanup comments. [NFCI]
llvm-svn: 352406
|
#
3d1d95ca |
| 28-Jan-2019 |
Alina Sbirlea <asbirlea@google.com> |
[AliasSetTracker] Update signature to aliasesPointer [NFCI].
llvm-svn: 352399
|
Revision tags: llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
|
#
363ac683 |
| 07-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
[CallSite removal] Migrate all Alias Analysis APIs to use the newly minted `CallBase` class instead of the `CallSite` wrapper.
This moves the largest interwoven collection of APIs that traffic in `CallSite`s. While a handful of these could have been migrated with a minorly more shallow migration by converting from a `CallSite` to a `CallBase`, it hardly seemed worth it. Most of the APIs needed to migrate together because of the complex interplay of AA APIs and the fact that converting from a `CallBase` to a `CallSite` isn't free in its current implementation.
Out of tree users of these APIs can fairly reliably migrate with some combination of `.getInstruction()` on the `CallSite` instance and casting the resulting pointer. The most generic form will look like `CS` -> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there is a more elegant migration. Hopefully, this migrates enough APIs for users to fully move from `CallSite` to the base class. All of the in-tree users were easily migrated in that fashion.
Thanks for the review from Saleem!
Differential Revision: https://reviews.llvm.org/D55641
llvm-svn: 350503
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
#
fd9722fb |
| 01-Nov-2018 |
Alina Sbirlea <asbirlea@google.com> |
[AliasSetTracker] Misc cleanup (NFCI)
Summary: Remove two redundant checks, add one in the unit test. Remove an unused method. Fix computation of TotalMayAliasSetSize.
llvm-svn: 345911
|
#
4f5d3371 |
| 29-Oct-2018 |
Alina Sbirlea <asbirlea@google.com> |
[AliasSetTracker] Cleanup addPointer interface. [NFCI]
Summary: Attempting to simplify the addPointer interface. Currently there's code decomposing a MemoryLocation into (Ptr, Size, AAMDNodes) only to recreate the MemoryLocation inside the call.
Reviewers: reames, mkazantsev
Subscribers: sanjoy, jlebar, llvm-commits
Differential Revision: https://reviews.llvm.org/D53836
llvm-svn: 345548
|
#
6ef8002c |
| 10-Oct-2018 |
George Burgess IV <george.burgess.iv@gmail.com> |
Replace most users of UnknownSize with LocationSize::unknown(); NFC
Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748).
This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work.
llvm-svn: 344186
|