#
d2fa4147 |
| 27-Apr-2016 |
Adam Nemet <anemet@apple.com> |
[LoopDist] Add llvm.loop.distribute.enable loop metadata
Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE.
As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior:
A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified
A2. pass is on when invoked directly from opt (e.g. for unit-testing)
The new pragma/metadata overrides these defaults so the new behavior is:
B1. A1 + enable distribution for individual loop with the pragma/metadata
B2. A2 + disable distribution for individual loop with the pragma/metadata
The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on.
I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above.
Then the pragma/metadata can further modify this per loop.
As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior:
C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it
C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it
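The precedence in C1/C2 can be sketched as a small standalone function. This is only a model, not the actual LLVM code: `Initiator`, `shouldDistribute` and the parameter names are hypothetical, and it assumes the per-loop metadata wins over the flag, as the "further modify" wording suggests.

```cpp
#include <optional>

// Hypothetical model of who scheduled the pass (not an LLVM type).
enum class Initiator { PassManagerBuilder, Opt };

// Per-loop decision: the initiator supplies the default, the flag (if given)
// overrides it, and the loop metadata (if present) overrides both.
// Assumption of this sketch: metadata takes precedence over the flag.
bool shouldDistribute(Initiator From,
                      std::optional<bool> EnableFlag,    // -enable-loop-distribute[=0|1]
                      std::optional<bool> LoopMetadata) { // llvm.loop.distribute.enable
  bool Enabled = (From == Initiator::Opt); // off from the pipeline, on from opt
  if (EnableFlag)
    Enabled = *EnableFlag;
  if (LoopMetadata)
    Enabled = *LoopMetadata;
  return Enabled;
}
```

For example, `shouldDistribute(Initiator::Opt, false, std::nullopt)` models running opt with -enable-loop-distribute=0 on a loop without metadata.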
Reviewers: hfinkel
Subscribers: joker.eph, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D19431
llvm-svn: 267672
|
#
61399ac4 |
| 27-Apr-2016 |
Adam Nemet <anemet@apple.com> |
[LoopDist] Split main class. NFC
This splits out the per-loop functionality from the Pass class.
With this, whether the loop is force-distributed by the new metadata/pragma can be cached in the per-loop class rather than passed around.
llvm-svn: 267643
|
#
5eccf07d |
| 17-Mar-2016 |
Adam Nemet <anemet@apple.com> |
[LoopVersioning] Annotate versioned loop with noalias metadata
Summary: If we decide to version a loop to benefit a transformation, it makes sense to record the now non-aliasing accesses in the newly versioned loop. This allows non-aliasing information to be used by subsequent passes.
One example is 456.hmmer in SPECint2006 where after loop distribution, we vectorize one of the newly distributed loops. To vectorize we version this loop to fully disambiguate may-aliasing accesses. If we add the noalias markers, we can use the same information in a later DSE pass to eliminate some dead stores which amounts to ~25% of the instructions of this hot memory-pipeline-bound loop. The overall performance improves by 18% on our ARM64.
The scoped noalias annotation is added in LoopVersioning. The patch then enables this for loop distribution. A follow-on patch will enable it for the vectorizer. Eventually this should be run by default when versioning the loop but first I'd like to get some feedback whether my understanding and application of scoped noalias metadata is correct.
Essentially my approach was to have a separate alias domain for each versioning of the loop. For example, if we first version in loop distribution and then in vectorization of the distributed loops, we have a different set of memchecks for each versioning. By keeping the scopes in different domains they can conveniently be defined independently since different alias domains don't affect each other.
As written, I also have a separate domain for each loop. This is not necessary and we could save some metadata here by using the same domain across the different loops. I don't think it's a big deal either way.
Probably the best is to review the tests first to see if I mapped this problem correctly to scoped noalias markers. I have plenty of comments in the tests.
Note that the interface is prepared for the vectorizer which needs the annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning so we need a way to pass in the versioned instructions. This is also why the maps have to become part of the object state.
Also, we currently only have an AA-aware DSE after the vectorizer if we also run the LTO pipeline. Depending on how widely this triggers, we may want to schedule a DSE toward the end of the regular pass pipeline.
Reviewers: hfinkel, nadav, ashutosh.nema
Subscribers: mssimpso, aemerson, llvm-commits, mcrosier
Differential Revision: http://reviews.llvm.org/D16712
llvm-svn: 263743
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1 |
|
#
843fb204 |
| 15-Dec-2015 |
Justin Bogner <mail@justinbogner.com> |
LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC
A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this:
- The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do.
- Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available.
- Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there.
Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable.
llvm-svn: 255669
|
#
9cd9a7e3 |
| 09-Dec-2015 |
Silviu Baranga <silviu.baranga@arm.com> |
Re-commit r255115, with the PredicatedScalarEvolution class moved to ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform and Analysis modules:
[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewriting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates, P1: {a,+,b} has nsw, and P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, thereby avoiding this issue.
The SCEVPredicatedLayer maintains a cache of previous SCEV rewriting results. This also has the benefit of reducing the overall number of expression rewrites.
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
llvm-svn: 255122
|
#
ad1ccb35 |
| 09-Dec-2015 |
Silviu Baranga <silviu.baranga@arm.com> |
Revert r255115 until we figure out how to fix the bot failures.
llvm-svn: 255117
|
#
41eb6825 |
| 09-Dec-2015 |
Silviu Baranga <silviu.baranga@arm.com> |
[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewriting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates, P1: {a,+,b} has nsw, and P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, thereby avoiding this issue.
The SCEVPredicatedLayer maintains a cache of previous SCEV rewriting results. This also has the benefit of reducing the overall number of expression rewrites.
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
llvm-svn: 255115
|
Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
#
2910a4f6 |
| 09-Nov-2015 |
Silviu Baranga <silviu.baranga@arm.com> |
Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates
Summary: LAA currently generates a set of SCEV predicates that must be checked by users. In the case of Loop Distribute/Loop Load Elimination, no such predicates could have been emitted, since we don't allow stride versioning. However, in the future there could be SCEV predicates that will need to be checked.
This change adds support for SCEV predicate versioning in the Loop Distribute, Loop Load Eliminate and the loop versioning infrastructure.
Reviewers: anemet
Subscribers: mssimpso, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14240
llvm-svn: 252467
|
#
a2df750f |
| 03-Nov-2015 |
Adam Nemet <anemet@apple.com> |
[LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC
Summary: We now collect all types of dependences including lexically forward deps not just "interesting" ones.
Reviewers: hfinkel
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13256
llvm-svn: 251985
|
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3 |
|
#
e4813409 |
| 20-Aug-2015 |
Adam Nemet <anemet@apple.com> |
[LVer] Fix FIXME: hide addPHINodes, NFC
Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this up.
Now clients that don't compute DefsUsedOutsideOfLoop can just call versionLoop() and computing DefsUsedOutsideOfLoop will happen implicitly. With that there is no reason to expose addPHINodes anymore.
Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and addPHINodes in LVerLICM and things should just work.
llvm-svn: 245579
|
#
c5b7b555 |
| 19-Aug-2015 |
Ashutosh Nema <ashu1212@gmail.com> |
Exposed findDefsUsedOutsideOfLoop as a loop utility function
Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving it from LoopDistribute to LoopUtils.
Reviewed By: anemet
llvm-svn: 245416
|
#
06ccf014 |
| 14-Aug-2015 |
Adam Nemet <anemet@apple.com> |
[LVer] Remove unused Pass parameter from versionLoop, NFC
llvm-svn: 245032
|
Revision tags: studio-1.4 |
|
#
15840393 |
| 07-Aug-2015 |
Adam Nemet <anemet@apple.com> |
[LAA] Make the set of runtime checks part of the state of LAA, NFC
This is the full set of checks that clients can further filter. IOW, it's client-agnostic. This makes LAA complete in the sense that it now provides the two main results of its analysis precomputed:
1. memory dependences via getDepChecker().getInsterestingDependences()
2. run-time checks via getRuntimePointerCheck().getChecks()
However, as a consequence we now compute this information pro-actively. Thus if the client decides to skip the loop based on the dependences we've computed the checks unnecessarily. In order to see whether this was a significant overhead I checked compile time on SPEC2k6 LTO bitcode files. The change was in the noise.
The checks are generated in canCheckPtrAtRT, at the same place where we used to call groupChecks to merge checks.
llvm-svn: 244368
|
Revision tags: llvmorg-3.7.0-rc2 |
|
#
c75ad69c |
| 30-Jul-2015 |
Adam Nemet <anemet@apple.com> |
[LDist] Filter the checks locally rather than in LAA, NFC
Before, we were passing the pointer partitions to LAA. Now, we get all the checks from LAA and filter out the checks within partitions in LoopDistribution.
This effectively concludes the steps to move filtering memchecks from LAA into its clients. There is still some cleanup left to remove the unused interfaces in LAA that still take PtrPartition.
(Moving this functionality to LoopDistribution also requires needsChecking on pointers to be made public.)
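A rough sketch of what client-side filtering of memchecks could look like, written as standalone C++ rather than against the LLVM API. `PointerCheck`, `filterChecks` and the `-1` convention for a pointer used in several partitions are all assumptions of this sketch, not names confirmed by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model: a check compares two pointers, identified by index.
using PointerCheck = std::pair<std::size_t, std::size_t>;

// Keep only the checks whose two pointers fall in different partitions.
// PtrToPartition[i] is the partition of pointer i, or -1 if the pointer is
// used in more than one partition (an assumption of this sketch).
std::vector<PointerCheck>
filterChecks(const std::vector<PointerCheck> &AllChecks,
             const std::vector<int> &PtrToPartition) {
  std::vector<PointerCheck> Kept;
  for (const PointerCheck &C : AllChecks) {
    assert(C.first < PtrToPartition.size() &&
           C.second < PtrToPartition.size());
    int P1 = PtrToPartition[C.first];
    int P2 = PtrToPartition[C.second];
    // Checks within a single partition are dropped; a pointer spanning
    // partitions (-1) is always checked.
    if (P1 == -1 || P2 == -1 || P1 != P2)
      Kept.push_back(C);
  }
  return Kept;
}
```

The point of the design is that the analysis hands out the full, client-agnostic set of checks and each client applies its own filter.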
llvm-svn: 243613
|
#
0a674401 |
| 28-Jul-2015 |
Adam Nemet <anemet@apple.com> |
[LDist][LVer] Explicitly pass the set of memchecks to LoopVersioning, NFC
Before the patch, the checks were generated internally in addRuntimeCheck. Now, we use the new overloaded version of addRuntimeCheck that takes the ready-made set of checks as a parameter.
The checks are now generated by the client (LoopDistribution) with the new RuntimePointerChecking::generateChecks API.
Also the new printChecks API is used to print out the checks for debugging.
This is to continue the transition over to the new model whereby clients will get the full set of checks from LAA, filter it and then pass it to LoopVersioning and in turn to addRuntimeCheck.
llvm-svn: 243382
|
#
7679afda |
| 24-Jul-2015 |
Pete Cooper <peter_cooper@apple.com> |
Use make_range(rbegin(), rend()) to allow foreach loops. NFC.
Instead of the pattern
for (auto I = x.rbegin(), E = x.rend(); I != E; ++I)
we can use make_range to construct the reverse range and iterate using that instead.
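As an illustration of the pattern, here is a standalone sketch that does not depend on LLVM. `IterRange` and `makeRange` are hypothetical stand-ins for an iterator-pair range helper, and `reversed` is only a demo function.

```cpp
#include <vector>

// Minimal stand-in for an iterator-pair range (hypothetical; LLVM provides
// its own make_range helper, which this sketch does not use).
template <typename It> struct IterRange {
  It First, Last;
  It begin() const { return First; }
  It end() const { return Last; }
};

template <typename It> IterRange<It> makeRange(It First, It Last) {
  return {First, Last};
}

// Collect the elements of V back to front using a range-based for loop
// over the reverse iterators.
std::vector<int> reversed(const std::vector<int> &V) {
  std::vector<int> Out;
  for (int X : makeRange(V.rbegin(), V.rend()))
    Out.push_back(X);
  return Out;
}
```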
llvm-svn: 243163
|
Revision tags: llvmorg-3.7.0-rc1 |
|
#
9f7dedc3 |
| 14-Jul-2015 |
Adam Nemet <anemet@apple.com> |
[LAA] Introduce RuntimePointerChecking::PointerInfo, NFC
Turn this structure-of-arrays (i.e. the various pointer attributes) into array-of-structures.
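The shape of such a refactoring, sketched in standalone C++. The field names below are hypothetical and chosen only for illustration; the commit does not list the actual attributes.

```cpp
#include <cstddef>
#include <vector>

// Before: structure-of-arrays, one parallel vector per pointer attribute.
// Field names are hypothetical.
struct PointerAttrsSoA {
  std::vector<const void *> Pointers;
  std::vector<bool> IsWritePtr;
  std::vector<unsigned> DependencySetId;
};

// After: array-of-structures, one record per pointer.
struct PointerInfo {
  const void *PointerValue;
  bool IsWritePtr;
  unsigned DependencySetId;
};

// Convert the parallel vectors into one vector of records. Assumes all
// three vectors have the same length.
std::vector<PointerInfo> toAoS(const PointerAttrsSoA &S) {
  std::vector<PointerInfo> Out;
  Out.reserve(S.Pointers.size());
  for (std::size_t I = 0; I < S.Pointers.size(); ++I)
    Out.push_back({S.Pointers[I], S.IsWritePtr[I], S.DependencySetId[I]});
  return Out;
}
```

With the record form, code that needs several attributes of one pointer reads a single element instead of indexing several vectors in step.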
llvm-svn: 242219
|
#
7cdebac0 |
| 14-Jul-2015 |
Adam Nemet <anemet@apple.com> |
[LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC
I am planning to add more nested classes inside RuntimePointerCheck, so all this triple nesting would be hard to follow.
Also rename it to RuntimePointerChecking (i.e. append 'ing').
llvm-svn: 242218
|
#
215746b4 |
| 10-Jul-2015 |
Adam Nemet <anemet@apple.com> |
[LoopDist/LoopVer] Move LoopVersioning to a new module, NFC
Summary: The class will obviously need improvement down the road. For one, there is no reason that addPHINodes would have to be exposed like that. I will make this and other improvements in follow-up patches.
The main goal is to be able to share this functionality. The LoopLoadElimination pass I am working on needs it too. Later we can move other clients as well (LV and Ashutosh's LICMVer).
Reviewers: hfinkel, ashutosh.nema
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10577
llvm-svn: 241932
|
#
1a689188 |
| 10-Jul-2015 |
Adam Nemet <anemet@apple.com> |
[LoopDist] Move loop-versioning helper functions to Cloning, NFC
Summary: This makes them available to the LoopVersioning class as that is moved to its own module in the next patch.
Reviewers: ashutosh.nema, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10576
llvm-svn: 241931
|
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
#
f530b329 |
| 22-Jun-2015 |
Adam Nemet <anemet@apple.com> |
[LoopDist] Improve variable names and comments in LoopVersioning class, NFC
As with the previous patch, the goal is to turn the class into a general loop-versioning class. This patch removes any references to loop distribution.
llvm-svn: 240352
|
#
7632500d |
| 19-Jun-2015 |
Adam Nemet <anemet@apple.com> |
[LoopDist] Rename RuntimeCheckEmitter to LoopVersioning, NFC
llvm-svn: 240165
|
#
772a1506 |
| 19-Jun-2015 |
Adam Nemet <anemet@apple.com> |
[LoopDist] Move pointer-to-partition computation out of RuntimeCheckEmitter, NFC
This starts preparing the class to become a (more) general LoopVersioning utility class.
llvm-svn: 240164
|
#
e6987bf3 |
| 21-May-2015 |
Benjamin Kramer <benny.kra@googlemail.com> |
[LoopDistribute] Remove a layer of pointer indirection.
Just store InstPartitions directly into the std::list. No functional change intended.
llvm-svn: 237930
|
Revision tags: llvmorg-3.6.1 |
|
#
2f85b737 |
| 14-May-2015 |
Adam Nemet <anemet@apple.com> |
Attempt to fix MSVC bots
llvm-svn: 237359
|