Revision tags: llvmorg-21-init |
|
#
52bffdf9 |
| 25-Jan-2025 |
David Green <david.green@arm.com> |
[IPSCCP][FuncSpec] Protect against metadata access from call args. (#124284)
Fixes an issue reported from #114964, where metadata arguments were
attempted to be accessed as constants.
|
Revision tags: llvmorg-19.1.7 |
|
#
67efbd0b |
| 08-Jan-2025 |
Ryan Mansfield <ryan_mansfield@apple.com> |
[LLVM] Fix various cl::desc typos and whitespace issues (NFC) (#121955)
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
88e9b373 |
| 06-Nov-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Query SCCPSolver in more places (#114964)
When traversing the use-def chain of an Argument in a candidate
specialization, also query the SCCPSolver to see if a Value is constant.
This allows us to better estimate the codesize savings of a candidate in
the presence of instructions that are a user of the argument we are
estimating savings for which also use arguments that have been found
constant by IPSCCP.
Similarly when estimating the dead basic blocks from branch and switch
instructions which become constant, also query the SCCPSolver to see if
a predecessor is unreachable.
|
#
5f30b1aa |
| 04-Nov-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Improve handling of BinaryOperator instructions (#114534)
When visiting BinaryOperator instructions during estimation of codesize
savings for a candidate specialization, don't bail when the other
operand is not found to be constant. This allows us to find more
constants than we otherwise would, for example `and(false, x)`.
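A minimal sketch of the idea behind `and(false, x)`, not the in-tree implementation: a bitwise `and` can fold to a constant when only one operand is known, provided that operand is the absorbing value. The helper `foldAnd` and the use of `std::optional<bool>` to model "not known to be constant" are hypothetical names chosen for illustration.

```cpp
#include <optional>

// Hypothetical model: an operand is either a known boolean constant
// or unknown (std::nullopt).
using MaybeConst = std::optional<bool>;

// Fold `and(Lhs, Rhs)` where possible. Returns std::nullopt when the
// result cannot be determined from the known operands.
MaybeConst foldAnd(MaybeConst Lhs, MaybeConst Rhs) {
  // `false` absorbs: and(false, x) == false regardless of x.
  if ((Lhs && !*Lhs) || (Rhs && !*Rhs))
    return false;
  // Both operands are known and neither is false, so both are true.
  if (Lhs && Rhs)
    return true;
  // One operand is true and the other unknown: no constant result.
  return std::nullopt;
}
```

Bailing out as soon as one operand is unknown would miss the first case, which is the situation the commit message describes.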
|
#
5ed3f463 |
| 04-Nov-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Improve handling of Comparison Instructions (#114073)
When visiting comparison instructions during computation of a
specialization's bonus, make use of information from the lattice value
of the other operand in the case where we have not found this to have a
specific constant value.
|
#
daa9af17 |
| 04-Nov-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Handle ssa_copy intrinsic calls in InstCostVisitor (#114247)
Look through ssa_copy intrinsic calls when computing codesize bonus for
a specialization.
Also remove redundant logic to skip computing codesize bonus for
ssa_copy intrinsics, now that these are considered zero-cost by TTI (in PR
#75294).
|
#
e19a5fc6 |
| 29-Oct-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Improve accounting of specialization codesize growth (#113448)
Only accumulate the codesize increase of functions that are actually
specialized, rather than for every candidate specialization that we
analyse.
This fixes a subtle bug where prior analysis of candidate
specializations that were deemed unprofitable could prevent subsequent
profitable candidates from being recognised.
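The accounting fix can be illustrated with a small sketch. It is written under assumptions and does not mirror the pass's data structures: `Candidate`, `countAccepted`, and the single growth budget are hypothetical. The point is only that a rejected candidate must not consume any of the budget.

```cpp
#include <vector>

// Hypothetical candidate record; the fields are illustrative only.
struct Candidate {
  unsigned CodeSize; // estimated size of the specialized clone
  bool Profitable;   // outcome of the cost model for this candidate
};

// Count the specializations accepted under a growth budget, charging the
// budget only for candidates that are actually specialized.
unsigned countAccepted(const std::vector<Candidate> &Cands, unsigned Budget) {
  unsigned Growth = 0;
  unsigned Accepted = 0;
  for (const Candidate &C : Cands) {
    if (!C.Profitable)
      continue; // rejected candidates do not consume the budget
    if (Growth + C.CodeSize > Budget)
      continue; // accepting this one would exceed the allowed growth
    Growth += C.CodeSize;
    ++Accepted;
  }
  return Accepted;
}
```

If `Growth` were incremented before the profitability check, an unprofitable candidate of size 80 analysed first would block a later profitable candidate of size 60 under a budget of 100.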
|
#
06664fdc |
| 29-Oct-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Enable SpecializeLiteralConstant by default (#113442)
Enable specialization on literal constant arguments by default in
Function Specialization.
---------
Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas@arm.com>
|
Revision tags: llvmorg-19.1.3 |
|
#
6ab26eab |
| 28-Oct-2024 |
Ellis Hoag <ellis.sparky.hoag@gmail.com> |
Check hasOptSize() in shouldOptimizeForSize() (#112626)
|
#
c6931c25 |
| 23-Oct-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Only compute Latency bonus when necessary (#113159)
Only compute the Latency component of a specialisation's Bonus when
necessary, to avoid unnecessarily computing the Block Frequency
Information for a Function.
|
#
0d1a91e8 |
| 18-Oct-2024 |
Hari Limaye <hari.limaye@arm.com> |
[FuncSpec] Update MinFunctionSize logic (#112711)
Always require functions to be larger than MinFunctionSize when
SpecializeLiteralConstant is enabled, and increase MinFunctionSize to
500, to prevent excessive triggering of specialisations on small
functions.
|
Revision tags: llvmorg-19.1.2 |
|
#
6472cb1e |
| 09-Oct-2024 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
[FuncSpec] Improve estimation of select instruction. (#111176)
When propagating a constant to a select instruction we only consider the
condition operand as the use. I am extending the logic to consider the
true and false values too, in case the condition was found to be
constant in a previous propagation that then halted.
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3 |
|
#
f364b2ee |
| 13-Aug-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8 |
|
#
575e68e5 |
| 12-Jun-2024 |
Hans Wennborg <hans@chromium.org> |
FunctionSpecialization: Make the ordering of BestSpecs stricter
otherwise it's not guaranteed which of two candidates with the same score would get specialized first, or at all.
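A sketch of what a stricter ordering can look like, assuming a hypothetical `Spec` record with a `Score` and a discovery `Index`. The commit message does not say which tie-break is used, so the index comparison below is an assumption for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical specialization record.
struct Spec {
  unsigned Score; // higher is better
  unsigned Index; // assumed tie-break key: position in discovery order
};

// Strict weak ordering: higher score first; equal scores are ordered by
// the lower index, so the result never depends on the sort algorithm.
bool betterThan(const Spec &A, const Spec &B) {
  if (A.Score != B.Score)
    return A.Score > B.Score;
  return A.Index < B.Index;
}

// Return the candidates ranked from best to worst.
std::vector<Spec> rankSpecs(std::vector<Spec> Specs) {
  std::sort(Specs.begin(), Specs.end(), betterThan);
  return Specs;
}
```

Comparing on `Score` alone would leave the relative order of equal-score candidates unspecified, which is the nondeterminism the commit describes.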
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6 |
|
#
2fb51fba |
| 22-Nov-2023 |
Mats Petersson <mats.petersson@arm.com> |
[FuncSpec] Update function specialization to handle phi-chains (#72903)
When using the LLVM flang compiler with alias analysis (AA) enabled,
SPEC2017:548.exchange2_r was running significantly slower than without
the AA.
This was caused by the GVN pass replacing many of the loads in the
pre-AA code with phi-nodes that form a long chain of dependencies, which
the function specialization was unable to follow.
This adds a function to discover phi-nodes in a transitive set, with
some limitations to avoid spending ages analysing phi-nodes.
The minimum latency savings also had to be lowered - fewer load
instructions means less saving.
Also adds some more prints to help debug the isProfitable decision.
No significant change in compile time or generated code-size.
(A previous attempt to fix this was abandoned: https://github.com/llvm/llvm-project/pull/71442)
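The transitive phi discovery with a cut-off can be sketched as a bounded worklist traversal. This is an illustration under assumptions, not the code from the patch: the graph encoding (`PhiGraph`), the function name `collectPhiChain`, and the `MaxPhis` limit are all hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical encoding: node i is a phi, and Incoming[i] lists the phis
// that feed it.
using PhiGraph = std::vector<std::vector<int>>;

// Collect the phis transitively reachable from Root into Found. Gives up
// and returns false as soon as more than MaxPhis have been discovered.
bool collectPhiChain(const PhiGraph &Incoming, int Root, std::size_t MaxPhis,
                     std::set<int> &Found) {
  std::vector<int> Worklist{Root};
  Found.insert(Root);
  while (!Worklist.empty()) {
    int Phi = Worklist.back();
    Worklist.pop_back();
    for (int In : Incoming[Phi]) {
      if (!Found.insert(In).second)
        continue; // already visited; this also terminates cycles
      if (Found.size() > MaxPhis)
        return false; // limit reached, stop analysing
      Worklist.push_back(In);
    }
  }
  return true;
}
```

The visited set handles cyclic phi dependencies, and the size limit bounds the analysis time on long chains.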
---------
Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas@arm.com>
|
Revision tags: llvmorg-17.0.5 |
|
#
c4c0ac10 |
| 06-Nov-2023 |
Nikita Popov <npopov@redhat.com> |
[IPO] Remove unnecessary bitcasts (NFC)
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3 |
|
#
5181156b |
| 05-Oct-2023 |
Matthias Braun <matze@braunis.de> |
Use BlockFrequency type in more places (NFC) (#68266)
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it
more consistently in various APIs and disable implicit conversion to
make usage more consistent and explicit.
- Use `BlockFrequency Freq` parameter for `setBlockFreq`,
`getProfileCountFromFreq` and `setBlockFreqAndScale` functions.
- Return `BlockFrequency` in `getEntryFreq()` functions.
- While at it, change some `const BlockFrequency& Freq` parameters to
plain `BlockFrequency Freq`.
- Mark `BlockFrequency(uint64_t)` constructor as explicit.
- Add missing `BlockFrequency::operator!=`.
- Remove `uint64_t BlockFrequency::getMaxFrequency()`.
- Add `BlockFrequency BlockFrequency::max()` function.
|
Revision tags: llvmorg-17.0.2 |
|
#
e15d72ad |
| 19-Sep-2023 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
[FuncSpec] Adjust the names of specializations and promoted stack values
Currently the naming scheme is a bit funky; the specializations are named
after the original function followed by an arbitrary decimal number. This
makes it hard to debug inlined specializations of recursive functions.
With this patch I am adding ".specialized." between the original
name and the suffix, which is now a single incrementing counter.
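The resulting naming scheme can be sketched as follows. The helper `specializedName` is hypothetical, and the starting value of the counter is not stated in the commit message, so the caller supplies it here.

```cpp
#include <string>

// Hypothetical helper: build "<original>.specialized.<counter>".
std::string specializedName(const std::string &Original, unsigned Counter) {
  return Original + ".specialized." + std::to_string(Counter);
}
```

For example, `specializedName("foo", 1)` yields `foo.specialized.1`, which keeps the original function name recognisable in inlined or recursive specializations.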
|
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4 |
|
#
7afea8a8 |
| 24-Aug-2023 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
[NFC][FuncSpec] Update the description of function specialization.
The code has changed significantly over time, making the description outdated. In this patch I am rewriting the description with an emphasis on the cost model, where most of the changes have happened.
Differential Revision: https://reviews.llvm.org/D158723
|
Revision tags: llvmorg-17.0.0-rc3 |
|
#
386aa2ab |
| 21-Aug-2023 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
[FuncSpec] Increase the maximum number of times the specializer can run.
* Changes the default value of FuncSpecMaxIters from 1 to 10. This allows specialization of recursive functions.
* Adds an option to control the maximum codesize growth per function.
* Measured ~45% performance uplift for SPEC2017:548.exchange2_r on AWS Graviton3.
Differential Revision: https://reviews.llvm.org/D145819
|
#
3c03a151 |
| 11-Aug-2023 |
Kazu Hirata <kazu@google.com> |
[llvm] Use DenseMap::lookup (NFC)
|
Revision tags: llvmorg-17.0.0-rc2 |
|
#
d1b376fd |
| 07-Aug-2023 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
[FuncSpec] Rework the discardment logic for unprofitable specializations.
Currently we make an arbitrary comparison between codesize and latency in order to decide whether to keep a specialization or not. Sometimes the latency savings are biased in favor of loops because of imprecise block frequencies, therefore this metric contains a lot of noise. This patch tries to address the problem as follows:
* Reject specializations whose codesize savings are less than X% of the original function size.
* Reject specializations whose latency savings are less than Y% of the original function size.
* Reject specializations whose inlining bonus is less than Z% of the original function size.
I am not saying this is super precise, but at least X, Y and Z are configurable, allowing us to tweak the cost model. Moreover, it lets us prioritize codesize over latency, which is a less noisy metric.
I am also increasing the minimum size a function should have to be considered a candidate for specialization. Initially the cost of a function was calculated as
CodeMetrics::NumInsts * InlineConstants::getInstrCost()
which later in D150464 was altered into CodeMetrics::NumInsts since the metric is supposed to model TargetTransformInfo::TCK_CodeSize. However, we omitted adjusting MinFunctionSize in that commit.
Differential Revision: https://reviews.llvm.org/D157123
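A sketch of the percentage-based rejection, taking the three rules in the message literally. The struct and function names are hypothetical, the thresholds stand in for X, Y and Z, and the way the in-tree code combines the three checks may differ from this straightforward conjunction.

```cpp
#include <cstdint>

// Hypothetical thresholds, expressed as percentages of the function size
// (the X, Y and Z of the commit message).
struct Thresholds {
  uint64_t MinCodeSizePct;
  uint64_t MinLatencyPct;
  uint64_t MinInliningPct;
};

// Hypothetical savings estimated for one candidate specialization.
struct Savings {
  uint64_t CodeSize;
  uint64_t Latency;
  uint64_t Inlining;
};

// True when Value is at least Pct percent of FuncSize. Integer arithmetic
// only, to avoid rounding surprises.
bool meetsPct(uint64_t Value, uint64_t Pct, uint64_t FuncSize) {
  return Value * 100 >= Pct * FuncSize;
}

// Assumption: a candidate is kept only if it passes all three checks.
bool keepSpecialization(const Savings &S, uint64_t FuncSize,
                        const Thresholds &T) {
  return meetsPct(S.CodeSize, T.MinCodeSizePct, FuncSize) &&
         meetsPct(S.Latency, T.MinLatencyPct, FuncSize) &&
         meetsPct(S.Inlining, T.MinInliningPct, FuncSize);
}
```

Expressing every threshold relative to the original function size is what makes the three knobs independently tunable.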
|
#
c2d19002 |
| 07-Aug-2023 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
[FuncSpec] Estimate dead blocks more accurately.
Currently we only consider basic blocks with a unique predecessor when estimating the size of dead code. However, we could expand this to consider blocks with a back-edge, or blocks preceded by dead blocks.
Differential Revision: https://reviews.llvm.org/D156903
|
#
5bfefff1 |
| 31-Jul-2023 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
Reland [FuncSpec] Split the specialization bonus into CodeSize and Latency.
Currently we use a combined metric TargetTransformInfo::TCK_SizeAndLatency when estimating the specialization bonus. This is suboptimal, and in some cases erroneous. For example we shouldn't be weighting the codesize decrease attributed to constant propagation by the block frequency of the dead code. Instead only the latency savings should be weighted by block frequency. The total codesize savings from all the specialization arguments should be deducted from the specialization cost.
Differential Revision: https://reviews.llvm.org/D155103
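The split described above can be sketched as follows, under assumptions: `FoldedInst`, `Bonus`, and `computeBonus` are hypothetical names, block frequencies are plain integers, and latency is scaled by the ratio of block frequency to a non-zero entry frequency. Codesize savings are summed unweighted; only latency savings are weighted.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record for an instruction that constant propagation removes.
struct FoldedInst {
  uint64_t CodeSize;  // size saved by removing the instruction
  uint64_t Latency;   // latency saved per execution
  uint64_t BlockFreq; // frequency of the enclosing basic block
};

// The bonus is tracked as two separate components.
struct Bonus {
  uint64_t CodeSize = 0;
  uint64_t Latency = 0;
};

// Assumes EntryFreq is non-zero.
Bonus computeBonus(const std::vector<FoldedInst> &Insts, uint64_t EntryFreq) {
  Bonus B;
  for (const FoldedInst &I : Insts) {
    B.CodeSize += I.CodeSize;                        // not frequency-weighted
    B.Latency += I.Latency * I.BlockFreq / EntryFreq; // frequency-weighted
  }
  return B;
}
```

With a combined metric, code removed from a hot loop would be counted as a larger size reduction than the same code removed from a cold block, which is the error the message points out.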
|
Revision tags: llvmorg-17.0.0-rc1 |
|
#
893d3a61 |
| 27-Jul-2023 |
Alexandros Lamprineas <alexandros.lamprineas@arm.com> |
Reland [FuncSpec] Add Phi nodes to the InstCostVisitor.
This patch allows constant folding of PHIs when estimating the user bonus. Phi nodes are a special case since some of their inputs may remain unresolved until all the specialization arguments have been processed by the InstCostVisitor. Therefore, we keep a list of dead basic blocks and then lazily visit the Phi nodes once the user bonus has been computed for all the specialization arguments.
Differential Revision: https://reviews.llvm.org/D154852
|