#
2e865353 |
| 14-Mar-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[RemoveDIs][NFC] Move DPValue::filter -> filterDbgVars (#85208)
This patch changes DPValue::filter to be a non-member method
filterDbgVars. There are two reasons for this: firstly, the name of
DPValue is about to change to DbgVariableRecord, which will result in
every `for` loop that uses DPValue::filter to require a line break. This
is a small thing, but it makes the rename patch more difficult to
review, and is just generally more awkward for what is a fairly common
loop. Secondly, the intent is to later break up the DPValue class into
subclasses, at which point it would be better to have a non-member
function that allows template arguments for the cases we want to filter
with greater specificity.
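A minimal sketch of the second point, assuming toy record classes rather than LLVM's own: a non-member function template can take the record type to filter for as a template argument. `ToyRecord`, `ToyVariableRecord`, `ToyLabelRecord` and `filterRecords` are hypothetical names, and this version eagerly builds a vector, whereas the real helper works on LLVM's own ranges.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a record hierarchy; not LLVM's classes.
struct ToyRecord {
  virtual ~ToyRecord() = default;
};
struct ToyVariableRecord : ToyRecord {
  std::string Name;
  explicit ToyVariableRecord(std::string N) : Name(std::move(N)) {}
};
struct ToyLabelRecord : ToyRecord {};

// Non-member filter: the template argument selects which subclass to keep.
// Defaults to variable records, the common case in this sketch.
template <typename T = ToyVariableRecord>
std::vector<T *>
filterRecords(const std::vector<std::unique_ptr<ToyRecord>> &Records) {
  std::vector<T *> Out;
  for (const auto &R : Records) {
    assert(R && "sketch assumes no null records");
    if (auto *Typed = dynamic_cast<T *>(R.get()))
      Out.push_back(Typed);
  }
  return Out;
}
```

A loop over variable records then reads `for (ToyVariableRecord *V : filterRecords(Records))`, and a more specific filter is `filterRecords<ToyLabelRecord>(Records)`.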
|
#
95fef1df |
| 14-Mar-2024 |
Florian Hahn <flo@fhahn.com> |
[LV] Improve AnyOf reduction codegen. (#78304)
Update AnyOf reduction code generation to only keep track of the AnyOf
property in a boolean vector in the loop, only selecting either the new
or start value in the middle block.
The patch incorporates feedback from https://reviews.llvm.org/D153697.
This fixes #62565, as there are no longer multiple uses of the
start/new values.
Fixes https://github.com/llvm/llvm-project/issues/62565
PR: https://github.com/llvm/llvm-project/pull/78304
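The scheme can be modelled in plain C++ as a rough sketch: the loop carries only a vector of booleans, and the start/new values appear once, in a final select. The function name, the fixed `VF`, and the `> Threshold` predicate are all assumptions made for illustration; this is not the vectorizer's code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vectorization factor

// Returns NewVal if any element exceeds Threshold, otherwise Start.
int anyOfReduce(const std::vector<int> &Data, int Threshold, int Start,
                int NewVal) {
  assert(Data.size() % VF == 0 && "sketch handles whole vectors only");
  // Loop body: only the per-lane AnyOf property is tracked.
  std::array<bool, VF> AnyLanes{};
  for (std::size_t I = 0; I < Data.size(); I += VF)
    for (std::size_t L = 0; L < VF; ++L)
      AnyLanes[L] = AnyLanes[L] || (Data[I + L] > Threshold);
  // "Middle block": OR-reduce the lanes, then select exactly once.
  bool Any = false;
  for (bool B : AnyLanes)
    Any = Any || B;
  return Any ? NewVal : Start;
}
```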
|
#
15f3f446 |
| 12-Mar-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793)
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.
This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
|
Revision tags: llvmorg-18.1.1 |
|
#
8d6e867e |
| 05-Mar-2024 |
Patrick O'Neill <102189596+patrick-rivos@users.noreply.github.com> |
[LSR][term-fold] Ensure the simple recurrence is from the current loop (#83085)
If the phi node found by matchSimpleRecurrence is not from the current
loop, then isAlmostDeadIV panics. With this patch we bail out early.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
ababa964 |
| 20-Feb-2024 |
Orlando Cazalet-Hyams <orlando.hyams@sony.com> |
[RemoveDIs][NFC] Introduce DbgRecord base class [1/3] (#78252)
Patch 1 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds a new base class:
-> 1. Add DbgRecord base class for DPValue and the not-yet-added
      DPLabel class.
   2. Add the DPLabel class.
   3. Enable dbg.label conversion and add support to passes.
Patches 1 and 2 are NFC.
In the near future we also will rename DPValue to DbgVariableRecord and
DPLabel to DbgLabelRecord, at which point we'll overhaul the function
names too. The name DPLabel keeps things consistent for now.
|
#
2ed2a3ad |
| 16-Feb-2024 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[Transforms][Utils] Add helpers to map between Reduction IntrinsicID and Arithmetic Instruction Opcode and MinMax IntrinsicID / RecurKind
Noticed on #81852
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6 |
|
#
ce1b2433 |
| 23-Nov-2023 |
Jeremy Morse <jeremy.morse@sony.com> |
[DebugInfo][RemoveDIs] Instrument loop-deletion for DPValues (#73042)
Loop deletion identifies dbg.value intrinsics in the loop, sets them to
undef/poison, and sinks them to the exit of the loop, to ensure that any
variable assignments that happen in a deleted loop are "optimised out".
This needs to be replicated for DPValues, the non-instruction
replacement for dbg.value intrinsics.
The movement API for DPValues is (deliberately) more limited than
dbg.values, which is tricky because inserting the collection of
dbg.values at an arbitrary iterator can insert a dbg.value in the middle
of a sequence of dbg.values. A big no-no for DPValues. This patch
replicates the order by inserting DPValues in reverse at the
head-iterator of the block, to ensure the same output as dbg.value mode.
Technically the order isn't important, but we're trying to ensure
identical outputs from optimisation passes right now.
Add more CHECK lines for dbg.values in diundef.ll to ensure that we
don't create any spurious dbg.values, and to ensure that sequences of
dbg.values come out of the optimisation in the correct order.
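The ordering trick can be illustrated with a toy model, assuming a block is just a list of names and the only available insertion point is its head. `sinkToHead` and the string records are hypothetical; the point is only that inserting in reverse at the head reproduces the original order.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical model: insert the collected records at the head of the exit
// block, walking them back to front.
std::list<std::string> sinkToHead(std::list<std::string> ExitBlock,
                                  const std::vector<std::string> &Records) {
  const std::size_t Before = ExitBlock.size();
  // Each push_front lands ahead of the records inserted before it, so the
  // last record goes in first and the first record ends up at the very top.
  for (auto It = Records.rbegin(); It != Records.rend(); ++It)
    ExitBlock.push_front(*It);
  assert(ExitBlock.size() == Before + Records.size());
  return ExitBlock;
}
```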
|
#
19e6d541 |
| 23-Nov-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Re-use existing compare if possible for diff checks.
SCEV simplifying the subtraction may result in redundant compares that are all OR'd together. Keep track of the generated operands in SeenCompares, with the key being the pair of operands for the compare.
If we already generated the same compare previously, skip it.
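A small sketch of the deduplication idea, assuming strings stand in for the IR operands. Only the name `SeenCompares` and the operand-pair key come from the message; `Compare` and `dedupCompares` are illustrative.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A compare is keyed by its pair of operands (strings stand in for values).
using Compare = std::pair<std::string, std::string>;

// Returns the compares that would actually be emitted, in first-seen order.
std::vector<Compare> dedupCompares(const std::vector<Compare> &Candidates) {
  std::set<Compare> SeenCompares;
  std::vector<Compare> Emitted;
  for (const Compare &C : Candidates) {
    // insert().second is false when this operand pair was already recorded.
    if (!SeenCompares.insert(C).second)
      continue;
    Emitted.push_back(C);
  }
  assert(Emitted.size() == SeenCompares.size());
  return Emitted;
}
```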
|
#
32d1197a |
| 22-Nov-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Use SCEV for subtraction of src/sink for diff runtime checks.
Instead of expanding the src/sink SCEV expressions and emitting an IR sub to compute the difference, the subtraction can be performed directly by ScalarEvolution. This allows the subtraction to be simplified by SCEV, which in turn can reduce the number of redundant runtime check instructions generated.
It also allows generating checks that are invariant w.r.t. an outer loop, if the inner loop AddRecs have the same outer loop AddRec as start.
|
#
ead35564 |
| 21-Nov-2023 |
Florian Hahn <flo@fhahn.com> |
[LoopUtils] Freeze compare results for diff checks instead of pointers.
The freezes are introduced to avoid branch on undef/poison, if any of the pointers may be poison. The same can be achieved by just freezing the compare, which reduces the number of freezes needed. See https://alive2.llvm.org/ce/z/NHa_ud
Note that the individual compares need to be frozen and it is not sufficient to only freeze the resulting OR:
Result OR frozen only (UNSOUND): https://alive2.llvm.org/ce/z/YzFHQY
Individual conds frozen (SOUND): https://alive2.llvm.org/ce/z/5L6Z3f
|
Revision tags: llvmorg-17.0.5 |
|
#
3ca4fe80 |
| 06-Nov-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
|
#
07f0e75b |
| 02-Nov-2023 |
David Sherwood <57997763+david-arm@users.noreply.github.com> |
[LoopVectorize] Fix bug with code to hoist runtime checks (#70937)
There was a silly mistake in the expandBounds function that was using
the wrong type when calling expandCodeFor and always assuming the stride
is 64 bits. I've added the following test to defend this fix:
Transforms/LoopVectorize/ARM/mve-hoist-runtime-checks.ll
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
26aed5b9 |
| 17-Aug-2023 |
Mel Chen <mel.chen@sifive.com> |
[VPlan][LoopUtils] Remove unused parameter TTI
This patch removes the member TTI from VPReductionRecipe, as the generation of reduction operations no longer requires TTI.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D158148
|
#
111fcb0d |
| 02-Sep-2023 |
Fangrui Song <i@maskray.me> |
[llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6 |
|
#
c02184f2 |
| 06-Jun-2023 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Allow inner loop runtime checks to be hoisted above an outer loop
Suppose we have a nested loop like this:
    void foo(int32_t *dst, int32_t *src, int m, int n) {
      for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
          dst[(i * n) + j] += src[(i * n) + j];
        }
      }
    }
We currently generate runtime memory checks as a precondition for entering the vectorised version of the inner loop. However, if the runtime-determined trip count for the inner loop is quite small then these checks become quite expensive. This patch attempts to mitigate these costs by adding a new option to expand the memory ranges being checked to include the outer loop as well. This leads to runtime checks that can then be hoisted above the outer loop. For example, rather than looking for a conflict between the memory ranges:
  1. &dst[(i * n)] -> &dst[(i * n) + n]
  2. &src[(i * n)] -> &src[(i * n) + n]
we can instead look at the expanded ranges:
  1. &dst[0] -> &dst[((m - 1) * n) + n]
  2. &src[0] -> &src[((m - 1) * n) + n]
which are outer-loop-invariant. As with many optimisations there is a trade-off here, because there is a danger that using the expanded ranges we may never enter the vectorised inner loop, whereas with the smaller ranges we might enter at least once.
I have added a HoistRuntimeChecks option that is turned off by default, but can be enabled for workloads where we know this is guaranteed to be of real benefit. In future, we can also use PGO to determine if this is worthwhile by using the inner loop trip count information.
When enabling this option for SPEC2017 on neoverse-v1 with the flags "-Ofast -mcpu=native -flto" I see an overall geomean improvement of ~0.5%:
SPEC2017 results (+ is an improvement, - is a regression):
  520.omnetpp: +2%
  525.x264: +2%
  557.xz: +1.2%
  ...
  GEOMEAN: +0.5%
I didn't investigate all the differences to see if they are genuine or noise, but I know the x264 improvement is real because it has some hot nested loops with low trip counts where I can see this hoisting is beneficial.
Tests have been added here:
Transforms/LoopVectorize/runtime-checks-hoist.ll
Differential Revision: https://reviews.llvm.org/D152366
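The trade-off between per-iteration and expanded ranges can be sketched with plain integer addresses. This is an illustrative model under assumptions, not the expansion code: `Range`, `overlaps`, `innerRange` and `expandedRange` are hypothetical names, addresses are plain 64-bit integers, and elements are taken to be 4 bytes as in the `int32_t` example.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t ElemSize = 4; // sizeof(int32_t) in the example

// Half-open byte range [Start, End).
struct Range {
  std::uint64_t Start, End;
};

bool overlaps(Range A, Range B) { return A.Start < B.End && B.Start < A.End; }

// Range touched by the inner loop for outer iteration I:
// &P[I * N] -> &P[I * N + N].
Range innerRange(std::uint64_t Base, std::uint64_t I, std::uint64_t N) {
  return {Base + I * N * ElemSize, Base + (I * N + N) * ElemSize};
}

// Expanded, outer-loop-invariant range: &P[0] -> &P[(M - 1) * N + N].
Range expandedRange(std::uint64_t Base, std::uint64_t M, std::uint64_t N) {
  assert(M > 0 && "sketch assumes the outer loop runs at least once");
  return {Base, Base + ((M - 1) * N + N) * ElemSize};
}
```

With `m = 4`, `n = 8`, `dst` at address 1000 and `src` at 1064, every per-iteration pair of ranges is disjoint, yet the expanded ranges [1000, 1128) and [1064, 1192) overlap. The hoisted check would therefore fail and the vectorised inner loop would never be entered, which is the danger the message describes.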
|
#
51dfe3cb |
| 16-Aug-2023 |
Nikita Popov <npopov@redhat.com> |
[IR] Add PHINode::removeIncomingValueIf() (NFC)
Add an API that allows removing multiple incoming phi values based on a predicate callback, as suggested on D157621.
This makes sure that the removal is linear time rather than quadratic, and avoids subtleties around iterator invalidation.
I have replaced some of the more straightforward users with the new API, though there's a couple more places that should be able to use it.
Differential Revision: https://reviews.llvm.org/D158064
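Why a predicate-based removal is linear can be shown with a toy phi that stores parallel arrays of values and blocks. `ToyPhi` and `removeIncomingIf` are hypothetical, and passing the original index to the predicate is an assumption of this sketch. Surviving entries are compacted in a single pass instead of erasing one entry at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a phi node; not LLVM's PHINode.
struct ToyPhi {
  std::vector<int> Values;
  std::vector<std::string> Blocks;

  // Drop every incoming entry whose original index satisfies Pred.
  // One pass: each surviving entry is moved at most once.
  void removeIncomingIf(const std::function<bool(std::size_t)> &Pred) {
    assert(Values.size() == Blocks.size());
    std::size_t Out = 0;
    for (std::size_t In = 0; In < Values.size(); ++In) {
      if (Pred(In))
        continue; // removed: never copied forward
      if (Out != In) { // avoid self-move when nothing was removed yet
        Values[Out] = Values[In];
        Blocks[Out] = std::move(Blocks[In]);
      }
      ++Out;
    }
    Values.resize(Out);
    Blocks.resize(Out);
  }
};
```

Erasing entries one by one would shift the tail on every removal, which is quadratic in the worst case, and would also change indices while the caller is still iterating.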
|
#
a7ee80fa |
| 11-Aug-2023 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
[llvm] Drop some more typed pointer bitcasts etc.
|
#
425e9e81 |
| 19-Jul-2023 |
Mel Chen <mel.chen@sifive.com> |
[LV] Rename the Select[I|F]Cmp reduction pattern to [I|F]AnyOf. (NFC)
Regarding this NFC change, please refer to the discussion in this thread. https://reviews.llvm.org/D150851#4467261
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D155786
|
#
8763d799 |
| 30-Jun-2023 |
Carlos Alberto Enciso <carlos.alberto.enciso@gmail.com> |
[loop-deletion] Overly defensive with undef-ing dbg.values.
Explicitly inserting undef is overly defensive. Any values computed inside the loop that are referenced by dbg.values should naturally become undef when the loop is deleted, and all other values that are loop invariant must be preserved.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D153539
|
#
ec146cb7 |
| 13-Jun-2023 |
Anna Thomas <anna@azul.com> |
[LV] Add support for minimum/maximum intrinsics
{mini|maxi}mum intrinsics are different from {min|max}num intrinsics in the propagation of NaN and signed zero. Also, the minnum/maxnum intrinsics require the presence of nsz flags to be valid reductions in the vectorizer. In this regard, we introduce a new recurrence kind and also add support for identifying reduction patterns using these intrinsics.
The reduction intrinsics and lowering were introduced here: 26bfbec5d2.
There are tests added which show how this interacts across chains of min/max patterns.
Differential Revision: https://reviews.llvm.org/D151482
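The semantic difference can be seen in a scalar sketch, assuming `std::fmin` as a stand-in for minnum-style behaviour and a hand-written `minimumLike` for minimum-style behaviour. Both helper names are hypothetical, and the code models only the NaN and signed-zero rules, not the intrinsics themselves.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// minimum-style: NaN propagates, and -0.0 is ordered below +0.0.
double minimumLike(double A, double B) {
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<double>::quiet_NaN();
  if (A == 0.0 && B == 0.0)
    return std::signbit(A) ? A : B; // prefer the negative zero
  return A < B ? A : B;
}

// minnum-style reduction: std::fmin returns the other operand when exactly
// one operand is NaN, so a single NaN does not poison the result.
double reduceMinNum(const std::vector<double> &V) {
  assert(!V.empty() && "sketch requires at least one element");
  double Acc = V[0];
  for (std::size_t I = 1; I < V.size(); ++I)
    Acc = std::fmin(Acc, V[I]);
  return Acc;
}

// minimum-style reduction: any NaN in the input yields NaN.
double reduceMinimum(const std::vector<double> &V) {
  assert(!V.empty() && "sketch requires at least one element");
  double Acc = V[0];
  for (std::size_t I = 1; I < V.size(); ++I)
    Acc = minimumLike(Acc, V[I]);
  return Acc;
}
```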
|
#
143ed21b |
| 05-Jun-2023 |
Nikita Popov <npopov@redhat.com> |
Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)"
This reverts commit 5362a0d859d8e96b3f7c0437b7866e17a818a4f7.
In preparation for reverting a dependent revision.
|
Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
5362a0d8 |
| 02-May-2023 |
Nikita Popov <npopov@redhat.com> |
[LCSSA] Remove unused ScalarEvolution argument (NFC)
After D149435, LCSSA formation no longer needs access to ScalarEvolution, so remove the argument from the utilities.
|
Revision tags: llvmorg-16.0.2 |
|
#
aa754f7e |
| 13-Apr-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[IR] llvm::createMinMaxOp - create integer min/max intrinsics instead of icmp/sel
Based off D148215, when expanding a min/max reduction we should be creating min/max intrinsics directly instead of relying on instcombine to fold them back together.
This patch handles integer min/max cases. Hopefully we can add floating point support soon (at least for fastmath/nnan cases) - but we're missing some of the plumbing to pass the correct FMF to the intrinsic at the moment.
Differential Revision: https://reviews.llvm.org/D148221
|
#
797da79a |
| 12-Apr-2023 |
Dmitry Makogon <d.makogon@g.nsu.ru> |
[LoopUtils] Add isKnownPositiveInLoop and isKnownNonPositiveInLoop functions
|
Revision tags: llvmorg-16.0.1 |
|
#
54539fa8 |
| 20-Mar-2023 |
Philip Reames <preames@rivosinc.com> |
[LSR/LFTR] Move two utilities to common code for reuse [nfc]
We're working on a transform in LSR which is essentially an inverse of LFTR (in certain sub-cases). Move utilities so that they can be reused.
|