History log of /llvm-project/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp (Results 26 – 50 of 352)
Revision Date Author Comments
# 9deee6bf 11-Aug-2023 Nikita Popov <npopov@redhat.com>

[SDAG] Don't transfer !range metadata without !noundef to SDAG (PR64589)

D141386 changed the semantics of !range metadata to return poison
on violation. If !range is combined with !noundef, violation is
immediate UB instead, matching the old semantics.

In theory, these IR semantics should also carry over into SDAG.
In practice, DAGCombine has at least one key transform that is
invalid in the presence of poison, namely the conversion of logical
and/or to bitwise and/or (https://github.com/llvm/llvm-project/blob/c7b537bf0923df05254f9fa4722b298eb8f4790d/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L11252).
Ideally, we would fix this transform, but this will require
substantial work to avoid codegen regressions.

In the meantime, avoid transferring !range metadata without
!noundef, effectively restoring the old !range metadata semantics
on the SDAG layer.

Fixes https://github.com/llvm/llvm-project/issues/64589.

Differential Revision: https://reviews.llvm.org/D157685



Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 8de9f2b5 26-Jun-2023 Job Noorman <jnoorman@igalia.com>

Move SubtargetFeature.h from MC to TargetParser

SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.

Note that I chose TargetParser as the destination because that's where
Triple lives and SubtargetFeatures feels related to that.

This issue came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.

Reviewed By: MaskRay, arsenm

Differential Revision: https://reviews.llvm.org/D150549



Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# c1221251 10-Apr-2023 NAKAMURA Takumi <geek4civic@gmail.com>

Restore CodeGen/MachineValueType.h from `Support`

This is a rework of:

- rG13e77db2df94 (r328395; MVT)

Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h`
can be restored as well.

Depends on D148767

Differential Revision: https://reviews.llvm.org/D149024



# fb8038db 13-Apr-2023 Simon Pilgrim <llvm-dev@redking.me.uk>

[TTI] getExtendedReductionCost - replace std::optional<FastMathFlags> args with FastMathFlags

Followup to D148149 where it was noticed that the std::optional wrapper wasn't helping with anything (we can just use an empty FastMathFlags()).



# a39d2d50 11-Apr-2023 David Green <david.green@arm.com>

[ARM] Increase the Scalarized cost of masked gather/scatter operations

If a gather/scatter is masked and will need to be scalarized then the cost
should be higher than we currently produce. An additional cost for scalarizing
the mask, extracting i1s and branching on the result needs to be added, which
this patch gives a cost of 5.

Differential Revision: https://reviews.llvm.org/D147331



Revision tags: llvmorg-16.0.1
# b4089cfa 04-Apr-2023 David Sherwood <david.sherwood@arm.com>

[NFC][LoopVectorize] Simplify preferPredicateOverEpilogue interface

Given just how many arguments we pass to
preferPredicateOverEpilogue and considering this list may
grow over time I've decided to pass in a pointer to a new
TailFoldingInfo structure instead, similar to what we do
with IntrinsicCostAttributes, etc. In addition, many of the
arguments we pass in are actually available in the
LoopVectorizationLegality class so I've managed to
reduce the set of pointers that we need to pass in the
TailFoldingInfo struct.

Differential Revision: https://reviews.llvm.org/D146127
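
The parameter-object idea described in this message can be sketched in isolation. This is a minimal illustration, not the LLVM code: every name ending in `Sketch`, the fields, and the trip-count threshold are hypothetical and chosen only to show how one pointer to a bundle replaces a long argument list.

```cpp
#include <cassert>

// Hypothetical stand-in for the legality analysis; the real LLVM class
// carries far more state.
struct LegalitySketch {
  bool CanFoldTail; // assumed flag: is tail folding legal for this loop?
};

// Parameter object in the spirit of TailFoldingInfo: callers fill in one
// bundle, and adding a field later does not change the hook's signature.
struct TailFoldingInfoSketch {
  const LegalitySketch *Legal;
  unsigned TripCountHint; // assumed field, invented for illustration
};

// Hypothetical target hook. The policy (fold only short loops) is made up;
// the point is the single-pointer interface.
bool preferPredicateOverEpilogueSketch(const TailFoldingInfoSketch *TFI) {
  assert(TFI && TFI->Legal && "caller must supply a populated bundle");
  return TFI->Legal->CanFoldTail && TFI->TripCountHint < 64;
}
```

A caller would build `TailFoldingInfoSketch` once and pass its address to every target hook that needs it.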



Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# c41b41eb 30-Jan-2023 Sander de Smalen <sander.desmalen@arm.com>

[LoopVectorize] Use overflow-check analysis to improve tail-folding.

This work follows on from D142109 and addresses a possible regression
when we know the loop iteration counter cannot overflow.

When we know the overflow-check always evaluates to false, it's better to
use the other style of tail folding where it assumes a runtime check was
added, because that avoids having to calculate a modified trip-count.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D142894



# a8cd35c3 16-Feb-2023 Simon Tatham <simon.tatham@arm.com>

[LowerTypeTests] Support generating Armv6-M jump tables. (reland)

[Originally committed as f6ddf7781471b71243fa3c3ae7c93073f95c7dff;
reverted in bbef38352fbade9e014ec97d5991da5dee306da7 due to test
breakage; now relanded with the Arm tests conditioned on
`arm-registered-target`]

The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).

Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.

The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.

Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D143576
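
The module-wide decision described above ("if any function supports B.W, the jump table may use it") can be sketched as follows. This is an assumption-laden illustration rather than the pass's code: `FunctionSketch`, `JumpTableStyle`, and `selectThumbJumpTableStyle` are hypothetical names, and the per-function flag stands in for the TargetTransformInfo query the message mentions. How the real pass treats a module with no functions is not stated in the message; the sketch falls back to the Armv6-M form in that case.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-function summary; in the patch the answer comes from a
// TargetTransformInfo query on each function's subtarget.
struct FunctionSketch {
  bool HasWideBranch; // true if this function's target supports B.W
};

enum class JumpTableStyle { ThumbWideBranch, ThumbV6M };

// If any function may use B.W, assume the jump table may as well;
// otherwise use the Armv6-M compatible form.
JumpTableStyle
selectThumbJumpTableStyle(const std::vector<FunctionSketch> &Fns) {
  bool AnyWide =
      std::any_of(Fns.begin(), Fns.end(),
                  [](const FunctionSketch &F) { return F.HasWideBranch; });
  return AnyWide ? JumpTableStyle::ThumbWideBranch : JumpTableStyle::ThumbV6M;
}
```

Because the chosen style also determines the entry size, the result has to be computed once, early, and stored, which matches the message's note about moving the selection earlier in the code.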



# 397265d8 20-Feb-2023 Kazu Hirata <kazu@google.com>

[llvm] Use APInt::isAllOnes instead of isAllOnesValue (NFC)

Note that isAllOnesValue has been soft-deprecated in favor of
isAllOnes.


# bbef3835 16-Feb-2023 Simon Tatham <simon.tatham@arm.com>

Revert "[LowerTypeTests] Support generating Armv6-M jump tables."

This reverts commit f6ddf7781471b71243fa3c3ae7c93073f95c7dff.

Eight buildbots reported that the two test files changed by that
commit had started failing. The buildbots in question all had in
common that they build with a very restricted `LLVM_TARGETS_TO_BUILD`,
such as only X86 or AArch64 or Hexagon. I didn't notice this before
commit because my own build has the full default set of targets, and
in that circumstance, the tests pass.

I assume the problem has something to do with the attempt to query
TargetTransformInfo: if you can't make a valid TTI for the target
triple then you can't ask it what kind of inline assembler you should
be emitting, and so `opt` without the Arm backend can't get the Arm
cases of these tests right.

I don't have time to fix this until next week, so I'll revert the
change for now to keep the buildbots happy.



# f6ddf778 16-Feb-2023 Simon Tatham <simon.tatham@arm.com>

[LowerTypeTests] Support generating Armv6-M jump tables.

The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).

Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.

The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.

Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D143576



# 00531139 03-Feb-2023 Sander de Smalen <sander.desmalen@arm.com>

[LoopVectorize][TTI] NFCI: Clarify enum for the tail folding style.

This NFC (intended) patch has several small changes:
* It renames PredicationStyle to TailFoldingStyle.
* It renames TTI.emitActiveLaneMask() to TTI.getPreferredTailFoldingStyle()
* Simplifies some of its uses in the LoopVectorizer

Rationale: To my surprise PredicationStyle::None did not mean 'no
predication', but rather 'no active lane mask intrinsic', such that the
predicate is created using a splat + compare with stepvector. The enum is
also highly specific to tail folding, so it seems better to name this
around that feature, i.e. 'tail folding style'.

This also makes it more amenable to extend it to other tail folding styles,
such as the one added in D142109.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D142887



Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 5fb3a57e 21-Jan-2023 ShihPo Hung <shihpo.hung@sifive.com>

[Cost] Add CostKind to getVectorInstrCost and its related users

LoopUnroll estimates the loop size via getInstructionCost(),
but getInstructionCost() cannot pass CostKind to getVectorInstrCost().
Likewise, getShuffleCost() cannot pass CostKind to
getBroadcastShuffleOverhead(), getPermuteShuffleOverhead(),
getExtractSubvectorOverhead(), and getInsertSubvectorOverhead().

To address this, this patch adds an argument CostKind to these
functions.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D142116



Revision tags: llvmorg-15.0.7
# 8fd5558b 11-Jan-2023 Guillaume Chatelet <gchatelet@google.com>

[NFC] Use TypeSize::getFixedValue() instead of TypeSize::getFixedSize()

This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.


# 9b5f6268 21-Dec-2022 Alexey Bataev <a.bataev@outlook.com>

[SLP]Fix cost of the broadcast buildvector/gather.

Need to include the cost of the initial insertelement in the cost of the
broadcasts. Also, need to adjust the cost of the gather/buildvector if
the element is inserted into a poison/undef vector.

Differential Revision: https://reviews.llvm.org/D140498



# 86fe4dfd 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

TargetTransformInfo: convert Optional to std::optional

Recommit: added missing "#include <cstdint>".


# 4e12d183 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

Revert "TargetTransformInfo: convert Optional to std::optional"

This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.

Some buildbots are failing.


# b8371124 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

TargetTransformInfo: convert Optional to std::optional


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# de6dfbbb 14-Oct-2022 David Green <david.green@arm.com>

[ARM] Fix for MVE i128 vector icmp costs.

We were hitting an assert as the legalized type needn't be a vector.

Fixes #58364


Revision tags: working, llvmorg-15.0.2
# f6d110e2 27-Sep-2022 Philip Reames <preames@rivosinc.com>

[LAA] Make getPtrStride return Option instead of overloading zero as error value [nfc]

This is a purely NFC restructure in advance of a change which actually exposes zero strides. This is mostly because I find this interface confusing each time I look at it.
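
The interface problem this message describes, a zero return value doing double duty as an error code, can be shown with a small standalone sketch. Nothing here is LLVM code: `AccessSketch`, `strideOrZero`, and `strideOrNone` are hypothetical names used only to contrast the two return conventions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a pointer access whose stride may or may not
// be provable.
struct AccessSketch {
  bool StrideKnown;    // could the analysis prove a constant stride?
  int64_t StrideElems; // stride in elements; meaningful only if StrideKnown
};

// Old convention: 0 doubles as "unknown", so a genuine zero stride cannot
// be told apart from a failed analysis.
int64_t strideOrZero(const AccessSketch &A) {
  return A.StrideKnown ? A.StrideElems : 0;
}

// New convention: std::nullopt means "unknown"; 0 becomes a legitimate
// answer.
std::optional<int64_t> strideOrNone(const AccessSketch &A) {
  if (!A.StrideKnown)
    return std::nullopt;
  return A.StrideElems;
}
```

With the optional form, callers must test for a value before using it, which is what makes it safe for a later change to start reporting real zero strides.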



Revision tags: llvmorg-15.0.1, llvmorg-15.0.0
# 2833760c 29-Aug-2022 Kazu Hirata <kazu@google.com>

[Target] Qualify auto in range-based for loops (NFC)


Revision tags: llvmorg-15.0.0-rc3
# df20ff9a 23-Aug-2022 Philip Reames <preames@rivosinc.com>

[TTI] Kill last couple uses of OperandValueKind in targets [nfc]

Use the accessor methods on the containing class instead so that we can change the representation.


# c9608d57 22-Aug-2022 Philip Reames <preames@rivosinc.com>

[TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]

This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.



# 104fa367 20-Aug-2022 Philip Reames <preames@rivosinc.com>

[TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]

This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.

This is the change which motivated the whole sequence which preceded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through.

I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possibly affected must show up in the diff. This let me audit each one closely.
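
The container idea, and the hazard of silently forwarding the power-of-two fact, can be illustrated with a standalone sketch. All names below (`ValueKindSketch`, `ValuePropsSketch`, `OperandInfoSketch`, `divCostSketch`) and the cost numbers are hypothetical; they mirror the shape of the API described here, not its actual definition.

```cpp
#include <cassert>

// Hypothetical mirrors of the two enums that used to be passed separately.
enum class ValueKindSketch { AnyValue, UniformConstant, NonUniformConstant };
enum class ValuePropsSketch { None, PowerOf2 };

// Single container: the kind and the property always travel together.
struct OperandInfoSketch {
  ValueKindSketch Kind;
  ValuePropsSketch Props;

  bool isConstant() const {
    return Kind == ValueKindSketch::UniformConstant ||
           Kind == ValueKindSketch::NonUniformConstant;
  }
  bool isPowerOf2() const { return Props == ValuePropsSketch::PowerOf2; }

  // Explicitly drop the property, for call sites that must not rely on it.
  OperandInfoSketch withoutProps() const {
    OperandInfoSketch Copy = *this;
    Copy.Props = ValuePropsSketch::None;
    return Copy;
  }
};

// Hypothetical cost hook with made-up numbers: a divide by a constant
// power of two is modeled as a cheap shift.
unsigned divCostSketch(OperandInfoSketch Op2Info) {
  return (Op2Info.isConstant() && Op2Info.isPowerOf2()) ? 1 : 20;
}
```

A call site that previously dropped the property would now have to say so explicitly, for example `divCostSketch(Info.withoutProps())`, which is the kind of difference the renaming strategy in the message was meant to surface in the diff.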



# 5263155d 21-Aug-2022 Simon Pilgrim <llvm-dev@redking.me.uk>

[CostModel] Add CostKind argument to getShuffleCost

Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or were just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.

Differential Revision: https://reviews.llvm.org/D132287


