#
9deee6bf |
| 11-Aug-2023 |
Nikita Popov <npopov@redhat.com> |
[SDAG] Don't transfer !range metadata without !noundef to SDAG (PR64589)
D141386 changed the semantics of !range metadata to return poison on violation. If !range is combined with !noundef, violation is immediate UB instead, matching the old semantics.
In theory, these IR semantics should also carry over into SDAG. In practice, DAGCombine has at least one key transform that is invalid in the presence of poison, namely the conversion of logical and/or to bitwise and/or (https://github.com/llvm/llvm-project/blob/c7b537bf0923df05254f9fa4722b298eb8f4790d/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L11252). Ideally, we would fix this transform, but this will require substantial work to avoid codegen regressions.
In the meantime, avoid transferring !range metadata without !noundef, effectively restoring the old !range metadata semantics on the SDAG layer.
Fixes https://github.com/llvm/llvm-project/issues/64589.
Differential Revision: https://reviews.llvm.org/D157685
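The poison problem described above can be illustrated with a toy model. This is a minimal sketch, not LLVM code: `PoisonableBool`, `logicalAnd`, and `bitwiseAnd` are hypothetical names, and `std::nullopt` merely stands in for a poison i1 value.

```cpp
#include <cassert>
#include <optional>

// Toy model: std::nullopt stands in for a poison i1 value.
// This is an illustration only, not how LLVM represents poison.
using PoisonableBool = std::optional<bool>;

// Logical and, i.e. `select A, B, false`: B is only observed when A is true.
PoisonableBool logicalAnd(PoisonableBool A, PoisonableBool B) {
  if (!A)
    return std::nullopt; // a poison condition poisons the select
  return *A ? B : PoisonableBool(false);
}

// Bitwise and: poison in either operand poisons the result.
PoisonableBool bitwiseAnd(PoisonableBool A, PoisonableBool B) {
  if (!A || !B)
    return std::nullopt;
  return *A && *B;
}

// The problematic case: A is false and B is poison.
void demonstrateMismatch() {
  PoisonableBool False = false, Poison = std::nullopt;
  assert(logicalAnd(False, Poison) == PoisonableBool(false)); // well defined
  assert(!bitwiseAnd(False, Poison).has_value());             // now poison
}
```

In this model, rewriting the logical form into the bitwise form turns a well-defined `false` into poison, which is why the rewrite is only sound when the second operand is known not to be poison.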
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
8de9f2b5 |
| 26-Jun-2023 |
Job Noorman <jnoorman@igalia.com> |
Move SubtargetFeature.h from MC to TargetParser
SubtargetFeature.h is currently part of MC while it doesn't depend on anything in MC. Since some LLVM components might have the need to work with target features without necessarily needing MC, it might be worthwhile to move SubtargetFeature.h to a different location. This will reduce the dependencies of said components.
Note that I chose TargetParser as the destination because that's where Triple lives and SubtargetFeatures feels related to that.
This issue came up during a JITLink review (D149522). JITLink would like to avoid a dependency on MC while still needing to store target features.
Reviewed By: MaskRay, arsenm
Differential Revision: https://reviews.llvm.org/D150549
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
#
c1221251 |
| 10-Apr-2023 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Restore CodeGen/MachineValueType.h from `Support`
This is a rework of:
- rG13e77db2df94 (r328395; MVT)
Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h` can be restored as well.
Depends on D148767
Differential Revision: https://reviews.llvm.org/D149024
|
#
fb8038db |
| 13-Apr-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[TTI] getExtendedReductionCost - replace std::optional<FastMathFlags> args with FastMathFlags
Followup to D148149 where it was noticed that the std::optional wrapper wasn't helping with anything (we can just use an empty FastMathFlags()).
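A minimal sketch of why the wrapper adds nothing, assuming a hypothetical `Flags` type whose default-constructed value means "no flags set" (it is only loosely modeled on FastMathFlags, and the bit layout is invented):

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a fast-math flag set. A default-constructed
// value means "no flags set". The bit assignment is made up for the sketch.
struct Flags {
  static constexpr unsigned ReassocBit = 1u;
  unsigned Bits = 0;
};

// Before: callers pass std::nullopt to mean "no flags".
bool allowsReassocOld(std::optional<Flags> FMF) {
  return FMF && (FMF->Bits & Flags::ReassocBit) != 0;
}

// After: an empty Flags() carries exactly the same information.
bool allowsReassocNew(Flags FMF = Flags()) {
  return (FMF.Bits & Flags::ReassocBit) != 0;
}

// Both spellings of "no flags" answer the same way.
void checkEquivalence() {
  assert(allowsReassocOld(std::nullopt) == allowsReassocNew());
  assert(allowsReassocOld(Flags{}) == allowsReassocNew(Flags{}));
}
```

In this sketch, `std::nullopt` and an engaged-but-empty value are indistinguishable to the callee, so the optional only adds a second way to say the same thing.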
|
#
a39d2d50 |
| 11-Apr-2023 |
David Green <david.green@arm.com> |
[ARM] Increase the Scalarized cost of masked gather/scatter operations
If a gather/scatter is masked and will need to be scalarized then the cost should be higher than we currently produce. An additional cost for scalarizing the mask, extracting i1s and branching on the result needs to be added, which this patch gives a cost of 5.
Differential Revision: https://reviews.llvm.org/D147331
|
Revision tags: llvmorg-16.0.1 |
|
#
b4089cfa |
| 04-Apr-2023 |
David Sherwood <david.sherwood@arm.com> |
[NFC][LoopVectorize] Simplify preferPredicateOverEpilogue interface
Given just how many arguments we pass to preferPredicateOverEpilogue and considering this list may grow over time I've decided to pass in a pointer to a new TailFoldingInfo structure instead, similar to what we do with IntrinsicCostAttributes, etc. In addition, many of the arguments we pass in are actually available in the LoopVectorizationLegality class so I've managed to reduce the set of pointers that we need to pass in the TailFoldingInfo struct.
Differential Revision: https://reviews.llvm.org/D146127
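The general shape of this refactoring, bundling arguments into one struct passed by pointer, might look like the sketch below. All names, fields, and the decision policy are hypothetical; the real TailFoldingInfo and LoopVectorizationLegality carry different and far richer state.

```cpp
#include <cassert>

// Hypothetical stand-in for a legality analysis result.
struct LegalitySketch {
  bool HasUnsupportedMemOps;
};

// Parameter object in the spirit of TailFoldingInfo. Field names are assumed.
struct TailFoldingInfoSketch {
  unsigned TripCount;
  const LegalitySketch *Legality;
};

// One pointer replaces a long argument list; adding a field later does not
// change this signature.
bool preferPredicateOverEpilogueSketch(const TailFoldingInfoSketch *TFI) {
  assert(TFI && TFI->Legality && "caller must supply both objects");
  if (TFI->Legality->HasUnsupportedMemOps)
    return false;
  return TFI->TripCount < 16; // arbitrary policy, for illustration only
}
```

Anything reachable through the legality object no longer needs its own parameter, which is the reduction in pointers the message describes.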
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
c41b41eb |
| 30-Jan-2023 |
Sander de Smalen <sander.desmalen@arm.com> |
[LoopVectorize] Use overflow-check analysis to improve tail-folding.
This work follows on from D142109 and addresses a possible regression when we know the loop iteration counter cannot overflow.
When we know the overflow-check always evaluates to false, it's better to use the other style of tail folding where it assumes a runtime check was added, because that avoids having to calculate a modified trip-count.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D142894
|
#
a8cd35c3 |
| 16-Feb-2023 |
Simon Tatham <simon.tatham@arm.com> |
[LowerTypeTests] Support generating Armv6-M jump tables. (reland)
[Originally committed as f6ddf7781471b71243fa3c3ae7c93073f95c7dff; reverted in bbef38352fbade9e014ec97d5991da5dee306da7 due to test breakage; now relanded with the Arm tests conditioned on `arm-registered-target`]
The LowerTypeTests pass emits a jump table in the form of an `inlineasm` IR node containing a string representation of some assembly. It tests the target triple to see what architecture it should be generating assembly for. But that's not good enough for `Triple::thumb`, because the 32-bit PC-relative `b.w` branch instruction isn't available in all supported architecture versions. In particular, Armv6-M doesn't support that instruction (although the similar Armv8-M Baseline does).
Most of this patch is concerned with working out whether the compilation target is Armv6-M or not, which I'm doing by going through all the functions in the module, retrieving a TargetTransformInfo for each one, and querying it via a new method I've added to check its SubtargetInfo. If any function's TTI indicates that it's targeting an architecture supporting B.W, then we assume we're also allowed to use B.W in the jump table.
The Armv6-M compatible jump table format requires a temporary register, and therefore also has to use the stack in order to restore that register.
Another consequence of this change is that jump tables on Arm/Thumb are no longer always the same size. In particular, on an architecture that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb tables are different sizes from each other. As a consequence, `getJumpTableEntrySize` can no longer base its answer on the target triple's architecture: it has to take into account the decision that `selectJumpTableArmEncoding` made, which meant I had to move that function to an earlier point in the code and store its answer in the `LowerTypeTestsModule` class.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D143576
|
#
397265d8 |
| 20-Feb-2023 |
Kazu Hirata <kazu@google.com> |
[llvm] Use APInt::isAllOnes instead of isAllOnesValue (NFC)
Note that isAllOnesValue has been soft-deprecated in favor of isAllOnes.
|
#
bbef3835 |
| 16-Feb-2023 |
Simon Tatham <simon.tatham@arm.com> |
Revert "[LowerTypeTests] Support generating Armv6-M jump tables."
This reverts commit f6ddf7781471b71243fa3c3ae7c93073f95c7dff.
Eight buildbots reported that the two test files changed by that commit had started failing. The buildbots in question all had in common that they build with a very restricted `LLVM_TARGETS_TO_BUILD`, such as only X86 or AArch64 or Hexagon. I didn't notice this before commit because my own build has the full default set of targets, and in that circumstance, the tests pass.
I assume the problem has something to do with the attempt to query TargetTransformInfo: if you can't make a valid TTI for the target triple then you can't ask it what kind of inline assembler you should be emitting, and so `opt` without the Arm backend can't get the Arm cases of these tests right.
I don't have time to fix this until next week, so I'll revert the change for now to keep the buildbots happy.
|
#
f6ddf778 |
| 16-Feb-2023 |
Simon Tatham <simon.tatham@arm.com> |
[LowerTypeTests] Support generating Armv6-M jump tables.
The LowerTypeTests pass emits a jump table in the form of an `inlineasm` IR node containing a string representation of some assembly. It tests the target triple to see what architecture it should be generating assembly for. But that's not good enough for `Triple::thumb`, because the 32-bit PC-relative `b.w` branch instruction isn't available in all supported architecture versions. In particular, Armv6-M doesn't support that instruction (although the similar Armv8-M Baseline does).
Most of this patch is concerned with working out whether the compilation target is Armv6-M or not, which I'm doing by going through all the functions in the module, retrieving a TargetTransformInfo for each one, and querying it via a new method I've added to check its SubtargetInfo. If any function's TTI indicates that it's targeting an architecture supporting B.W, then we assume we're also allowed to use B.W in the jump table.
The Armv6-M compatible jump table format requires a temporary register, and therefore also has to use the stack in order to restore that register.
Another consequence of this change is that jump tables on Arm/Thumb are no longer always the same size. In particular, on an architecture that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb tables are different sizes from each other. As a consequence, `getJumpTableEntrySize` can no longer base its answer on the target triple's architecture: it has to take into account the decision that `selectJumpTableArmEncoding` made, which meant I had to move that function to an earlier point in the code and store its answer in the `LowerTypeTestsModule` class.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D143576
|
#
00531139 |
| 03-Feb-2023 |
Sander de Smalen <sander.desmalen@arm.com> |
[LoopVectorize][TTI] NFCI: Clarify enum for the tail folding style.
This NFC (intended) patch has several small changes:
* It renames PredicationStyle to TailFoldingStyle.
* It renames TTI.emitActiveLaneMask() to TTI.getPreferredTailFoldingStyle().
* It simplifies some of its uses in the LoopVectorizer.
Rationale: To my surprise PredicationStyle::None did not mean 'no predication', but rather 'no active lane mask intrinsic', such that the predicate is created using a splat + compare with stepvector. The enum is also highly specific to tail folding, so it seems better to name this around that feature, i.e. 'tail folding style'.
This also makes it more amenable to extend it to other tail folding styles, such as the one added in D142109.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D142887
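To make "splat + compare with stepvector" concrete, here is a simplified scalar sketch of how such a predicate could be computed for one vector iteration. The width `VF`, the function name, and the use of the trip count as the limit are assumptions for illustration; the vectorizer's actual comparison and its overflow handling differ in detail.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t VF = 4; // assumed fixed vector width for the sketch

// Lane predicate for the vector iteration starting at scalar index `Index`:
// lane i is active iff Index + i < TripCount. Overflow of Index + i is
// ignored here.
std::array<bool, VF> tailFoldMask(std::uint64_t Index, std::uint64_t TripCount) {
  assert(Index < TripCount && "a vector iteration only runs while work remains");
  std::array<bool, VF> Mask{};
  for (std::size_t Lane = 0; Lane < VF; ++Lane) {
    std::uint64_t Step = Lane;       // stepvector element: 0, 1, 2, ...
    std::uint64_t Limit = TripCount; // splat(TripCount) element
    Mask[Lane] = Index + Step < Limit;
  }
  return Mask;
}
```

With a trip count of 6, the iteration starting at index 4 has only its first two lanes active, so the loop is still predicated even though no active-lane-mask intrinsic is involved.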
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
5fb3a57e |
| 21-Jan-2023 |
ShihPo Hung <shihpo.hung@sifive.com> |
[Cost] Add CostKind to getVectorInstrCost and its related users
LoopUnroll estimates the loop size via getInstructionCost(), but getInstructionCost() cannot pass CostKind to getVectorInstrCost(). Likewise, getShuffleCost() cannot pass CostKind to getBroadcastShuffleOverhead(), getPermuteShuffleOverhead(), getExtractSubvectorOverhead(), and getInsertSubvectorOverhead().
To address this, this patch adds an argument CostKind to these functions.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D142116
|
Revision tags: llvmorg-15.0.7 |
|
#
8fd5558b |
| 11-Jan-2023 |
Guillaume Chatelet <gchatelet@google.com> |
[NFC] Use TypeSize::getFixedValue() instead of TypeSize::getFixedSize()
This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
|
#
9b5f6268 |
| 21-Dec-2022 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Fix cost of the broadcast buildvector/gather.
Need to include the cost of the initial insertelement to the cost of the broadcasts. Also, need to adjust the cost of the gather/buildvector if the element is inserted into poison/undef vector.
Differential Revision: https://reviews.llvm.org/D140498
|
#
86fe4dfd |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
Recommit: added missing "#include <cstdint>".
|
#
4e12d183 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
Revert "TargetTransformInfo: convert Optional to std::optional"
This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.
Some buildbots are failing.
|
#
b8371124 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3 |
|
#
de6dfbbb |
| 14-Oct-2022 |
David Green <david.green@arm.com> |
[ARM] Fix for MVE i128 vector icmp costs.
We were hitting an assert as the legalized type needn't be a vector.
Fixes #58364
|
Revision tags: working, llvmorg-15.0.2 |
|
#
f6d110e2 |
| 27-Sep-2022 |
Philip Reames <preames@rivosinc.com> |
[LAA] Make getPtrStride return Option instead of overloading zero as error value [nfc]
This is purely NFC restructure in advance of a change which actually exposes zero strides. This is mostly because I find this interface confusing each time I look at it.
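The interface problem can be sketched as follows. Both functions are hypothetical and only model the return-value convention, not the actual stride analysis.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Old convention (sketch): 0 doubles as "stride unknown", so a genuine
// zero stride cannot be reported.
std::int64_t strideOrZero(bool Known, std::int64_t Stride) {
  return Known ? Stride : 0;
}

// New convention (sketch): absence is explicit, so 0 is a valid answer.
std::optional<std::int64_t> strideOrNone(bool Known, std::int64_t Stride) {
  if (!Known)
    return std::nullopt;
  return Stride;
}

// A known zero stride and an unknown stride collide only in the old form.
void checkAmbiguity() {
  assert(strideOrZero(true, 0) == strideOrZero(false, 8));
  assert(strideOrNone(true, 0) != strideOrNone(false, 8));
}
```

Separating "no answer" from "the answer is zero" is what allows a later change to expose zero strides without touching every caller's error check.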
|
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0 |
|
#
2833760c |
| 29-Aug-2022 |
Kazu Hirata <kazu@google.com> |
[Target] Qualify auto in range-based for loops (NFC)
|
Revision tags: llvmorg-15.0.0-rc3 |
|
#
df20ff9a |
| 23-Aug-2022 |
Philip Reames <preames@rivosinc.com> |
[TTI] Kill last couple uses of OperandValueKind in targets [nfc]
Use the accessor methods on the containing class instead so that we can change the representation.
|
#
c9608d57 |
| 22-Aug-2022 |
Philip Reames <preames@rivosinc.com> |
[TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
|
#
104fa367 |
| 20-Aug-2022 |
Philip Reames <preames@rivosinc.com> |
[TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.
This is the change which motivated the whole sequence which preceded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through.
I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possibly affected must show up in the diff. This let me audit each one closely.
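A rough sketch of the container idea, with hypothetical enum and struct names (loosely modeled on the kind and properties pair described above) and made-up cost numbers:

```cpp
#include <cassert>

// Hypothetical, reduced versions of the two pieces of operand information.
enum class ValueKind { AnyValue, UniformConstant };
enum class ValueProps { None, PowerOf2 };

// Single container that keeps kind and properties together, so a call site
// cannot forward one and silently drop the other.
struct OperandInfoSketch {
  ValueKind Kind = ValueKind::AnyValue;
  ValueProps Props = ValueProps::None;

  bool isPowerOf2Constant() const {
    return Kind == ValueKind::UniformConstant && Props == ValueProps::PowerOf2;
  }
};

// Illustrative cost query: a divide by a power-of-two constant is assumed
// cheaper. The numbers 1 and 4 are arbitrary.
unsigned divideCostSketch(OperandInfoSketch Op2Info) {
  return Op2Info.isPowerOf2Constant() ? 1 : 4;
}

void checkCosts() {
  OperandInfoSketch Pow2{ValueKind::UniformConstant, ValueProps::PowerOf2};
  assert(divideCostSketch(Pow2) == 1);
  assert(divideCostSketch(OperandInfoSketch{}) == 4);
}
```

Because the parameter type differs from the old pair of enums, any call site that still passes the old arguments fails to compile, which is the kind of compiler assistance the message relies on.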
|
#
5263155d |
| 21-Aug-2022 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[CostModel] Add CostKind argument to getShuffleCost
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.
Differential Revision: https://reviews.llvm.org/D132287
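The "new argument with a default, not used yet" pattern can be sketched as below. The enum and function are hypothetical stand-ins, and the cost formula is invented purely so the example has something to return.

```cpp
#include <cassert>

// Hypothetical stand-in for the cost-kind enum.
enum class CostKindSketch { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// The new parameter defaults to RecipThroughput, so existing callers compile
// unchanged. The body ignores it for now, so results are identical for every
// kind.
unsigned getShuffleCostSketch(
    unsigned NumElts,
    CostKindSketch CostKind = CostKindSketch::RecipThroughput) {
  (void)CostKind; // plumbed through but not consulted yet
  return NumElts; // made-up cost: one unit per element
}

void checkNoBehaviorChange() {
  assert(getShuffleCostSketch(4) ==
         getShuffleCostSketch(4, CostKindSketch::Latency));
}
```

Once implementations start consulting the argument, callers that already pass an explicit kind get kind-specific costs without another signature change.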
|