ARMTargetTransformInfo.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp

Revision	Date	Author	Comments
# 9deee6bf	11-Aug-2023	Nikita Popov <npopov@redhat.com>	[SDAG] Don't transfer !range metadata without !noundef to SDAG (PR64589) D141386 changed the semantics of !range metadata to return poison on violation. If !range is combined with !noundef, violation is immediate UB instead, matching the old semantics. In theory, these IR semantics should also carry over into SDAG. In practice, DAGCombine has at least one key transform that is invalid in the presence of poison, namely the conversion of logical and/or to bitwise and/or (https://github.com/llvm/llvm-project/blob/c7b537bf0923df05254f9fa4722b298eb8f4790d/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L11252). Ideally, we would fix this transform, but this will require substantial work to avoid codegen regressions. In the meantime, avoid transferring !range metadata without !noundef, effectively restoring the old !range metadata semantics on the SDAG layer. Fixes https://github.com/llvm/llvm-project/issues/64589. Differential Revision: https://reviews.llvm.org/D157685
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 8de9f2b5	26-Jun-2023	Job Noorman <jnoorman@igalia.com>	Move SubtargetFeature.h from MC to TargetParser SubtargetFeature.h is currently part of MC while it doesn't depend on anything in MC. Since some LLVM components might have the need to work with target features without necessarily needing MC, it might be worthwhile to move SubtargetFeature.h to a different location. This will reduce the dependencies of said components. Note that I chose TargetParser as the destination because that's where Triple lives and SubtargetFeatures feels related to that. This issue came up during a JITLink review (D149522). JITLink would like to avoid a dependency on MC while still needing to store target features. Reviewed By: MaskRay, arsenm Differential Revision: https://reviews.llvm.org/D150549
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# c1221251	10-Apr-2023	NAKAMURA Takumi <geek4civic@gmail.com>	Restore CodeGen/MachineValueType.h from `Support` This is a rework of: - rG13e77db2df94 (r328395; MVT) Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h` can be restored as well. Depends on D148767 Differential Revision: https://reviews.llvm.org/D149024
# fb8038db	13-Apr-2023	Simon Pilgrim <llvm-dev@redking.me.uk>	[TTI] getExtendedReductionCost - replace std::optional<FastMathFlags> args with FastMathFlags Followup to D148149 where it was noticed that the std::optional wrapper wasn't helping with anything (we can just use an empty FastMathFlags()).
# a39d2d50	11-Apr-2023	David Green <david.green@arm.com>	[ARM] Increase the Scalarized cost of masked gather/scatter operations If a gather/scatter is masked and will need to be scalarized then the cost should be higher than we currently produce. An additional cost for scalarizing the mask, extracting i1s and branching on the result needs to be added, which this patch gives a cost of 5. Differential Revision: https://reviews.llvm.org/D147331
Revision tags: llvmorg-16.0.1
# b4089cfa	04-Apr-2023	David Sherwood <david.sherwood@arm.com>	[NFC][LoopVectorize] Simplify preferPredicateOverEpilogue interface Given just how many arguments we pass to preferPredicateOverEpilogue and considering this list may grow over time I've decided to pass in a pointer to a new TailFoldingInfo structure instead, similar to what we do with IntrinsicCostAttributes, etc. In addition, many of the arguments we pass in are actually available in the LoopVectorizationLegality class so I've managed to reduce the set of pointers that we need to pass in the TailFoldingInfo struct. Differential Revision: https://reviews.llvm.org/D146127
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# c41b41eb	30-Jan-2023	Sander de Smalen <sander.desmalen@arm.com>	[LoopVectorize] Use overflow-check analysis to improve tail-folding. This work follows on from D142109 and addresses a possible regression when we know the loop iteration counter cannot overflow. When we know the overflow-check always evaluates to false, it's better to use the other style of tail folding where it assumes a runtime check was added, because that avoids having to calculate a modified trip-count. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D142894
# a8cd35c3	16-Feb-2023	Simon Tatham <simon.tatham@arm.com>	[LowerTypeTests] Support generating Armv6-M jump tables. (reland) [Originally committed as f6ddf7781471b71243fa3c3ae7c93073f95c7dff; reverted in bbef38352fbade9e014ec97d5991da5dee306da7 due to test breakage; now relanded with the Arm tests conditioned on `arm-registered-target`] The LowerTypeTests pass emits a jump table in the form of an `inlineasm` IR node containing a string representation of some assembly. It tests the target triple to see what architecture it should be generating assembly for. But that's not good enough for `Triple::thumb`, because the 32-bit PC-relative `b.w` branch instruction isn't available in all supported architecture versions. In particular, Armv6-M doesn't support that instruction (although the similar Armv8-M Baseline does). Most of this patch is concerned with working out whether the compilation target is Armv6-M or not, which I'm doing by going through all the functions in the module, retrieving a TargetTransformInfo for each one, and querying it via a new method I've added to check its SubtargetInfo. If any function's TTI indicates that it's targeting an architecture supporting B.W, then we assume we're also allowed to use B.W in the jump table. The Armv6-M compatible jump table format requires a temporary register, and therefore also has to use the stack in order to restore that register. Another consequence of this change is that jump tables on Arm/Thumb are no longer always the same size. In particular, on an architecture that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb tables are different sizes from //each other//. As a consequence, ``getJumpTableEntrySize`` can no longer base its answer on the target triple's architecture: it has to take into account the decision that ``selectJumpTableArmEncoding`` made, which meant I had to move that function to an earlier point in the code and store its answer in the ``LowerTypeTestsModule`` class. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D143576
# 397265d8	20-Feb-2023	Kazu Hirata <kazu@google.com>	[llvm] Use APInt::isAllOnes instead of isAllOnesValue (NFC) Note that isAllOnesValue has been soft-deprecated in favor of isAllOnes.
# bbef3835	16-Feb-2023	Simon Tatham <simon.tatham@arm.com>	Revert "[LowerTypeTests] Support generating Armv6-M jump tables." This reverts commit f6ddf7781471b71243fa3c3ae7c93073f95c7dff. Eight buildbots reported that the two test files changed by that commit had started failing. The buildbots in question all had in common that they build with a very restricted `LLVM_TARGETS_TO_BUILD`, such as only X86 or AArch64 or Hexagon. I didn't notice this before commit because my own build has the full default set of targets, and in that circumstance, the tests pass. I assume the problem has something to do with the attempt to query TargetTransformInfo: if you can't make a valid TTI for the target triple then you can't ask it what kind of inline assembler you should be emitting, and so `opt` without the Arm backend can't get the Arm cases of these tests right. I don't have time to fix this until next week, so I'll revert the change for now to keep the buildbots happy.
# f6ddf778	16-Feb-2023	Simon Tatham <simon.tatham@arm.com>	[LowerTypeTests] Support generating Armv6-M jump tables. The LowerTypeTests pass emits a jump table in the form of an `inlineasm` IR node containing a string representation of some assembly. It tests the target triple to see what architecture it should be generating assembly for. But that's not good enough for `Triple::thumb`, because the 32-bit PC-relative `b.w` branch instruction isn't available in all supported architecture versions. In particular, Armv6-M doesn't support that instruction (although the similar Armv8-M Baseline does). Most of this patch is concerned with working out whether the compilation target is Armv6-M or not, which I'm doing by going through all the functions in the module, retrieving a TargetTransformInfo for each one, and querying it via a new method I've added to check its SubtargetInfo. If any function's TTI indicates that it's targeting an architecture supporting B.W, then we assume we're also allowed to use B.W in the jump table. The Armv6-M compatible jump table format requires a temporary register, and therefore also has to use the stack in order to restore that register. Another consequence of this change is that jump tables on Arm/Thumb are no longer always the same size. In particular, on an architecture that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb tables are different sizes from //each other//. As a consequence, ``getJumpTableEntrySize`` can no longer base its answer on the target triple's architecture: it has to take into account the decision that ``selectJumpTableArmEncoding`` made, which meant I had to move that function to an earlier point in the code and store its answer in the ``LowerTypeTestsModule`` class. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D143576
# 00531139	03-Feb-2023	Sander de Smalen <sander.desmalen@arm.com>	[LoopVectorize][TTI] NFCI: Clarify enum for the tail folding style. This NFC (intended) patch has several small changes: * It renames PredicationStyle to TailFoldingStyle. * It renames TTI.emitActiveLaneMask() to TTI.getPreferredTailFoldingStyle() * Simplifies some of its uses in the LoopVectorizer Rationale: To my surprise PredicationStyle::None did not mean 'no predication', but rather 'no active lane mask intrinsic', such that the predicate is created using a splat + compare with stepvector. The enum is also highly specific to tail folding, so it seems better to name this around that feature, i.e. 'tail folding style'. This also makes it more amenable to extend it to other tail folding styles, such as the one added in D142109. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D142887
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 5fb3a57e	21-Jan-2023	ShihPo Hung <shihpo.hung@sifive.com>	[Cost] Add CostKind to getVectorInstrCost and its related users LoopUnroll estimates the loop size via getInstructionCost(), but getInstructionCost() cannot pass CostKind to getVectorInstrCost(). Likewise, getShuffleCost() cannot pass CostKind to getBroadcastShuffleOverhead(), getPermuteShuffleOverhead(), getExtractSubvectorOverhead(), and getInsertSubvectorOverhead(). To address this, this patch adds an argument CostKind to these functions. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D142116
Revision tags: llvmorg-15.0.7
# 8fd5558b	11-Jan-2023	Guillaume Chatelet <gchatelet@google.com>	[NFC] Use TypeSize::getFixedValue() instead of TypeSize::getFixedSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
# 9b5f6268	21-Dec-2022	Alexey Bataev <a.bataev@outlook.com>	[SLP]Fix cost of the broadcast buildvector/gather. Need to include the cost of the initial insertelement to the cost of the broadcasts. Also, need to adjust the cost of the gather/buildvector if the element is inserted into poison/undef vector. Differential Revision: https://reviews.llvm.org/D140498
# 86fe4dfd	02-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	TargetTransformInfo: convert Optional to std::optional Recommit: added missing "#include <cstdint>".
# 4e12d183	02-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	Revert "TargetTransformInfo: convert Optional to std::optional" This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85. Some buildbots are failing.
# b8371124	02-Dec-2022	Krzysztof Parzyszek <kparzysz@quicinc.com>	TargetTransformInfo: convert Optional to std::optional
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# de6dfbbb	14-Oct-2022	David Green <david.green@arm.com>	[ARM] Fix for MVE i128 vector icmp costs. We were hitting an assert as the legalized type needn't be a vector. Fixes #58364
Revision tags: working, llvmorg-15.0.2
# f6d110e2	27-Sep-2022	Philip Reames <preames@rivosinc.com>	[LAA] Make getPtrStride return Option instead of overloading zero as error value [nfc] This is purely NFC restructure in advance of a change which actually exposes zero strides. This is mostly because I find this interface confusing each time I look at it.
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0
# 2833760c	29-Aug-2022	Kazu Hirata <kazu@google.com>	[Target] Qualify auto in range-based for loops (NFC)
Revision tags: llvmorg-15.0.0-rc3
# df20ff9a	23-Aug-2022	Philip Reames <preames@rivosinc.com>	[TTI] Kill last couple uses of OperandValueKind in targets [nfc] Use the accessor methods on the containing class instead so that we can change the representation.
# c9608d57	22-Aug-2022	Philip Reames <preames@rivosinc.com>	[TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC] This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
# 104fa367	20-Aug-2022	Philip Reames <preames@rivosinc.com>	[TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC] This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both. This is the change which motivated the whole sequence which preceded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through. I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possibly affected must show up in the diff. This let me audit each one closely.
# 5263155d	21-Aug-2022	Simon Pilgrim <llvm-dev@redking.me.uk>	[CostModel] Add CostKind argument to getShuffleCost Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future. Differential Revision: https://reviews.llvm.org/D132287