VectorUtils.cpp - OpenGrok history log for /llvm-project/llvm/lib/Analysis/VectorUtils.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# bab7920f	13-Jan-2025	Alexey Bataev <a.bataev@outlook.com>	[RISCV][CG]Use processShuffleMasks for per-register shuffles Patch adds usage of processShuffleMasks in in codegen in lowerShuffleViaVRegSplitting. This function is already used for X86 shuffles estimations and in DAGTypeLegalizer::SplitVecRes_VECTOR_SHUFFLE functions, unifies the code. Reviewers: topperc, wangpc-pp, lukel97, preames Reviewed By: preames Pull Request: https://github.com/llvm/llvm-project/pull/121765
# 24bb180e	10-Jan-2025	Philip Reames <preames@rivosinc.com>	[RISCV] Attempt to widen SEW before generic shuffle lowering (#122311) This takes inspiration from AArch64 which does the same thing to assist with zip/trn/etc.. Doing this recursion unconditionally when the mask allows is slightly questionable, but seems to work out okay in practice. As a bit of context, it's helpful to realize that we have existing logic in both DAGCombine and InstCombine which mutates the element width of in an analogous manner. However, that code has two restriction which prevent it from handling the motivating cases here. First, it only triggers if there is a bitcast involving a different element type. Second, the matcher used considers a partially undef wide element to be a non-match. I considered trying to relax those assumptions, but the information loss for undef in mid-level opt seemed more likely to open a can of worms than I wanted.
# 45c01e8a	19-Dec-2024	Finn Plummer <50529406+inbelic@users.noreply.github.com>	[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635) - update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for all uses, to allow specifiction of target specific intrinsics - add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api - update TTI api to provide `isTargetIntrinsicWith...` functions and consistently name them - move `isTriviallyScalarizable` to VectorUtils - update all uses of the api and provide the TTI parameter Resolves #117030
Revision tags: llvmorg-19.1.6
# b759020c	11-Dec-2024	LiqinWeng <liqin.weng@spacemit.com>	[LV][EVL] Support cast instruction with EVL-vectorization (#108351)
# b9aa155d	06-Dec-2024	Alexey Bataev <a.bataev@outlook.com>	[TTI][X86]Fix detection of the shuffles from the second shuffle operand only If the shuffle mask uses only indices from the second shuffle operand, processShuffleMasks function misses it currently, which prevents correct cost estimation in this corner case. To fix this, need to raise the limit to 2 * VF rather than just VF and adjust processing correspondingly. Will allow future improvements for 2 sources permutations. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/118972
Revision tags: llvmorg-19.1.5
# 4a3f46de	28-Nov-2024	LiqinWeng <liqin.weng@spacemit.com>	[LV][EVL] Support call instruction with EVL-vectorization (#110412)
# 8663b877	21-Nov-2024	Finn Plummer <50529406+inbelic@users.noreply.github.com>	[NFC][VectorUtils][TargetTransformInfo] Add `isVectorIntrinsicWithOverloadTypeAtArg` api (#114849) This changes allows target intrinsics to specify and overwrite overloaded types. - Updates `ReplaceWithVecLib` to not provide TTI as there most probably won't be a use-case - Updates `SLPVectorizer` to use available TTI - Updates `VPTransformState` to pass down TTI - Updates `VPlanRecipe` to use passed-down TTI This change will let us add scalarization for `asdouble`: #114847
Revision tags: llvmorg-19.1.4
# 818d7159	09-Nov-2024	Tex Riddell <texr@microsoft.com>	[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs (#113637) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 - Return true for atan2 from isTriviallyVectorizable - Add atan2 to VecFuncs.def for massv and accelerate libraries. - Add atan2 to hasOptimizedCodeGen - Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite and update vectorization tests - Add atan2 name check to isLoweredToCall in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h - Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition. Part of: Implement the atan2 HLSL Function #70096.
# dfb60bb9	29-Oct-2024	Rohit Aggarwal <44664450+rohitaggarwal007@users.noreply.github.com>	Adding more vector calls for -fveclib=AMDLIBM (#109662) AMD has it's own implementation of vector calls. New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos Please refer [https://github.com/amd/aocl-libm-ose] --------- Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
Revision tags: llvmorg-19.1.3
# dcbf2c2c	21-Oct-2024	Farzon Lotfi <1802579+farzonl@users.noreply.github.com>	[Scalarizer][DirectX] support structs return types (#111569) Based on this RFC: https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306 LLVM intrinsics do not support out params. To get around this limitation implementers will make intrinsics return structs to capture a return type and an out param. This implementation detail should not impact scalarization since these cases should be elementwise operations. ## Three changes are needed. - The CallInst visitor needs to be updated to handle Structs - A new visitor is needed for `ExtractValue` instructions - finsh needs to be update to handle structs so that insert elements are properly propogated. ## Testing changes - Add support for `llvm.frexp` - Add support for `llvm.dx.splitdouble` fixes https://github.com/llvm/llvm-project/issues/111437
# 4ba1800b	16-Oct-2024	Amr Hesham <amr96@programmer.net>	[LLVM][NFC] Reduce copying of parameter in lambda (#110299) Reduce redundant copy parameter in lambda Fixes #95642
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1
# cc7b24a4	24-Sep-2024	Piotr Fusik <p.fusik@samsung.com>	[NFC] Fix typos in comments (#109765)
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4
# a156b5a4	02-Sep-2024	Yingwei Zheng <dtcxzyw2333@gmail.com>	[SLP] Add vectorization support for [u|s]cmp (#106747) This patch adds vectorization support for [u|s]cmp intrinsic calls.
# d58d105c	30-Aug-2024	Simon Pilgrim <llvm-dev@redking.me.uk>	[Analysis] isTriviallyVectorizable - add vectorization support for acos/asin/atan and cosh/sinh/tanh intrinsics (#106584) Show fallback cases in amdlibm tests where it doesn't have that specific op
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# b22fa909	16-Jul-2024	mskamp <msk@posteo.org>	[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429) Add KnownBits computations to ValueTracking and X86 DAG lowering. These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types. There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations. Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation. Fixes #82516.
# 2d209d96	27-Jun-2024	Nikita Popov <npopov@redhat.com>	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.
# d42b3926	26-Jun-2024	Nikita Popov <npopov@redhat.com>	[VectorUtils] Use SmallPtrSet::remove_if() (NFC)
# 5b4000dc	26-Jun-2024	Simon Pilgrim <llvm-dev@redking.me.uk>	[VectorUtils] Add llvm::scaleShuffleMaskElts wrapper for narrowShuffleMaskElts/widenShuffleMaskElts, NFC. (#96646) Using the target number of vector elements, scaleShuffleMaskElts will try to use narrowShuffleMaskElts/widenShuffleMaskElts to scale the shuffle mask accordingly. Working on #58895 I didn't want to create yet another case where we have to handle both re-scaling cases.
# 605e1847	24-Jun-2024	Nikita Popov <npopov@redhat.com>	[VectorUtils] Use poison instead of undef in findScalarElement() Out-of-range extractelement returns poison, and so do poison elements in the shufflevector mask.
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7
# 1d874335	05-Jun-2024	Farzon Lotfi <1802579+farzonl@users.noreply.github.com>	[x86] Add tan intrinsic part 4 (#90503) This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics. Which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 Much of this change was following how G_FSIN and G_FCOS were used. Changes: - `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN` opcode - `llvm/docs/LangRef.rst` - Document the tan intrinsic - `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan intrinsic as a vector function similar to the tanf libcall. - `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to `ISD::FTAN` - `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic - `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall mappings - `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN` Opcode - `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN` Opcode handler - `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map `G_FTAN` to `ftan` - `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`, `strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a vector intrinsic - `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic to `G_FTAN` Opcode - `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to the list of floating point math operations also associate `G_FTAN` with the `TAN_F` runtime lib. - `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math operation common behaviors. - llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp - List the function expansion operations for `FTAN` and `STRICT_FTAN`. Also define both opcodes in `PromoteNode`. - `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN` and `STRICT_FTAN` handling in the legalizer - `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define `SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an intrinsic that doesn't return NaN. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` Map `LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map `Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for `Intrinsic::tan`. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan` and `strict_ftan` names for the equivalent ISD opcodes. - `llvm/lib/CodeGen/TargetLoweringBase.cpp` -Define a Tan128 libcall and ISD::FTAN as a target lowering action. - `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for tan intrinsic resolves https://github.com/llvm/llvm-project/issues/70082
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5
# cf328ff9	24-Apr-2024	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[IR] Memory Model Relaxation Annotations (#78569) Implements the core/target-agnostic components of Memory Model Relaxation Annotations. RFC: https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# a1a590ef	05-Mar-2024	Yingwei Zheng <dtcxzyw2333@gmail.com>	[InstCombine] Fix miscompilation in PR83947 (#83993) https://github.com/llvm/llvm-project/blob/762f762504967efbe159db5c737154b989afc9bb/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L394-L407 Comment from @topperc: > This transforms assumes the mask is a non-zero splat. We only know its a splat and not provably all 0s. The mask is a constexpr that includes the address of the global variable. We can't resolve the constant expression to an exact value. Fixes #83947.
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 92289db8	17-Jan-2024	Alexandros Lamprineas <alexandros.lamprineas@arm.com>	[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513) This fixes #71892 allowing us to check magled names in the IR verifier.
# e512df3e	02-Jan-2024	Alexandros Lamprineas <alexandros.lamprineas@arm.com>	[LV] Fix crash when vectorizing function calls with linear args. (#76274) llvm/lib/IR/Type.cpp:694: Assertion `isValidElementType(ElementType) && "Element type of a VectorType must be an integer, floating point, or pointer type."' failed. Stack dump: llvm::FixedVectorType::get(llvm::Type, unsigned int) llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&) llvm::VPBasicBlock::execute(llvm::VPTransformState) llvm::VPRegionBlock::execute(llvm::VPTransformState) llvm::VPlan::execute(llvm::VPTransformState) ... Happens with function calls of void return type.
# ddb6db4d	19-Dec-2023	Paschalis Mpeis <paschalis.mpeis@arm.com>	[VFABI] Create FunctionType for vector functions (#75058) `createFunctionType` returns a FunctionType that may contain a mask, which is currently placed as the last parameter to the Function. The placement happens according to `VFParameters` of `VFInfo`, and it should be able to handle VFABI specification changes. Regarding the return type, it uses the scalar type of the input instruction, as the specification does not encode in the mangled name such information. If that ever happens, that information should be available from `VFInfo`.