History log of /llvm-project/llvm/lib/Analysis/VectorUtils.cpp (Results 1 – 25 of 193)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# bab7920f 13-Jan-2025 Alexey Bataev <a.bataev@outlook.com>

[RISCV][CG]Use processShuffleMasks for per-register shuffles

Patch adds usage of processShuffleMasks in in codegen
in lowerShuffleViaVRegSplitting. This function is already used for X86
shuffles est

[RISCV][CG]Use processShuffleMasks for per-register shuffles

Patch adds usage of processShuffleMasks in in codegen
in lowerShuffleViaVRegSplitting. This function is already used for X86
shuffles estimations and in DAGTypeLegalizer::SplitVecRes_VECTOR_SHUFFLE
functions, unifies the code.

Reviewers: topperc, wangpc-pp, lukel97, preames

Reviewed By: preames

Pull Request: https://github.com/llvm/llvm-project/pull/121765

show more ...


# 24bb180e 10-Jan-2025 Philip Reames <preames@rivosinc.com>

[RISCV] Attempt to widen SEW before generic shuffle lowering (#122311)

This takes inspiration from AArch64 which does the same thing to assist
with zip/trn/etc.. Doing this recursion unconditionall

[RISCV] Attempt to widen SEW before generic shuffle lowering (#122311)

This takes inspiration from AArch64 which does the same thing to assist
with zip/trn/etc.. Doing this recursion unconditionally when the mask
allows is slightly questionable, but seems to work out okay in practice.

As a bit of context, it's helpful to realize that we have existing logic
in both DAGCombine and InstCombine which mutates the element width of in
an analogous manner. However, that code has two restriction which
prevent it from handling the motivating cases here. First, it only
triggers if there is a bitcast involving a different element type.
Second, the matcher used considers a partially undef wide element to be
a non-match. I considered trying to relax those assumptions, but the
information loss for undef in mid-level opt seemed more likely to open a
can of worms than I wanted.

show more ...


# 45c01e8a 19-Dec-2024 Finn Plummer <50529406+inbelic@users.noreply.github.com>

[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635)

- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of

[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635)

- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of target specific intrinsics
- add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api
- update TTI api to provide `isTargetIntrinsicWith...` functions and
consistently name them
- move `isTriviallyScalarizable` to VectorUtils

- update all uses of the api and provide the TTI parameter

Resolves #117030

show more ...


Revision tags: llvmorg-19.1.6
# b759020c 11-Dec-2024 LiqinWeng <liqin.weng@spacemit.com>

[LV][EVL] Support cast instruction with EVL-vectorization (#108351)


# b9aa155d 06-Dec-2024 Alexey Bataev <a.bataev@outlook.com>

[TTI][X86]Fix detection of the shuffles from the second shuffle operand only

If the shuffle mask uses only indices from the second shuffle operand,
processShuffleMasks function misses it currently,

[TTI][X86]Fix detection of the shuffles from the second shuffle operand only

If the shuffle mask uses only indices from the second shuffle operand,
processShuffleMasks function misses it currently, which prevents correct
cost estimation in this corner case. To fix this, need to raise the
limit to 2 * VF rather than just VF and adjust processing
correspondingly. Will allow future improvements for 2 sources
permutations.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/118972

show more ...


Revision tags: llvmorg-19.1.5
# 4a3f46de 28-Nov-2024 LiqinWeng <liqin.weng@spacemit.com>

[LV][EVL] Support call instruction with EVL-vectorization (#110412)


# 8663b877 21-Nov-2024 Finn Plummer <50529406+inbelic@users.noreply.github.com>

[NFC][VectorUtils][TargetTransformInfo] Add `isVectorIntrinsicWithOverloadTypeAtArg` api (#114849)

This changes allows target intrinsics to specify and overwrite overloaded types.

- Updates `Repl

[NFC][VectorUtils][TargetTransformInfo] Add `isVectorIntrinsicWithOverloadTypeAtArg` api (#114849)

This changes allows target intrinsics to specify and overwrite overloaded types.

- Updates `ReplaceWithVecLib` to not provide TTI as there most probably won't be a use-case
- Updates `SLPVectorizer` to use available TTI
- Updates `VPTransformState` to pass down TTI
- Updates `VPlanRecipe` to use passed-down TTI

This change will let us add scalarization for `asdouble`: #114847

show more ...


Revision tags: llvmorg-19.1.4
# 818d7159 09-Nov-2024 Tex Riddell <texr@microsoft.com>

[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs (#113637)

This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Re

[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs (#113637)

This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Return true for atan2 from isTriviallyVectorizable
- Add atan2 to VecFuncs.def for massv and accelerate libraries.
- Add atan2 to hasOptimizedCodeGen
- Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp
llvm::getIntrinsicForCallSite and update vectorization tests
- Add atan2 name check to isLoweredToCall in
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
- Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case

Thanks to @jroelofs for the atan2 accelerate veclib and associated test
additions, plus the hasOptimizedCodeGen addition.

Part of: Implement the atan2 HLSL Function #70096.

show more ...


# dfb60bb9 29-Oct-2024 Rohit Aggarwal <44664450+rohitaggarwal007@users.noreply.github.com>

Adding more vector calls for -fveclib=AMDLIBM (#109662)

AMD has it's own implementation of vector calls.
New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos

Adding more vector calls for -fveclib=AMDLIBM (#109662)

AMD has it's own implementation of vector calls.
New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos
Please refer [https://github.com/amd/aocl-libm-ose]

---------

Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>

show more ...


Revision tags: llvmorg-19.1.3
# dcbf2c2c 21-Oct-2024 Farzon Lotfi <1802579+farzonl@users.noreply.github.com>

[Scalarizer][DirectX] support structs return types (#111569)

Based on this RFC:
https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306

LLVM int

[Scalarizer][DirectX] support structs return types (#111569)

Based on this RFC:
https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306

LLVM intrinsics do not support out params. To get around this limitation
implementers will make intrinsics return structs to capture a return
type and an out param. This implementation detail should not impact
scalarization since these cases should be elementwise operations.

## Three changes are needed.
- The CallInst visitor needs to be updated to handle Structs
- A new visitor is needed for `ExtractValue` instructions
- finsh needs to be update to handle structs so that insert elements are
properly propogated.

## Testing changes
- Add support for `llvm.frexp`
- Add support for `llvm.dx.splitdouble`

fixes https://github.com/llvm/llvm-project/issues/111437

show more ...


# 4ba1800b 16-Oct-2024 Amr Hesham <amr96@programmer.net>

[LLVM][NFC] Reduce copying of parameter in lambda (#110299)

Reduce redundant copy parameter in lambda

Fixes #95642


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1
# cc7b24a4 24-Sep-2024 Piotr Fusik <p.fusik@samsung.com>

[NFC] Fix typos in comments (#109765)


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4
# a156b5a4 02-Sep-2024 Yingwei Zheng <dtcxzyw2333@gmail.com>

[SLP] Add vectorization support for [u|s]cmp (#106747)

This patch adds vectorization support for [u|s]cmp intrinsic calls.


# d58d105c 30-Aug-2024 Simon Pilgrim <llvm-dev@redking.me.uk>

[Analysis] isTriviallyVectorizable - add vectorization support for acos/asin/atan and cosh/sinh/tanh intrinsics (#106584)

Show fallback cases in amdlibm tests where it doesn't have that specific op


Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# b22fa909 16-Jul-2024 mskamp <msk@posteo.org>

[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429)

Add KnownBits computations to ValueTracking and X86 DAG lowering.

These instructions add/subtract adjacent vector elements in t

[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429)

Add KnownBits computations to ValueTracking and X86 DAG lowering.

These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types.

There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations.

Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation.

Fixes #82516.

show more ...


# 2d209d96 27-Jun-2024 Nikita Popov <npopov@redhat.com>

[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)

This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it does

[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)

This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.

show more ...


# d42b3926 26-Jun-2024 Nikita Popov <npopov@redhat.com>

[VectorUtils] Use SmallPtrSet::remove_if() (NFC)


# 5b4000dc 26-Jun-2024 Simon Pilgrim <llvm-dev@redking.me.uk>

[VectorUtils] Add llvm::scaleShuffleMaskElts wrapper for narrowShuffleMaskElts/widenShuffleMaskElts, NFC. (#96646)

Using the target number of vector elements, scaleShuffleMaskElts will try to use na

[VectorUtils] Add llvm::scaleShuffleMaskElts wrapper for narrowShuffleMaskElts/widenShuffleMaskElts, NFC. (#96646)

Using the target number of vector elements, scaleShuffleMaskElts will try to use narrowShuffleMaskElts/widenShuffleMaskElts to scale the shuffle mask accordingly.

Working on #58895 I didn't want to create yet another case where we have to handle both re-scaling cases.

show more ...


# 605e1847 24-Jun-2024 Nikita Popov <npopov@redhat.com>

[VectorUtils] Use poison instead of undef in findScalarElement()

Out-of-range extractelement returns poison, and so do poison elements
in the shufflevector mask.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7
# 1d874335 05-Jun-2024 Farzon Lotfi <1802579+farzonl@users.noreply.github.com>

[x86] Add tan intrinsic part 4 (#90503)

This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://disc

[x86] Add tan intrinsic part 4 (#90503)

This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294


Much of this change was following how G_FSIN and G_FCOS were used.

Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan
intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to
`ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for
`FTAN` and `STRICT_FTAN`
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall
mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map
`G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`,
`strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN`
and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a
vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to
the list of floating point math operations also associate `G_FTAN` with
the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math
operation common behaviors.
- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp - List the function
expansion operations for `FTAN` and `STRICT_FTAN`. Also define both
opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN`
and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define
`SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN`
as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define
`FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an
intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` Map
`LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map
`Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for
`Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan`
and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` -Define a Tan128 libcall and
ISD::FTAN as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for
tan intrinsic

resolves https://github.com/llvm/llvm-project/issues/70082

show more ...


Revision tags: llvmorg-18.1.6, llvmorg-18.1.5
# cf328ff9 24-Apr-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[IR] Memory Model Relaxation Annotations (#78569)

Implements the core/target-agnostic components of Memory Model
Relaxation Annotations.

RFC:
https://discourse.llvm.org/t/rfc-mmras-memory-model

[IR] Memory Model Relaxation Annotations (#78569)

Implements the core/target-agnostic components of Memory Model
Relaxation Annotations.

RFC:
https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5

show more ...


Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# a1a590ef 05-Mar-2024 Yingwei Zheng <dtcxzyw2333@gmail.com>

[InstCombine] Fix miscompilation in PR83947 (#83993)

https://github.com/llvm/llvm-project/blob/762f762504967efbe159db5c737154b989afc9bb/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L394-L407

[InstCombine] Fix miscompilation in PR83947 (#83993)

https://github.com/llvm/llvm-project/blob/762f762504967efbe159db5c737154b989afc9bb/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L394-L407

Comment from @topperc:
> This transforms assumes the mask is a non-zero splat. We only know its
a splat and not provably all 0s. The mask is a constexpr that includes
the address of the global variable. We can't resolve the constant
expression to an exact value.

Fixes #83947.

show more ...


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 92289db8 17-Jan-2024 Alexandros Lamprineas <alexandros.lamprineas@arm.com>

[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513)

This fixes #71892 allowing us to check magled names in the IR verifier.


# e512df3e 02-Jan-2024 Alexandros Lamprineas <alexandros.lamprineas@arm.com>

[LV] Fix crash when vectorizing function calls with linear args. (#76274)

llvm/lib/IR/Type.cpp:694:
Assertion `isValidElementType(ElementType) && "Element type of a
VectorType must be an i

[LV] Fix crash when vectorizing function calls with linear args. (#76274)

llvm/lib/IR/Type.cpp:694:
Assertion `isValidElementType(ElementType) && "Element type of a
VectorType must be an integer, floating point, or pointer type."'
failed.
Stack dump:
llvm::FixedVectorType::get(llvm::Type*, unsigned int)
llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&)
llvm::VPBasicBlock::execute(llvm::VPTransformState*)
llvm::VPRegionBlock::execute(llvm::VPTransformState*)
llvm::VPlan::execute(llvm::VPTransformState*)
...

Happens with function calls of void return type.

show more ...


# ddb6db4d 19-Dec-2023 Paschalis Mpeis <paschalis.mpeis@arm.com>

[VFABI] Create FunctionType for vector functions (#75058)

`createFunctionType` returns a FunctionType that may contain a mask,
which is currently placed as the last parameter to the Function.
The

[VFABI] Create FunctionType for vector functions (#75058)

`createFunctionType` returns a FunctionType that may contain a mask,
which is currently placed as the last parameter to the Function.
The placement happens according to `VFParameters` of `VFInfo`, and it
should be able to handle VFABI specification changes.

Regarding the return type, it uses the scalar type of the input instruction,
as the specification does not encode in the mangled name such information.
If that ever happens, that information should be available from `VFInfo`.

show more ...


12345678