History log of /llvm-project/llvm/test/CodeGen/ARM/fp16-insert-extract.ll (Results 1 – 11 of 11)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 2cca53c8 13-Dec-2021 Alexey Bataev <a.bataev@outlook.com>

[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.

We can process the long shuffles (working across several actual
vector registers) in the best way if we take t

[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.

We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653

show more ...


# 5f7ac159 20-Apr-2022 Alexey Bataev <a.bataev@outlook.com>

Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer."

This reverts commit 2f49163b3365e5dc046b03e422a048dd45aee3f0 to fix
a buildbot failure. Reported in h

Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer."

This reverts commit 2f49163b3365e5dc046b03e422a048dd45aee3f0 to fix
a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284

show more ...


# 2f49163b 13-Dec-2021 Alexey Bataev <a.bataev@outlook.com>

[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.

We can process the long shuffles (working across several actual
vector registers) in the best way if we take t

[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.

We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 1da52ef2 19-Sep-2021 David Green <david.green@arm.com>

[ARM] Add VGETLANEu patterns for v4f16 and v8f16

These were apparently missing, having no pattern that could convert a
VGETLANEu of a v4f16 to an i32. Added bf16 whilst here, following the
same code.


Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# 649a3d00 04-Feb-2021 David Green <david.green@arm.com>

[ARM] Handle f16 in GeneratePerfectShuffle

This new f16 shuffle under Neon would hit an assert in
GeneratePerfectShuffle as it would try to treat a f16 vector as an i8.
Add f16 handling, treating th

[ARM] Handle f16 in GeneratePerfectShuffle

This new f16 shuffle under Neon would hit an assert in
GeneratePerfectShuffle as it would try to treat a f16 vector as an i8.
Add f16 handling, treating them like an i16.

Differential Revision: https://reviews.llvm.org/D95446

show more ...


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1
# 9e2768a3 27-Jan-2021 David Green <david.green@arm.com>

[ARM] Add neon FP16 scalar_to_vector patterns.

This adds some simple fp16 scalar_to_vector patterns, preventing a
selection failure if this came up.

Differential Revision: https://reviews.llvm.org/

[ARM] Add neon FP16 scalar_to_vector patterns.

This adds some simple fp16 scalar_to_vector patterns, preventing a
selection failure if this came up.

Differential Revision: https://reviews.llvm.org/D95427

show more ...


Revision tags: llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# d428f881 26-Jun-2020 David Green <david.green@arm.com>

[ARM] VCVTT fpround instruction selection

Similar to the recent patch for fpext, this adds vcvtb and vcvtt with
insert into vector instruction selection patterns for fptruncs. This
helps clear up a

[ARM] VCVTT fpround instruction selection

Similar to the recent patch for fpext, this adds vcvtb and vcvtt with
insert into vector instruction selection patterns for fptruncs. This
helps clear up a lot of register shuffling that we would otherwise do.

Differential Revision: https://reviews.llvm.org/D81637

show more ...


# 76e0e1a5 26-Jun-2020 David Green <david.green@arm.com>

[ARM] VCVTT instruction selection

We current extract and convert from a top lane of a f16 vector using a
VMOVX;VCVTB pair. We can simplify that to use a single VCVTT. The
pattern is mostly copied fr

[ARM] VCVTT instruction selection

We current extract and convert from a top lane of a f16 vector using a
VMOVX;VCVTB pair. We can simplify that to use a single VCVTT. The
pattern is mostly copied from a vector extract pattern, but produces a
VCVTTHS f32 directly.

This had to move some code around so that ARMInstrVFP had access to the
required pattern frags that were previously part of ARMInstrNEON.

Differential Revision: https://reviews.llvm.org/D81556

show more ...


# 75268812 19-Jun-2020 Mikhail Maltsev <mikhail.maltsev@arm.com>

[ARM][BFloat] Lowering of create/get/set/dup intrinsics

This patch adds codegen for the following BFloat
operations to the ARM backend:
* concatenation of bf16 vectors
* bf16 vector element extracti

[ARM][BFloat] Lowering of create/get/set/dup intrinsics

This patch adds codegen for the following BFloat
operations to the ARM backend:
* concatenation of bf16 vectors
* bf16 vector element extraction
* bf16 vector element insertion
* duplication of a bf16 value into each lane of a vector
* duplication of a bf16 vector lane into each lane

Differential Revision: https://reviews.llvm.org/D81411

show more ...


# 61ef2d27 10-Jun-2020 David Green <david.green@arm.com>

[ARM] Update fp16-insert-extract.ll test checks. NFC


Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2
# 08da01b4 04-Jun-2019 Mikhail Maltsev <mikhail.maltsev@arm.com>

[ARM] Add FP16 vector insert/extract patterns

This change adds two FP16 extraction and two insertion patterns
(one per possible vector length).
Extractions are handled by copying a Q/D register into

[ARM] Add FP16 vector insert/extract patterns

This change adds two FP16 extraction and two insertion patterns
(one per possible vector length).
Extractions are handled by copying a Q/D register into one of VFP2
class registers, where single FP32 sub-registers can be accessed. Then
the extraction of even lanes are simple sub-register extractions
(because we don't care about the top parts of registers for FP16
operations). Odd lanes need an additional VMOVX instruction.

Unfortunately, insertions cannot be handled in the same way, because:
* There is no instruction to insert FP16 into an even lane (VINS only
works with odd lanes)
* The patterns for odd lanes will have a form of a DAG (not a tree),
and will not be implementable in pure tablegen

Because of this insertions are handled in the same way as 16-bit
integer insertions (with conversions between FP registers and GPRs
using VMOVHR instructions).

Without these patterns the ARM backend would sometimes fail during
instruction selection.

This patch also adds patterns which combine:
* an FP16 element extraction and a store into a single VST1
instruction
* an FP16 load and insertion into a single VLD1 instruction

Differential Revision: https://reviews.llvm.org/D62651

llvm-svn: 362482

show more ...