History log of /llvm-project/llvm/lib/Target/X86/X86ShuffleDecodeConstantPool.cpp (Results 1 – 25 of 40)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# a9bceb2b 30-Sep-2021 Jay Foad <jay.foad@amd.com>

[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.

Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, exc

[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.

Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# a3330717 03-Dec-2020 Kazu Hirata <kazu@google.com>

[X86] Remove DecodeVPERMVMask and DecodeVPERMV3Mask

This patch removes the variants of DecodeVPERMVMask and
DecodeVPERMV3Mask that take "const Constant *C" as they are not used
anymore.

They were i

[X86] Remove DecodeVPERMVMask and DecodeVPERMV3Mask

This patch removes the variants of DecodeVPERMVMask and
DecodeVPERMV3Mask that take "const Constant *C" as they are not used
anymore.

They were introduced on Sep 8, 2015 in commit
e88038f23517ffc741acfd307ff92e2b1af136d8.

The last use of DecodeVPERMVMask(const Constant *C, ...) was removed
on Feb 7, 2016 in commit 73fc26b44a8591b15f13eaffef17e67161c69388.

The last use of DecodeVPERMV3Mask(const Constant *C, ...) was removed
on May 28, 2018 in commit dcfcfdb0d166fff8388bdd2edc5a2948054c9da1.

Differential Revision: https://reviews.llvm.org/D91926

show more ...


Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4
# 0793b456 22-Sep-2020 Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] Add missing namespace closure comments. NFCI.

Fixes some clang-tidy llvm-namespace-comment warnings.


Revision tags: llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 0da1e7eb 29-Jun-2020 Christopher Tetreault <ctetreau@quicinc.com>

[SVE] Remove calls to VectorType::getNumElements from X86

Reviewers: efriedma, RKSimon, craig.topper, fpetrogalli, c-rhodes

Reviewed By: RKSimon

Subscribers: tschuett, hiraditya, rkruppe, psnobl,

[SVE] Remove calls to VectorType::getNumElements from X86

Reviewers: efriedma, RKSimon, craig.topper, fpetrogalli, c-rhodes

Reviewed By: RKSimon

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82508

show more ...


Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# dd24fb38 17-Apr-2020 Christopher Tetreault <ctetreau@quicinc.com>

Clean up usages of asserting vector getters in Type

Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicate

Clean up usages of asserting vector getters in Type

Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: craig.topper, sdesmalen, efriedma, RKSimon

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77264

show more ...


# 6dbf1a12 13-Apr-2020 Craig Topper <craig.topper@intel.com>

[X86] Move X86ShuffleDecode.cpp/h into MCTargetDesc and remove X86Utils library. NFC

The shuffle decoding is used by X86ISelLowering and
MCTargetDesc/X86InstComments. The latter used to be in a
sepa

[X86] Move X86ShuffleDecode.cpp/h into MCTargetDesc and remove X86Utils library. NFC

The shuffle decoding is used by X86ISelLowering and
MCTargetDesc/X86InstComments. The latter used to be in a
separate InstPrinter library. The Utils library existed to allow
InstPrinter and CodeGen to share the shuffle decoding. Since
X86InstComments now lives in the MCTargetDesc, which CodeGen
already depends on, we can sink the shuffle decoding there as well.

Differential Revision: https://reviews.llvm.org/D77980

show more ...


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3
# f5ad93d2 02-Mar-2020 Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] Cleanup ShuffleDecode implementations. NFCI.
- Remove unnecessary includes from the headers
- Fix cppcheck definition/declaration arg mismatch warnings
- Tidyup old comments (MVT usage was r

[X86] Cleanup ShuffleDecode implementations. NFCI.
- Remove unnecessary includes from the headers
- Fix cppcheck definition/declaration arg mismatch warnings
- Tidyup old comments (MVT usage was removed a long time ago)
- Use SmallVector::append for repeated mask entries

show more ...


Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the ne

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636

show more ...


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# c8e183f9 22-Oct-2018 Craig Topper <craig.topper@intel.com>

Recommit r344877 "[X86] Stop promoting integer loads to vXi64"

I've included a fix to DAGCombiner::ForwardStoreValueToDirectLoad that I believe will prevent the previous miscompile.

Original commit

Recommit r344877 "[X86] Stop promoting integer loads to vXi64"

I've included a fix to DAGCombiner::ForwardStoreValueToDirectLoad that I believe will prevent the previous miscompile.

Original commit message:

Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to rem

I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping.

I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the lo

I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D53306

llvm-svn: 344965

show more ...


# 8d8dcfe6 22-Oct-2018 Craig Topper <craig.topper@intel.com>

Revert r344877 "[X86] Stop promoting integer loads to vXi64"

Sam McCall reported miscompiles in some tensorflow code. Reverting while I try to figure out.

llvm-svn: 344921


# 321df5b0 21-Oct-2018 Craig Topper <craig.topper@intel.com>

[X86] Stop promoting integer loads to vXi64

Summary:
Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns

[X86] Stop promoting integer loads to vXi64

Summary:
Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to remove the bitcast.

I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping.

I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the load size.

I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D53306

llvm-svn: 344877

show more ...


# 8de07b4d 21-Oct-2018 Craig Topper <craig.topper@intel.com>

Revert r344873 "foo"

Rebase gone wrong left this in my tree.

llvm-svn: 344875


# e367039f 21-Oct-2018 Craig Topper <craig.topper@intel.com>

foo

llvm-svn: 344873


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1
# ad24af7f 13-Dec-2017 Michael Zolotukhin <mzolotukhin@apple.com>

Remove redundant includes from lib/Target/X86.

llvm-svn: 320636


Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1
# b02667c4 10-Mar-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[APInt] Add APInt::insertBits() method to insert an APInt into a larger APInt

We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages,

[APInt] Add APInt::insertBits() method to insert an APInt into a larger APInt

We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages, resulting in further temporary variables, all of which require the allocation of memory for large APInts (MaskSizeInBits > 64).

This is another of the compile time issues identified in PR32037 (see also D30265).

This patch adds the APInt::insertBits() helper method which avoids the temporary memory allocation and masks/inserts the raw bits directly into the target.

Differential Revision: https://reviews.llvm.org/D30780

llvm-svn: 297458

show more ...


# e86b7e22 09-Mar-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[X86][SSE] Speed up constant pool shuffle mask decoding with direct copy (PR32037).

If the constants are already the correct size, we can copy them directly into the shuffle mask.

llvm-svn: 297381


Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3
# e1be95c3 27-Feb-2017 Craig Topper <craig.topper@gmail.com>

[X86] Fix SmallVector sizes in constant pool shuffle decoding to avoid heap allocation

Some of the vectors are under sized to avoid heap allocation. In one case the vector was oversized.

Differenti

[X86] Fix SmallVector sizes in constant pool shuffle decoding to avoid heap allocation

Some of the vectors are under sized to avoid heap allocation. In one case the vector was oversized.

Differential Revision: https://reviews.llvm.org/D30387

llvm-svn: 296353

show more ...


# 53e5a38d 27-Feb-2017 Craig Topper <craig.topper@gmail.com>

[X86] Use APInt instead of SmallBitVector for tracking undef elements in constant pool shuffle decoding

Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 2

[X86] Use APInt instead of SmallBitVector for tracking undef elements in constant pool shuffle decoding

Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc.

APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. This will incur a minor increase in stack usage due to APInt storing the bit count separately from the data bits unlike SmallBitVector, but that should be ok.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30386

llvm-svn: 296352

show more ...


# 0f5fb5f5 25-Feb-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[APInt] Add APInt::extractBits() method to extract APInt subrange (reapplied)

The current pattern for extract bits in range is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

Which can be pa

[APInt] Add APInt::extractBits() method to extract APInt subrange (reapplied)

The current pattern for extract bits in range is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable.

This is another of the compile time issues identified in PR32037 (see also D30265).

This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.

Differential Revision: https://reviews.llvm.org/D30336

llvm-svn: 296272

show more ...


# cdf2bd65 24-Feb-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

Revert: r296141 [APInt] Add APInt::extractBits() method to extract APInt subrange

The current pattern for extract bits in range is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

Which can b

Revert: r296141 [APInt] Add APInt::extractBits() method to extract APInt subrange

The current pattern for extract bits in range is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable.

This is another of the compile time issues identified in PR32037 (see also D30265).

This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.

Differential Revision: https://reviews.llvm.org/D30336

llvm-svn: 296147

show more ...


# bd9fb2ae 24-Feb-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[APInt] Add APInt::extractBits() method to extract APInt subrange

The current pattern for extract bits in range is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

Which can be particularly s

[APInt] Add APInt::extractBits() method to extract APInt subrange

The current pattern for extract bits in range is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable.

This is another of the compile time issues identified in PR32037 (see also D30265).

This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.

Differential Revision: https://reviews.llvm.org/D30336

llvm-svn: 296141

show more ...


# aed35227 24-Feb-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[APInt] Add APInt::setBits() method to set all bits in range

The current pattern for setting bits in range is typically:

Mask |= APInt::getBitsSet(MaskSizeInBits, LoPos, HiPos);

Which can be parti

[APInt] Add APInt::setBits() method to set all bits in range

The current pattern for setting bits in range is typically:

Mask |= APInt::getBitsSet(MaskSizeInBits, LoPos, HiPos);

Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation memory for the temporary variable.

This is one of the key compile time issues identified in PR32037.

This patch adds the APInt::setBits() helper method which avoids the temporary memory allocation completely, this first implementation uses setBit() internally instead but already significantly reduces the regression in PR32037 (~10% drop). Additional optimization may be possible.

I investigated whether there is need for APInt::clearBits() and APInt::flipBits() equivalents but haven't seen these patterns to be particularly common, but reusing the code would be trivial.

Differential Revision: https://reviews.llvm.org/D30265

llvm-svn: 296102

show more ...


# 3a895c48 22-Feb-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[X86][SSE] Use APInt::getBitsSet() instead of APInt::getLowBitsSet().shl() separately. NFCI.

llvm-svn: 295845


Revision tags: llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1
# 96ef0c11 23-Oct-2016 Simon Pilgrim <llvm-dev@redking.me.uk>

Use APInt::isAllOnesValue instead of popcnt. NFCI.

More obvious implementation and faster too.

llvm-svn: 284937


# 448358b5 18-Oct-2016 Craig Topper <craig.topper@gmail.com>

[X86] Fix DecodeVPERMVMask to handle cases where the constant pool entry has a different type than the shuffle itself.

This is especially important for 32-bit targets with 64-bit shuffle elements.

[X86] Fix DecodeVPERMVMask to handle cases where the constant pool entry has a different type than the shuffle itself.

This is especially important for 32-bit targets with 64-bit shuffle elements.

llvm-svn: 284453

show more ...


12