History log of /llvm-project/llvm/lib/CodeGen/TargetLoweringBase.cpp (Results 151 – 175 of 500)
Revision Date Author Comments
# 11ef356d 05-Feb-2021 Craig Topper <craig.topper@sifive.com>

[TargetLowering] Use Align in allowsMisalignedMemoryAccesses.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097


# d745b82d 25-Jan-2021 Fangrui Song <i@maskray.me>

[XRay] Support DW_TAG_call_site and delete unneeded PATCHABLE_EVENT_CALL/PATCHABLE_TYPED_EVENT_CALL lowering


# 551aaa24 22-Jan-2021 Kazu Hirata <kazu@google.com>

[llvm] Use isDigit (NFC)


# cfc60730 18-Dec-2020 Jessica Paquette <jpaquette@apple.com>

[GlobalISel] Combine (a[0]) | (a[1] << k1) | ...| (a[m] << kn) into a wide load

This is a restricted version of the combine in `DAGCombiner::MatchLoadCombine`.
(See D27861)

This tries to recognize patterns like below (assuming a little-endian target):

```
s8* x = ...
s32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
->
s32 val = *((i32)a)

s8* x = ...
s32 val = a[3] | (a[2] << 8) | (a[1] << 16) | (a[0] << 24)
->
s32 val = BSWAP(*((s32)a))
```

(This patch handles the big-endian target case as well, in which the first
example above has a BSWAP, and the second example above does not.)

To recognize the pattern, this searches from the last G_OR in the expression
tree.

E.g.

```
Reg Reg
\ /
OR_1 Reg
\ /
OR_2
\ Reg
.. /
Root
```

Each non-OR register in the tree is put in a list. Each register in the list is
then checked to see if it's an appropriate load + shift logic.

If every register is a load + potentially a shift, the combine checks if those
loads + shifts, when OR'd together, are equivalent to a wide load (possibly with
a BSWAP).

To simplify things, this patch

(1) Only handles G_ZEXTLOADs (which appear to be the common case)
(2) Only works in a single MachineBasicBlock
(3) Only handles G_SHL as the bit twiddling to stick the small load into a
specific location

An IR example of this is here: https://godbolt.org/z/4sP9Pj (lifted from
test/CodeGen/AArch64/load-combine.ll)

At -Os on AArch64, this is a 0.5% code size improvement for CTMark/sqlite3,
and a 0.4% improvement for CTMark/7zip-benchmark.

Also fix a bug in `isPredecessor` which caused it to fail whenever `DefMI` was
the first instruction in the block.

Differential Revision: https://reviews.llvm.org/D94350
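
As an illustration (a hedged sketch, not code from the patch or the linked example), the C++ source idiom that produces this G_ZEXTLOAD/G_SHL/G_OR tree looks roughly like:

```
#include <cstdint>

// Little-endian byte-assembly idiom: each a[i] becomes a G_ZEXTLOAD, the
// shifts become G_SHL, and the ORs form the tree rooted at the final G_OR.
// After the combine this should turn into a single 32-bit load of `a`.
uint32_t load_u32_le(const uint8_t *a) {
  return uint32_t(a[0]) | (uint32_t(a[1]) << 8) |
         (uint32_t(a[2]) << 16) | (uint32_t(a[3]) << 24);
}

// Byte-reversed variant: on a little-endian target this should become a
// 32-bit load followed by a byte swap (BSWAP).
uint32_t load_u32_be(const uint8_t *a) {
  return uint32_t(a[3]) | (uint32_t(a[2]) << 8) |
         (uint32_t(a[1]) << 16) | (uint32_t(a[0]) << 24);
}
```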



# 8f004471 02-Jan-2021 Brandon Bergren <bdragon@FreeBSD.org>

[PowerPC] Add the LLVM triple for powerpcle [1/5]

Add a triple for powerpcle-*-*.

This is a little-endian encoding of the 32-bit PowerPC ABI, useful in certain niche situations:

1) A loader such as the FreeBSD loader, which will be loading a little-endian kernel. This is required for PowerPC64LE to load properly in pseries VMs.
Such a loader is implemented as a freestanding ELF32 LSB binary.

2) Userspace emulation of a 32-bit LE architecture such as x86 on 64-bit hosts such as PowerPC64LE with tools like box86 requires a 32-bit LE toolchain and library set, since such tools translate only the main binary and switch to native code when making library calls.

3) The Void Linux for PowerPC project is experimenting with running an entire powerpcle userland.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93918



Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2
# a89d751f 17-Dec-2020 Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>

Add intrinsics for saturating float to int casts

This patch adds support for the fptoui.sat and fptosi.sat intrinsics,
which provide basically the same functionality as the existing fptoui
and fptosi instructions, but will saturate (or return 0 for NaN) on
values unrepresentable in the target type, instead of returning
poison. Related mailing list discussion can be found at:
https://groups.google.com/d/msg/llvm-dev/cgDFaBmCnDQ/CZAIMj4IBAAJ

The intrinsics have overloaded source and result type and support
vector operands:

i32 @llvm.fptoui.sat.i32.f32(float %f)
i100 @llvm.fptoui.sat.i100.f64(double %f)
<4 x i32> @llvm.fptoui.sat.v4i32.v4f16(half %f)
// etc

On the SelectionDAG layer two new ISD opcodes are added,
FP_TO_UINT_SAT and FP_TO_SINT_SAT. These opcodes have two operands
and one result. The second operand is an integer constant specifying
the scalar saturation width. The idea here is that initially the
second operand and the scalar width of the result type are the same,
but they may change during type legalization. For example:

i19 @llvm.fptosi.sat.i19.f32(float %f)
// builds
i19 fp_to_sint_sat f, 19
// type legalizes (through integer result promotion)
i32 fp_to_sint_sat f, 19

I went for this approach, because saturated conversion does not
compose well. There is no good way of "adjusting" a saturating
conversion to i32 into one to i19 short of saturating twice.
Specifying the saturation width separately allows directly saturating
to the correct width.

There are two baseline expansions for the fp_to_xint_sat opcodes. If
the integer bounds can be exactly represented in the float type and
fminnum/fmaxnum are legal, we can expand to something like:

f = fmaxnum f, FP(MIN)
f = fminnum f, FP(MAX)
i = fptoxi f
i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN

If the bounds cannot be exactly represented, we expand to something
like this instead:

i = fptoxi f
i = select f ult FP(MIN), MIN, i
i = select f ogt FP(MAX), MAX, i
i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN

It should be noted that this expansion assumes a non-trapping fptoxi.

Initial tests are for AArch64, x86_64 and ARM. This exercises all of
the scalar and vector legalization. ARM is included to test float
softening.

Original patch by @nikic and @ebevhan (based on D54696).

Differential Revision: https://reviews.llvm.org/D54749
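
As a usage illustration (a minimal sketch assuming the Intrinsic::fptosi_sat ID added by this patch; the helper function name is hypothetical), emitting the saturating conversion from C++ looks roughly like:

```
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"

// Build a call to @llvm.fptosi.sat.<IntTy>.<FpTy>(F); the intrinsic is
// overloaded on both the result type and the source type.
llvm::Value *emitFpToSiSat(llvm::IRBuilder<> &B, llvm::Value *F,
                           llvm::Type *IntTy) {
  llvm::Module *M = B.GetInsertBlock()->getModule();
  llvm::Function *Decl = llvm::Intrinsic::getDeclaration(
      M, llvm::Intrinsic::fptosi_sat, {IntTy, F->getType()});
  return B.CreateCall(Decl, {F});
}
```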



# 08e287aa 14-Dec-2020 QingShan Zhang <qshanz@cn.ibm.com>

[PowerPC][FP128] Fix the incorrect signature for math library call

The runtime library provides two families of implementations, one for ppc_fp128 and one for fp128.
For IBM long double (ppc_fp128), the functions are suffixed with 'l' (e.g. sqrtl). For
IEEE long double (fp128), they are suffixed with "ieee128" or "f128".
Several libcalls for IEEE long double were missing this mapping.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D91675
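
A hedged illustration of the naming convention (the RTLIB entries and names below are examples chosen for this note, not the exact set corrected by the patch); such mappings live in a target's TargetLowering constructor:

```
// Fragment, illustrative only: ppc_fp128 math routines use the 'l' suffix,
// while IEEE fp128 routines use an "f128"/"ieee128"-style suffix, so the
// two libcall families must not share names.
setLibcallName(RTLIB::SQRT_PPCF128, "sqrtl");
setLibcallName(RTLIB::SQRT_F128, "sqrtf128");
```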



Revision tags: llvmorg-11.0.1-rc1
# c5978f42 21-Oct-2020 Tim Northover <t.p.northover@gmail.com>

UBSAN: emit distinctive traps

Sometimes people get minimal crash reports after a UBSAN incident. This change
tags each trap with an integer representing the kind of failure encountered,
which can aid in tracking down the root cause of the problem.



# 78a57069 06-Dec-2020 Martin Storsjö <martin@martin.st>

[CodeGen] Restore accessing __stack_chk_guard via a .refptr stub on mingw after 2518433f861fcb87

Add tests for this particular detail for x86 and arm (similar tests
already existed for x86_64 and aarch64).

The libssp implementation may be located in a separate DLL, and in
those cases, the references need to be in a .refptr stub, to avoid
needing to touch up code in the text section at runtime (which is
supported but inefficient for x86, and unsupported for arm).

Differential Revision: https://reviews.llvm.org/D92738



# 2518433f 05-Dec-2020 Fangrui Song <i@maskray.me>

Make __stack_chk_guard dso_local if Reloc::Static

This is currently implied by TargetMachine::shouldAssumeDSOLocal
but will be changed in the future.


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# f7bc7c29 03-Jul-2020 Hsiangkai Wang <kai.wang@sifive.com>

[RISCV] Support Zfh half-precision floating-point extension.

Support "Zfh" extension according to
https://github.com/riscv/riscv-isa-manual/blob/zfh/src/zfh.tex

Differential Revision: https://reviews.llvm.org/D90738



# 4d7df43f 19-Nov-2020 Pavel Iliin <Pavel.Iliin@arm.com>

[AArch64] Out-of-line atomics (-moutline-atomics) implementation.

This patch implements the out-of-line atomics mechanism for LSE deployment.
Details of how it works can be found in llvm/docs/Atomics.rst.
Options -moutline-atomics and -mno-outline-atomics to enable and disable it
were added to the clang driver. This is the clang and llvm part of the
out-of-line atomics interface; the library part is already supported by libgcc.
Compiler-rt support is provided in a separate patch.

Differential Revision: https://reviews.llvm.org/D91157



# 80732011 18-Nov-2020 Adhemerval Zanella <adhemerval.zanella@linaro.org>

[AArch64] Lower fptrunc/fpext from/to FP128 to/from FP16

The compiler-rt part which adds the emitted symbols is handled in
a subsequent patch.

Differential Revision: https://reviews.llvm.org/D91731


# c126eb75 04-Nov-2020 Cameron McInally <mcinally@cray.com>

[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL

Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247.

Differential Revision: https://reviews.llvm.org/D90644



# dda1e74b 30-Oct-2020 Cameron McInally <mcinally@cray.com>

[Legalize] Add legalizations for VECREDUCE_SEQ_FADD

Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass.

Differential Revision: https://reviews.llvm.org/D90247



# 35a531fb 29-Sep-2020 David Sherwood <david.sherwood@arm.com>

[SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents

In certain places in llvm/lib/CodeGen we were relying upon the TypeSize
comparison operators when in fact the code was only ever expecting
either scalar values or fixed width vectors. I've changed some of these
places to use the equivalent scalar operator.

Differential Revision: https://reviews.llvm.org/D88482



# 1687a8d8 13-Oct-2020 Craig Topper <craig.topper@intel.com>

[X86][SelectionDAG] Add SADDO_CARRY and SSUBO_CARRY to support multipart signed add/sub overflow legalization.

This passes existing X86 tests, but I'm not sure if it handles all the type
legalization cases it needs to.

Alternative to D89200

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D89222



# c5ba0d33 23-Sep-2020 David Sherwood <david.sherwood@arm.com>

[SVE] Make ElementCount and TypeSize use a new PolySize class

I have introduced a new template PolySize class, where the template
parameter determines the type of quantity, i.e. for an element
count this is just an unsigned value. The ElementCount class is
now just a simple derivation of PolySize<unsigned>, whereas TypeSize
is more complicated because it still needs to contain the uint64_t
cast operator, since there are still many places in the code that
rely upon this implicit cast. As such the class also still needs
some of its own operators.

I've tried to minimise the amount of code in the base PolySize
class, which led to a couple of changes:

1. In some places we were relying on '==' operator comparisons
between ElementCounts and the scalar value 1. I didn't put this
operator in the new PolySize class, and thought it was actually
clearer to use the isScalar() function instead.
2. I removed the isByteSized function and replaced it with calls
to isKnownMultipleOf(8).

I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so
that it's more consistent with divideCoefficientBy.

Differential Revision: https://reviews.llvm.org/D88409
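
A brief usage sketch of the resulting API surface described above (assumed usage, not code from the patch):

```
#include "llvm/Support/TypeSize.h"

void polySizeExamples() {
  // ElementCount is now a thin derivation of PolySize<unsigned>.
  llvm::ElementCount Fixed = llvm::ElementCount::getFixed(1);
  llvm::ElementCount Scal = llvm::ElementCount::getScalable(8);

  bool IsOne = Fixed.isScalar();                 // replaces '== 1' comparisons
  bool ByteMultiple = Scal.isKnownMultipleOf(8); // replaces isByteSized()
  (void)IsOne;
  (void)ByteMultiple;
}
```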



# b8ce6a67 01-Oct-2020 David Sherwood <david.sherwood@arm.com>

[SVE][CodeGen] Add new EVT/MVT getFixedSizeInBits() functions

When we know that a particular type is always going to be fixed
width we have so far been writing code like this:

getSizeInBits().getFixedSize()

Since we are doing this in quite a few places now it seems to make
sense to add a new helper function that allows us to replace
these calls with a single getFixedSizeInBits() call.

Differential Revision: https://reviews.llvm.org/D88649
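
A hedged before/after sketch of the idiom (assuming a fixed-width EVT; the wrapper function is hypothetical):

```
#include "llvm/CodeGen/ValueTypes.h"

// VT must be a fixed-width (non-scalable) type for either form to be valid.
uint64_t fixedBits(llvm::EVT VT) {
  uint64_t BitsOld = VT.getSizeInBits().getFixedSize(); // old idiom
  uint64_t BitsNew = VT.getFixedSizeInBits();           // new helper
  return BitsOld == BitsNew ? BitsNew : 0;
}
```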



# bafdd113 11-Sep-2020 David Sherwood <david.sherwood@arm.com>

[SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy

After some recent upstream discussion we decided that it was best
to avoid having the / operator for both ElementCount and TypeSize,
since this could give the impression that these classes can be used
in the same way as basic integer types. However, division
for scalable types is a bit odd because we are only dividing the
minimum quantity by a value, as opposed to something like:

(MinSize * Vscale) / SomeValue

This is why when performing division it's important the caller
first establishes whether the operation makes sense, perhaps by
calling isKnownMultipleOf() prior to division. The caller must now
explictly call divideCoefficientBy() on the class to perform the
operation.

Differential Revision: https://reviews.llvm.org/D87700
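
A short sketch of the intended calling pattern (assumed usage; the helper below is hypothetical):

```
#include "llvm/Support/TypeSize.h"

// Halve an element count only after establishing that the division is
// meaningful; divideCoefficientBy(2) divides only the minimum coefficient.
llvm::ElementCount halveIfPossible(llvm::ElementCount EC) {
  if (!EC.isKnownMultipleOf(2))
    return EC;
  return EC.divideCoefficientBy(2);
}
```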



# e077367a 18-Sep-2020 David Sherwood <david.sherwood@arm.com>

[SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits

An existing function Type::getScalarSizeInBits returns a uint64_t
instead of a TypeSize class because the caller is requesting a
scalar size, which cannot be scalable. This patch makes other
similar functions requesting a scalar size consistent with that,
thereby eliminating more than 1000 implicit TypeSize -> uint64_t
casts.

Differential revision: https://reviews.llvm.org/D87889



# ad3d6f99 12-Sep-2020 Craig Topper <craig.topper@intel.com>

[SelectionDAG][X86][ARM][AArch64] Add ISD opcode for __builtin_parity. Expand it to shifts and xors.

Clang emits (and (ctpop X), 1) for __builtin_parity. If ctpop
isn't natively supported by the target, this leads to poor codegen
due to the expansion of ctpop being more complex than what is needed
for parity.

This adds a DAG combine to convert the pattern to ISD::PARITY
before operation legalization. Type legalization is updated
to handle Expanding and Promoting this operation. If, after type
legalization, CTPOP is supported for this type, LegalizeDAG will
turn it back into CTPOP+AND. Otherwise LegalizeDAG will emit a
series of shifts and xors followed by an AND with 1.

I've avoided vectors in this patch to avoid more legalization
complexity for this patch.

X86 previously had a custom DAG combiner for this. This is now
moved to Custom lowering for the new opcode. There is a minor
regression in vector-reduce-xor-bool.ll, but a follow up patch
can easily fix that.

Fixes PR47433

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87209
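
For reference, a hand-written scalar equivalent of the shift/xor expansion described above (illustrative C++, not the actual LegalizeDAG code):

```
#include <cstdint>

// Equivalent of __builtin_parity(X) when CTPOP is unavailable: fold the word
// onto itself with xors, then AND with 1 to keep the parity bit.
uint32_t parity32(uint32_t X) {
  X ^= X >> 16;
  X ^= X >> 8;
  X ^= X >> 4;
  X ^= X >> 2;
  X ^= X >> 1;
  return X & 1u;
}
```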



# f4257c58 14-Aug-2020 David Sherwood <david.sherwood@arm.com>

[SVE] Make ElementCount members private

This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
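
A minimal usage sketch of the accessors mentioned above (assumed usage):

```
#include "llvm/Support/TypeSize.h"

void elementCountAccessors() {
  llvm::ElementCount EC = llvm::ElementCount::getScalable(4); // <vscale x 4 x ...>
  unsigned MinElts = EC.getKnownMinValue(); // 4
  bool Scalable = EC.isScalable();          // true
  (void)MinElts;
  (void)Scalable;
}
```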



# d870e363 27-Aug-2020 Brad Smith <brad@comstyle.com>

[SSP] Restore setting the visibility of __guard_local to hidden for better code generation.

Patch by: Philip Guenther


# a407ec9b 19-Aug-2020 Mehdi Amini <joker.eph@gmail.com>

Revert "Revert "[NFC][llvm] Make the contructors of `ElementCount` private.""

Was reverted because MLIR/Flang builds were broken, these APIs have been
fixed in the meantime.

