History log of /llvm-project/llvm/lib/CodeGen/TargetLoweringBase.cpp (Results 1 – 25 of 500)
Revision Date Author Comments
Revision tags: llvmorg-21-init
# ab976a17 24-Jan-2025 Stephen Long <63318318+steplong@users.noreply.github.com>

PreISelIntrinsicLowering: Lower llvm.exp/llvm.exp2 to a loop if scalable vec arg (#117568)


# d9f165dd 20-Jan-2025 Graham Hunter <graham.hunter@arm.com>

[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810)

Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder.

The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.
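
As a rough illustration of what "finding the index of the last active element of the mask" means, here is a scalar C++ model. The function name `findLastActive` and the `-1` result for an all-false mask are assumptions made for this sketch; the commit message does not say what the node produces when no lane is active.

```cpp
#include <cstddef>
#include <vector>

// Scalar model of "index of the last active mask element".
// Returning -1 for an all-false mask is a choice made for this sketch only.
std::ptrdiff_t findLastActive(const std::vector<bool> &Mask) {
  // Walk the lanes from the highest index down to the lowest.
  for (std::size_t I = Mask.size(); I > 0; --I)
    if (Mask[I - 1])
      return static_cast<std::ptrdiff_t>(I - 1);
  return -1;
}
```

Extracting the element at that index and choosing the passthru value when nothing is active would be separate steps, matching the split described above.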


Revision tags: llvmorg-19.1.7, llvmorg-19.1.6
# 0d9fc174 13-Dec-2024 Craig Topper <craig.topper@sifive.com>

[GISel] Remove unused DataLayout operand from getApproximateEVTForLLT (#119833)


# e55c1677 09-Dec-2024 Sergei Barannikov <barannikov88@gmail.com>

[TargetLowering] Return Align from getByValTypeAlignment (NFC) (#119233)


Revision tags: llvmorg-19.1.5, llvmorg-19.1.4
# 28e4aad4 12-Nov-2024 Feng Zou <feng.zou@intel.com>

[X86][BF16] Add libcall for FP128 -> BF16 (#115825)

This is to fix #115710.


# ea859005 05-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

SafeStack: Respect alloca addrspace (#112536)

Just insert addrspacecast in cases where the alloca uses a
different address space, since I don't know what else you
could possibly do.


# c3260c65 29-Oct-2024 Benjamin Maxwell <benjamin.maxwell@arm.com>

[IR] Add `llvm.sincos` intrinsic (#109825)

This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float } @llvm.sincos.f32(float %Val)
declare { double, double } @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
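
For readers less familiar with IR, the scalar `f32` case can be modelled in C++ roughly as follows. The struct and function names are hypothetical and only mirror the `{ float, float }` shape of the declarations above; this is not how the lowering is implemented.

```cpp
#include <cmath>

// Hypothetical struct mirroring the { float, float } result of llvm.sincos.f32.
struct SinCosResult {
  float Sine;
  float Cosine;
};

// Scalar model: one call yields both the sine and the cosine of Val.
SinCosResult sincosModel(float Val) {
  return {std::sin(Val), std::cos(Val)};
}
```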


Revision tags: llvmorg-19.1.3
# 6ab26eab 28-Oct-2024 Ellis Hoag <ellis.sparky.hoag@gmail.com>

Check hasOptSize() in shouldOptimizeForSize() (#112626)


# 875afa93 16-Oct-2024 Tex Riddell <texr@microsoft.com>

[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)

This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Based on example PR #96222 and fix PR #101268, with some differences due
to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).

- Add llvm.experimental.constrained.atan2 - Intrinsics.td,
ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in
BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp,
and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp - Expand for vectors, promote f16
- X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC

Part 4 of "Implement the atan2 HLSL Function" (#70096).


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1
# 3073c3c2 24-Sep-2024 Benjamin Maxwell <benjamin.maxwell@arm.com>

[SDAG] Avoid creating redundant stack slots when lowering FSINCOS (#108401)

When lowering `FSINCOS` to a library call (that takes output pointers)
we can avoid creating new stack allocations if the results of the
`FSINCOS` are being stored. Instead, we can take the destination
pointers from the stores and pass those to the library call.
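
The idea can be sketched at the source level in C++. All names here are hypothetical: `sincosLibcall` stands in for a sincos-style routine that returns through output pointers, and the two wrappers only illustrate the before and after shapes, not the actual SelectionDAG code.

```cpp
#include <cmath>

// Stand-in for a library routine that writes its results through pointers
// (hypothetical name, defined here so the sketch is self-contained).
void sincosLibcall(double X, double *SinOut, double *CosOut) {
  *SinOut = std::sin(X);
  *CosOut = std::cos(X);
}

// Before: results land in temporaries (extra stack slots), then get copied
// to where the stores wanted them.
void storeViaTemporaries(double X, double *SinDst, double *CosDst) {
  double SinTmp, CosTmp;
  sincosLibcall(X, &SinTmp, &CosTmp);
  *SinDst = SinTmp;
  *CosDst = CosTmp;
}

// After: the store destinations are passed straight to the call,
// so no temporaries are needed.
void storeDirect(double X, double *SinDst, double *CosDst) {
  sincosLibcall(X, SinDst, CosDst);
}
```

Both wrappers leave the same values in the destinations; the second simply avoids the intermediate slots.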

---

Note: As an NFC change, this also adds (and uses) `RTLIB::getFSINCOS()`.


# c18be321 19-Sep-2024 Phoebe Wang <phoebe.wang@intel.com>

Reland "[X86][BF16] Add libcall for F80 -> BF16 (#109116)" (#109143)

This reverts commit ababfee78714313a0cad87591b819f0944b90d09.

Add X86 FP80 check.


# a10c9f99 18-Sep-2024 Phoebe Wang <phoebe.wang@intel.com>

Revert "[X86][BF16] Add libcall for F80 -> BF16" (#109140)

Reverts llvm/llvm-project#109116


# 76eda76f 18-Sep-2024 Phoebe Wang <phoebe.wang@intel.com>

[X86][BF16] Add libcall for F80 -> BF16 (#109116)

This fixes #108936, but the calling convention doesn't match with GCC. I
doubt we have such a lib function for now, so leave the calling
convention as is.


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4
# db67a66e 31-Aug-2024 Brandon Wu <brandon.wu@sifive.com>

Revert "[RISCV] RISCV vector calling convention (2/2)" (#97994)

This reverts commit 91dd844aa499d69c7ff75bf3156e2e3593a88057.

Stacked on https://github.com/llvm/llvm-project/pull/97993


# e78156a0 21-Aug-2024 Sumanth Gundapaneni <sumanth.gundapaneni@amd.com>

Scalarize the vector inputs to llvm.lround intrinsic by default. (#101054)

The verifier is updated in a separate patch to allow vector types for the
llvm.lround and llvm.llround intrinsics.
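
Scalarizing here means handling a vector input one element at a time. A minimal C++ model of that idea, assuming a fixed-width array as the "vector" and the hypothetical helper name `lroundScalarized`:

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Model of scalarization: apply the scalar operation to each lane in turn.
template <std::size_t N>
std::array<long, N> lroundScalarized(const std::array<float, N> &In) {
  std::array<long, N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = std::lround(In[I]); // rounds halfway cases away from zero
  return Out;
}
```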


Revision tags: llvmorg-19.1.0-rc3
# fb9e685f 15-Aug-2024 YunQiang Su <syq@debian.org>

Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)

C23 introduced new functions fminimum_num and fmaximum_num, and they
follow the minimumNumber and maximumNumber of IEEE754-2019. Let's
introduce new intrinsics to support them.
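
A scalar C++ sketch of the minimumNumber/maximumNumber behaviour, as far as it can be modelled portably: a NaN operand is ignored in favour of a number, two NaNs give a quiet NaN, and -0.0 is ordered below +0.0. The function names are hypothetical, and the sketch cannot tell signaling from quiet NaNs, so it treats both the same way.

```cpp
#include <cmath>
#include <limits>

// Model of minimumNumber: prefer a number over NaN, order -0.0 below +0.0.
double minimumNum(double A, double B) {
  if (std::isnan(A))
    return std::isnan(B) ? std::numeric_limits<double>::quiet_NaN() : B;
  if (std::isnan(B))
    return A;
  if (A == B) // equal values, including the +0.0 / -0.0 pair
    return std::signbit(A) ? A : B;
  return A < B ? A : B;
}

// Model of maximumNumber: same NaN handling, +0.0 ordered above -0.0.
double maximumNum(double A, double B) {
  if (std::isnan(A))
    return std::isnan(B) ? std::numeric_limits<double>::quiet_NaN() : B;
  if (std::isnan(B))
    return A;
  if (A == B)
    return std::signbit(A) ? B : A;
  return A > B ? A : B;
}
```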

This patch introduces support for scalar values only. Support for the
vector (vp, vp.reduce, vector.reduce) and experimental.constrained
variants will be added in future patches.

With this patch, MIPSr6 and LoongArch work out of the box with
fcanonical and fmax/fmin.

AArch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, but
they have no fcanonical support yet.
I will add it in future patches.

The RISC-V FMIN/FMAX instructions follow the
minimumNumber/maximumNumber semantics of IEEE754-2019. Support for them
can be added in a future patch.

Background

https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735
Currently we have fminnum/fmaxnum, which behave differently on
different platforms for NUM vs sNaN:
1) Fallback to fmin(3)/fmax(3): return qNaN.
2) ARM64/ARM32+Neon: same as libc.
3) MIPSr6/LoongArch/RISC-V: return NUM.

The fix to make fminnum/fmaxnum follow minNUM/maxNUM of IEEE754-2008
will be submitted as separate patches.


# 0d074ba1 14-Aug-2024 hanbeom <kese111@gmail.com>

[DAG] Support saturated truncate (#99418)

A truncate is considered saturated if no additional conversion is required between the target and return values. If the target is saturated when attempting to truncate from a vector, there is an opportunity to optimize it.

Previously, each architecture had its own attempt at optimization, leading to redundant code. This patch implements common logic by introducing three new ISDs:

`ISD::TRUNCATE_SSAT_S`: When the operand is a signed value and the range of values matches the range of signed values of the destination type.

`ISD::TRUNCATE_SSAT_U`: When the operand is a signed value and the range of values matches the range of unsigned values of the destination type.

`ISD::TRUNCATE_USAT_U`: When the operand is an unsigned value and the range of values matches the range of unsigned values of the destination type.

These ISDs indicate a saturated truncate.
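
One way to read the three nodes is as "clamp to the destination range, then truncate". The scalar C++ model below follows that reading for a 32-bit source and an 8-bit destination; the function names and the chosen widths are assumptions for illustration, not taken from the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Signed source clamped to the signed destination range [-128, 127].
int8_t truncSSatS(int32_t V) {
  int32_t Clamped = std::max<int32_t>(INT8_MIN, std::min<int32_t>(V, INT8_MAX));
  return static_cast<int8_t>(Clamped);
}

// Signed source clamped to the unsigned destination range [0, 255].
uint8_t truncSSatU(int32_t V) {
  int32_t Clamped = std::max<int32_t>(0, std::min<int32_t>(V, UINT8_MAX));
  return static_cast<uint8_t>(Clamped);
}

// Unsigned source clamped to the unsigned destination range [0, 255].
uint8_t truncUSatU(uint32_t V) {
  return static_cast<uint8_t>(std::min<uint32_t>(V, UINT8_MAX));
}
```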

Fixes https://github.com/llvm/llvm-project/issues/85903


Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# 0ee32c45 24-Jul-2024 Sumanth Gundapaneni <sumanth.gundapaneni@amd.com>

[AMDGPU] Implement llvm.lrint intrinsic lowering (#98931)

This patch enables the target-independent lowering of llvm.lrint via
GlobalISel.
For SelectionDAG, the intrinsic is custom lowered for AMDGPU.


# fc832d53 23-Jul-2024 Sumanth Gundapaneni <sumanth.gundapaneni@amd.com>

[AMDGPU] Implement llvm.lround intrinsic lowering. (#98970)

This patch enables the target-independent lowering of llvm.lround via
GlobalISel. For SelectionDAG, the intrinsic is custom lowered for
AMDGPU. To support vector floating-point input for llvm.lround,
this patch extends the target-independent APIs and provides support for
scalarizing. pr98950 is needed to let the verifier allow vector
floating-point types.


Revision tags: llvmorg-20-init
# 615b7eea 20-Jul-2024 Joseph Huber <huberjn@outlook.com>

Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"

This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.

I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.


# 5893b1e2 20-Jul-2024 NAKAMURA Takumi <geek4civic@gmail.com>

Reformat


# 740161a9 20-Jul-2024 NAKAMURA Takumi <geek4civic@gmail.com>

Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"

This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69.
(llvmorg-19-init-17714-gc05126bdfc3b)
See #99610


# 177ce190 17-Jul-2024 Lawrence Benson <github@lawben.com>

[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289)

This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection mask, i.e., it
moves all selected values (i.e., where `mask[i] == 1`) to consecutive
lanes in the result vector. A `passthru` vector can be provided, from
which remaining lanes are filled.
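
A scalar C++ model of the compress operation described above. It assumes that lanes past the last selected value take the corresponding lanes of `passthru`; that detail is this sketch's reading, and the helper name `compressModel` is hypothetical.

```cpp
#include <array>
#include <cstddef>

// Model of vector compress: selected values move to consecutive low lanes.
template <typename T, std::size_t N>
std::array<T, N> compressModel(const std::array<T, N> &Value,
                               const std::array<bool, N> &Mask,
                               const std::array<T, N> &Passthru) {
  // Start from passthru so unselected tail lanes keep its values
  // (assumption: the corresponding passthru lanes are used).
  std::array<T, N> Result = Passthru;
  std::size_t Out = 0;
  for (std::size_t I = 0; I < N; ++I)
    if (Mask[I])
      Result[Out++] = Value[I]; // pack selected lanes in input order
  return Result;
}
```

For example, with values `{10, 20, 30, 40}`, mask `{1, 0, 1, 0}`, and passthru `{-1, -2, -3, -4}`, this model yields `{10, 30, -3, -4}`.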

The main reason for this is that the existing
`@llvm.masked.compressstore` has very strong constraints in that it can
only write values that were selected, resulting in guard branches for
all targets except AVX-512 (and even there the AMD implementation is
_very_ slow). More instruction sets support "compress" logic, but only
within registers. So to store the values, an additional store is needed.
But this combination is likely significantly faster on many targets, as it
avoids branches.

In follow up PRs, my plan is to add target-specific lowerings for x86,
SVE, and possibly RISCV. I also want to combine this with a store
instruction, as this is probably a common case and we can avoid some
memory writes in that case.

See [discussion in
forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663)
for initial discussion on the design.


# c05126bd 16-Jul-2024 Joseph Huber <huberjn@outlook.com>

[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)

Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevents internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.
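
The opt-out convention can be illustrated with a small stand-alone C++ sketch. The enum, class, and method names below are hypothetical and are not the interface added by the patch; the only detail taken from the text is that a null name means the target does not support that libcall.

```cpp
#include <cstring>

// Hypothetical libcall identifiers for this sketch only.
enum Libcall { LC_SINF, LC_COSF, LC_MEMCPY, NumLibcalls };

class LibcallNames {
  // One name per libcall; nullptr marks a libcall the target opted out of.
  const char *Names[NumLibcalls] = {"sinf", "cosf", "memcpy"};

public:
  // Opt out of a libcall by clearing its name.
  void disable(Libcall LC) { Names[LC] = nullptr; }

  // A symbol only needs to be kept if some still-enabled libcall uses it.
  bool isLibcallSymbol(const char *Symbol) const {
    for (const char *Name : Names)
      if (Name && std::strcmp(Name, Symbol) == 0)
        return true;
    return false;
  }
};
```

A linker-side consumer of such a table could then skip symbols for which the query returns false instead of keeping every possible runtime routine.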

This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step but do not want the overhead of
uncalled functions. (This trivially adds about a second to the link time.)


# 1ccd8756 12-Jul-2024 Joseph Huber <huberjn@outlook.com>

[NVPTX] Disable all RTLib libcalls (#98672)

Summary:
This patch explicitly prevents runtime calls from being emitted by the
NVPTX backend. This allows other utilities to know that we do not need
to worry about emitting these.
