Revision tags: llvmorg-21-init |
|
#
ab976a17 |
| 24-Jan-2025 |
Stephen Long <63318318+steplong@users.noreply.github.com> |
PreISelIntrinsicLowering: Lower llvm.exp/llvm.exp2 to a loop if scalable vec arg (#117568)
|
#
d9f165dd |
| 20-Jan-2025 |
Graham Hunter <graham.hunter@arm.com> |
[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810)
Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to
[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810)
Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder.
The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.
show more ...
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
0d9fc174 |
| 13-Dec-2024 |
Craig Topper <craig.topper@sifive.com> |
[GISel] Remove unused DataLayout operand from getApproximateEVTForLLT (#119833)
|
#
e55c1677 |
| 09-Dec-2024 |
Sergei Barannikov <barannikov88@gmail.com> |
[TargetLowering] Return Align from getByValTypeAlignment (NFC) (#119233)
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
28e4aad4 |
| 12-Nov-2024 |
Feng Zou <feng.zou@intel.com> |
[X86][BF16] Add libcall for FP128 -> BF16 (#115825)
This is to fix #115710.
|
#
ea859005 |
| 05-Nov-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
SafeStack: Respect alloca addrspace (#112536)
Just insert addrspacecast in cases where the alloca uses a
different address space, since I don't know what else you
could possibly do.
|
#
c3260c65 |
| 29-Oct-2024 |
Benjamin Maxwell <benjamin.maxwell@arm.com> |
[IR] Add `llvm.sincos` intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.
The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine a
[IR] Add `llvm.sincos` intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.
The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).
```
declare { float, float } @llvm.sincos.f32(float %Val)
declare { double, double } @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)
```
The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
show more ...
|
Revision tags: llvmorg-19.1.3 |
|
#
6ab26eab |
| 28-Oct-2024 |
Ellis Hoag <ellis.sparky.hoag@gmail.com> |
Check hasOptSize() in shouldOptimizeForSize() (#112626)
|
#
875afa93 |
| 16-Oct-2024 |
Tex Riddell <texr@microsoft.com> |
[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
Based on example PR #96222 an
[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
Based on example PR #96222 and fix PR #101268, with some differences due
to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).
- Add llvm.experimental.constrained.atan2 - Intrinsics.td,
ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in
BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp,
and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp - Expand for vectors, promote f16
- X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC
Part 4 for Implement the atan2 HLSL Function #70096.
show more ...
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
3073c3c2 |
| 24-Sep-2024 |
Benjamin Maxwell <benjamin.maxwell@arm.com> |
[SDAG] Avoid creating redundant stack slots when lowering FSINCOS (#108401)
When lowering `FSINCOS` to a library call (that takes output pointers)
we can avoid creating new stack allocations if the
[SDAG] Avoid creating redundant stack slots when lowering FSINCOS (#108401)
When lowering `FSINCOS` to a library call (that takes output pointers)
we can avoid creating new stack allocations if the results of the
`FSINCOS` are being stored. Instead, we can take the destination
pointers from the stores and pass those to the library call.
---
Note: As a NFC this also adds (and uses) `RTLIB::getFSINCOS()`.
show more ...
|
#
c18be321 |
| 19-Sep-2024 |
Phoebe Wang <phoebe.wang@intel.com> |
Reland "[X86][BF16] Add libcall for F80 -> BF16 (#109116)" (#109143)
This reverts commit ababfee78714313a0cad87591b819f0944b90d09.
Add X86 FP80 check.
|
#
a10c9f99 |
| 18-Sep-2024 |
Phoebe Wang <phoebe.wang@intel.com> |
Revert "[X86][BF16] Add libcall for F80 -> BF16" (#109140)
Reverts llvm/llvm-project#109116
|
#
76eda76f |
| 18-Sep-2024 |
Phoebe Wang <phoebe.wang@intel.com> |
[X86][BF16] Add libcall for F80 -> BF16 (#109116)
This fixes #108936, but the calling convention doesn't match with GCC. I
doubt we have such a lib function for now, so leave the calling
conventio
[X86][BF16] Add libcall for F80 -> BF16 (#109116)
This fixes #108936, but the calling convention doesn't match with GCC. I
doubt we have such a lib function for now, so leave the calling
convention as is.
show more ...
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4 |
|
#
db67a66e |
| 31-Aug-2024 |
Brandon Wu <brandon.wu@sifive.com> |
Revert "[RISCV] RISCV vector calling convention (2/2)" (#97994)
This reverts commit 91dd844aa499d69c7ff75bf3156e2e3593a88057.
Stacked on https://github.com/llvm/llvm-project/pull/97993
|
#
e78156a0 |
| 21-Aug-2024 |
Sumanth Gundapaneni <sumanth.gundapaneni@amd.com> |
Scalarize the vector inputs to llvm.lround intrinsic by default. (#101054)
Verifier is updated in a different patch to let the vector types for
llvm.lround and llvm.llround intrinsics.
|
Revision tags: llvmorg-19.1.0-rc3 |
|
#
fb9e685f |
| 15-Aug-2024 |
YunQiang Su <syq@debian.org> |
Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)
C23 introduced new functions fminimum_num and fmaximum_num, and they
follow the minimumNumber and maximumNumber of IE
Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)
C23 introduced new functions fminimum_num and fmaximum_num, and they
follow the minimumNumber and maximumNumber of IEEE754-2019. Let's
introduce new intrinsics to support them.
This patch introduces support only support for scalar values. The
support of
vector (vp, vp.reduce, vector.reduce),
experimental.constrained
will be added in future patches.
With this patch, MIPSr6 and LoongArch can work out of box with
fcanonical and fmax/fmin.
Aarch64/PowerPC64 can use the same login as MIPSr6 and LoongArch, while
they have no fcanonical support yet.
I will add it in future patches.
The FMIN/FMAX of RISC-V instructions follows the
minimumNumber/maximumNumber of IEEE754-2019. We can just add it in
future patch.
Background
https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735
Currently we have fminnum/fmaxnum, which have different behavior on
different platform for NUM vs sNaN:
1) Fallback to fmin(3)/fmax(3): return qNaN.
2) ARM64/ARM32+Neon: same as libc.
3) MIPSr6/LoongArch/RISC-V: return NUM.
And the fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008
will submit as separated patches.
show more ...
|
#
0d074ba1 |
| 14-Aug-2024 |
hanbeom <kese111@gmail.com> |
[DAG] Support saturated truncate (#99418)
A truncate is considered saturated if no additional conversion is required between the target and return values. If the target is saturated when attempting
[DAG] Support saturated truncate (#99418)
A truncate is considered saturated if no additional conversion is required between the target and return values. If the target is saturated when attempting to truncate from a vector, there is an opportunity to optimize it.
Previously, each architecture had its own attempt at optimization, leading to redundant code. This patch implements common logic by introducing three new ISDs:
`ISD::TRUNCATE_SSAT_S`: When the operand is a signed value and the range of values matches the range of signed values of the destination type.
`ISD::TRUNCATE_SSAT_U`: When the operand is a signed value and the range of values matches the range of unsigned values of the destination type.
`ISD::TRUNCATE_USAT_U`: When the operand is an unsigned value and the range of values matches the range of unsigned values of the destination type.
These ISDs indicate a saturated truncate.
Fixes https://github.com/llvm/llvm-project/issues/85903
show more ...
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1 |
|
#
0ee32c45 |
| 24-Jul-2024 |
Sumanth Gundapaneni <sumanth.gundapaneni@amd.com> |
[AMDGPU] Implement llvm.lrint intrinsic lowering (#98931)
This patch enabled the target-independent lowering of llvm.lrint via
GlobalISel.
For SelectionDAG, the instrinsic is custom lowered for AM
[AMDGPU] Implement llvm.lrint intrinsic lowering (#98931)
This patch enabled the target-independent lowering of llvm.lrint via
GlobalISel.
For SelectionDAG, the instrinsic is custom lowered for AMDGPU.
show more ...
|
#
fc832d53 |
| 23-Jul-2024 |
Sumanth Gundapaneni <sumanth.gundapaneni@amd.com> |
[AMDGPU] Implement llvm.lround intrinsic lowering. (#98970)
This patch enables the target-independent lowering of llvm.lround via
GlobalISel. For SelectionDAG, the instrinsic is custom lowered for
[AMDGPU] Implement llvm.lround intrinsic lowering. (#98970)
This patch enables the target-independent lowering of llvm.lround via
GlobalISel. For SelectionDAG, the instrinsic is custom lowered for
AMDGPU. In order to support vector floating point input for llvm.lround,
this patch extends the target independent APIs and provide support for
scalarizing. pr98950 is needed to let verifier allow vector floating
point types
show more ...
|
Revision tags: llvmorg-20-init |
|
#
615b7eea |
| 20-Jul-2024 |
Joseph Huber <huberjn@outlook.com> |
Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.
I moved the `ISD` dependencies into the CodeGen port
Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.
I moved the `ISD` dependencies into the CodeGen portion of the handling, it's a little awkward but it's the easiest solution I can think of for now.
show more ...
|
#
5893b1e2 |
| 20-Jul-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Reformat
|
#
740161a9 |
| 20-Jul-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69. (llvmorg-19-init-17714-gc05126bdfc3b) See #99610
|
#
177ce190 |
| 17-Jul-2024 |
Lawrence Benson <github@lawben.com> |
[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289)
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection ma
[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289)
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection mask, i.e., it
moves all selected values (i.e., where `mask[i] == 1`) to consecutive
lanes in the result vector. A `passthru` vector can be provided, from
which remaining lanes are filled.
The main reason for this is that the existing
`@llvm.masked.compressstore` has very strong constraints in that it can
only write values that were selected, resulting in guard branches for
all targets except AVX-512 (and even there the AMD implementation is
_very_ slow). More instruction sets support "compress" logic, but only
within registers. So to store the values, an additional store is needed.
But this combination is likely significantly faster on many target as it
avoids branches.
In follow up PRs, my plan is to add target-specific lowerings for x86,
SVE, and possibly RISCV. I also want to combine this with a store
instruction, as this is probably a common case and we can avoid some
memory writes in that case.
See [discussion in
forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663)
for initial discussion on the design.
show more ...
|
#
c05126bd |
| 16-Jul-2024 |
Joseph Huber <huberjn@outlook.com> |
[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)
Summary: The LTO pass and LLD linker have logic in them that forces extraction and prevent internalization of needed runtime
[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)
Summary: The LTO pass and LLD linker have logic in them that forces extraction and prevent internalization of needed runtime calls. However, these currently take all RTLibcalls into account, even if the target does not support them. The target opts-out of a libcall if it sets its name to nullptr. This patch pulls this logic out into a class in the header so that LTO / lld can use it to determine if a symbol actually needs to be kept.
This is important for targets like AMDGPU that want to be able to use `lld` to perform the final link step, but does not want the overhead of uncalled functions. (This adds like a second to the link time trivially)
show more ...
|
#
1ccd8756 |
| 12-Jul-2024 |
Joseph Huber <huberjn@outlook.com> |
[NVPTX] Disable all RTLib libcalls (#98672)
Summary: This patch explicitly disables runtime calls to be emitted from the NVPTX backend. This allows other utilities to know that we do not need to wor
[NVPTX] Disable all RTLib libcalls (#98672)
Summary: This patch explicitly disables runtime calls to be emitted from the NVPTX backend. This allows other utilities to know that we do not need to worry about emitting these.
show more ...
|