Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
a77346ba |
| 06-Jan-2025 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[IRBuilder] Refactor FMF interface (#121657)
Up to now, the only way to set specified FMF flags in IRBuilder is to
use `FastMathFlagGuard`. It makes the code ugly and hard to maintain.
This patch introduces a helper class `FMFSource` to replace the original
parameter `Instruction *FMFSource` in IRBuilder. To maximize the
compatibility, it accepts an instruction or a specified FMF.
This patch also removes the use of `FastMathFlagGuard` in some simple
cases.
Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=f87a9db8322643ccbc324e317a75b55903129b55&to=9397e712f6010be15ccf62f12740e9b4a67de2f4&stat=instructions%3Au
|
Revision tags: llvmorg-19.1.6 |
|
#
855bc46b |
| 08-Dec-2024 |
Andreas Jonson <andjo403@hotmail.com> |
[InstCombine] Fold trunc nuw/nsw X to i1 -> true IFF X != 0 (#119131)
proof https://alive2.llvm.org/ce/z/prpPex
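The fold can be sanity-checked by brute force over all i8 values. The sketch below is only a model, not LLVM's implementation: it assumes `trunc nuw` is poison when any discarded bit is set and `trunc nsw` is poison when sign-extending the result does not reproduce the input. The helper names `trunc_to_i1` and `fold_holds` are hypothetical.

```python
def trunc_to_i1(x, flag, bits=8):
    """Model `trunc <flag> iN x to i1`; return None for poison.

    x is an unsigned bit pattern in [0, 2**bits).
    """
    low = x & 1
    if flag == "nuw":
        # nuw: every discarded high bit must be zero.
        return low if (x >> 1) == 0 else None
    if flag == "nsw":
        # nsw: sign-extending the i1 result must reproduce x.
        sext = (2**bits - 1) if low else 0
        return low if sext == x else None
    raise ValueError(flag)


def fold_holds(bits=8):
    """Check that every non-poison trunc result equals (x != 0)."""
    for flag in ("nuw", "nsw"):
        for x in range(2**bits):
            r = trunc_to_i1(x, flag, bits)
            if r is not None and r != int(x != 0):
                return False
    return True
```

Under this model the only non-poison inputs are 0 and 1 for `nuw`, and 0 and -1 for `nsw`, so the result is true exactly when X is non-zero.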
|
#
99dc3967 |
| 06-Dec-2024 |
John Brawn <john.brawn@arm.com> |
[InstCombine] Make fptrunc combine use intersection of fast math flags (#118808)
These combines involve swapping the fptrunc with its operand, and using
the intersection of fast math flags is the safest option: e.g. if we
have (fptrunc (fneg ninf x)) then (fneg ninf (fptrunc x)) will not be
correct, because if x is not within the range of the destination type the
result of (fptrunc x) will be inf.
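The overflow behind this argument can be reproduced outside LLVM. The following is a minimal Python sketch, assuming the host's double-to-float conversion follows IEEE-754 round-to-nearest (so out-of-range values become infinity). `fptrunc_to_float` is a hypothetical helper that uses `ctypes.c_float` to model `fptrunc double ... to float`.

```python
import ctypes
import math


def fptrunc_to_float(x):
    """Round a binary64 value to binary32 and return it as a Python float."""
    return ctypes.c_float(x).value


def swapped_operand_is_inf(x):
    """True if x is finite but (fptrunc x) is infinite.

    In that case `fneg ninf x` is fine on the original operand, while a
    `ninf` flag on `fneg (fptrunc x)` would be violated.
    """
    return math.isfinite(x) and math.isinf(fptrunc_to_float(x))
```

For example, `1e300` is a finite double far above the largest finite float, so it is the kind of value for which keeping `ninf` after the swap would be wrong.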
|
#
d09632ba |
| 05-Dec-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Remove nusw handling in ptrtoint of gep fold (NFCI) (#118804)
Now that #111144 infers gep nuw, we no longer have to repeat the
inference in this fold.
|
Revision tags: llvmorg-19.1.5 |
|
#
4d8eb009 |
| 25-Nov-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Remove SPF guard for trunc transforms (#117535)
This shouldn't be necessary anymore now that SPF patterns are
canonicalized to intrinsics.
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
fa789dff |
| 11-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation for
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e., just lookup, no creation).
|
Revision tags: llvmorg-19.1.1 |
|
#
795c24c6 |
| 28-Sep-2024 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[InstCombine] foldVecExtTruncToExtElt - extend to handle trunc(lshr(extractelement(x,c1),c2)) -> extractelement(bitcast(x),c3) patterns. (#109689)
This patch moves the existing trunc+extractelement -> extractelement+bitcast fold into a foldVecExtTruncToExtElt helper and extends the helper to handle trunc+lshr+extractelement cases as well.
Fixes #107404
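The index arithmetic behind this pattern can be illustrated with a small Python model. It is a sketch under stated assumptions, not the helper's actual logic: a `<4 x i32>` source vector, an `i8` result, a little-endian layout, and a shift amount `c2` that is a multiple of 8. The function names are hypothetical.

```python
import struct


def src(vec, c1, c2):
    """trunc(lshr(extractelement(x, c1), c2)) from i32 down to i8."""
    return (vec[c1] >> c2) & 0xFF


def tgt(vec, c1, c2):
    """extractelement(bitcast(x to <16 x i8>), c3), little-endian layout."""
    as_bytes = struct.pack("<4I", *vec)  # model the bitcast as a byte view
    c3 = c1 * 4 + c2 // 8                # 4 bytes per i32, c2/8 bytes into it
    return as_bytes[c3]
```

On a big-endian target the byte index within each element would be mirrored, so `c3` would be computed differently.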
|
#
790f2eb1 |
| 17-Sep-2024 |
Alex MacLean <amaclean@nvidia.com> |
[InstCombine] Avoid simplifying bitcast of undef to a zeroinitializer vector (#108872)
In some cases, if an undef value is the product of another instcombine
simplification, a bitcast of undef is simplified to a zeroinitializer
vector instead of undef.
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4 |
|
#
170a21e7 |
| 21-Aug-2024 |
Marius Kamp <msk@posteo.org> |
[InstCombine] Extend Fold of Zero-extended Bit Test (#102100)
Previously, (zext (icmp ne (and X, (1 << ShAmt)), 0)) has only been
folded if the bit width of X and the result were equal. Use a trunc or
zext instruction to also support other bit widths.
This is a follow-up to commit 533190acdb9d2ed774f96a998b5c03be3df4f857,
which introduced a regression: (zext (icmp ne (and (lshr X ShAmt) 1) 0))
is not folded any longer to (zext/trunc (and (lshr X ShAmt) 1)) since
the commit introduced the fold of (icmp ne (and (lshr X ShAmt) 1) 0) to
(icmp ne (and X (1 << ShAmt)) 0). The change introduced by this commit
restores this fold.
Alive proof: https://alive2.llvm.org/ce/z/MFkNXs
Relates to issue #86813 and pull request #101838.
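The equivalence between the two forms can be checked exhaustively for small widths. The following Python sketch uses hypothetical names and models values as unsigned integers; because Python integers are unbounded, the final zext or trunc to the result width does not change the 0/1 value and is left implicit.

```python
def src(x, sh_amt):
    """zext (icmp ne (and X, (1 << ShAmt)), 0)"""
    return int((x & (1 << sh_amt)) != 0)


def tgt(x, sh_amt):
    """zext/trunc (and (lshr X, ShAmt), 1)"""
    return (x >> sh_amt) & 1


def equivalent(bits=8):
    """Compare both forms for every value and every in-range shift amount."""
    return all(
        src(x, s) == tgt(x, s)
        for x in range(2**bits)
        for s in range(bits)
    )
```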
|
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
b455edbc |
| 31-Jul-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[InstCombine] Recognize copysign idioms (#101324)
This patch folds `(bitcast (or (and (bitcast X to int), signmask), nneg
Y) to fp)` into `copysign((bitcast Y to fp), X)`. I found this pattern
exists in some graphics applications/math libraries.
Alive2: https://alive2.llvm.org/ce/z/ggQZV2
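To see why the bit pattern matches `copysign`, the idiom can be modeled on 64-bit doubles in Python. This is a sketch with hypothetical helper names; it assumes `Y` is an integer whose sign bit is clear (the `nneg` condition) and compares results by bit pattern.

```python
import math
import struct

SIGN_MASK = 1 << 63


def f64_to_bits(f):
    """bitcast double to i64"""
    return struct.unpack("<Q", struct.pack("<d", f))[0]


def bits_to_f64(b):
    """bitcast i64 to double"""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def src(x, y_bits):
    """bitcast (or (and (bitcast X to int), signmask), nneg Y) to fp"""
    if y_bits & SIGN_MASK:
        raise ValueError("Y must be non-negative (sign bit clear)")
    return bits_to_f64((f64_to_bits(x) & SIGN_MASK) | y_bits)


def tgt(x, y_bits):
    """copysign((bitcast Y to fp), X)"""
    return math.copysign(bits_to_f64(y_bits), x)
```

Because the sign bit of `Y` is known to be zero, the `or` only ever installs the sign of `X`, which is exactly what `copysign` does.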
|
Revision tags: llvmorg-19.1.0-rc1 |
|
#
abacc522 |
| 25-Jul-2024 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
Fix unused variable warning. NFC.
|
#
dfeb3991 |
| 25-Jul-2024 |
James Y Knight <jyknight@google.com> |
Remove the `x86_mmx` IR type. (#98505)
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.
This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.
This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.
Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)
Works towards issue #98272.
|
Revision tags: llvmorg-20-init |
|
#
d873630f |
| 12-Jul-2024 |
Nikita Popov <npopov@redhat.com> |
Revert "[InstCombine] Generalize ptrtoint(gep) fold (NFC)"
This reverts commit c45f939e34dafaf0f57fd1d93df7df5cc89f1dec.
This refactoring turned out to not be useful for the case I had originally in mind, so revert it for now.
|
#
c45f939e |
| 12-Jul-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Generalize ptrtoint(gep) fold (NFC)
We're currently handling a special case of ptrtoint gep -> add ptrtoint. Reframe the code to make it easier to add more patterns for this transform.
|
#
4502ea89 |
| 11-Jul-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] More precise nuw preservation in ptrtoint of gep fold
We can transfer a nuw flag from the gep to the add. Additionally, the inbounds + nneg case can be relaxed to nusw + nneg. Finally, don't forget to pass the correct context instruction to SimplifyQuery.
|
#
440af98a |
| 18-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Avoid use of ConstantExpr::getShl()
Use IRBuilder instead. Also use ImmConstant to guarantee that this will fold.
|
#
534f8569 |
| 17-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Don't preserve context across div
We can't preserve the context across a non-speculatable instruction, as this might introduce a trap. Alternatively, we could also insert all the replacement instructions at the use-site, but that would be a more intrusive change for the sake of this edge case.
Fixes https://github.com/llvm/llvm-project/issues/95547.
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
6bf1601a |
| 20-May-2024 |
Monad <yanwqmonad@gmail.com> |
[InstCombine] Fold pointer adding in integer to arithmetic add (#91596)
Fold
``` llvm
define i32 @src(i32 %x, i32 %y) {
%base = inttoptr i32 %x to ptr
%ptr = getelementptr inbounds i8, ptr %base, i32 %y
%r = ptrtoint ptr %ptr to i32
ret i32 %r
}
```
where both `%base` and `%ptr` have only one use, to
``` llvm
define i32 @tgt(i32 %x, i32 %y) {
%r = add i32 %x, %y
ret i32 %r
}
```
The `add` can be `nuw` if the GEP is `inbounds` and the offset is
non-negative. The relevant Alive2 proof is
https://alive2.llvm.org/ce/z/nP3RWy.
### Motivation
It seems unnecessary to convert `int` to `ptr` just to get its offset.
In most cases, the two forms generate the same assembly, but the GEP form
may miss some optimizations because the analysis of `GEP` is not as thorough
as that of arithmetic operations. One example is
https://github.com/dtcxzyw/llvm-opt-benchmark/blob/e3c822bf41df3a88ca38eba884a52b0cc7e70bf2/bench/protobuf/optimized/generated_message_reflection.cc.ll#L39860-L39873
``` llvm
%conv.i188 = zext i32 %145 to i64
%add.i189 = add i64 %conv.i188, %125
%146 = load i16, ptr %num_aux_entries10.i, align 2
%conv2.i191 = zext i16 %146 to i64
%mul.i192 = shl nuw nsw i64 %conv2.i191, 3
%add3.i193 = add i64 %add.i189, %mul.i192
%147 = inttoptr i64 %add3.i193 to ptr
%sub.ptr.lhs.cast.i195 = ptrtoint ptr %144 to i64
%sub.ptr.rhs.cast.i196 = ptrtoint ptr %143 to i64
%sub.ptr.sub.i197 = sub i64 %sub.ptr.lhs.cast.i195, %sub.ptr.rhs.cast.i196
%add.ptr = getelementptr inbounds i8, ptr %147, i64 %sub.ptr.sub.i197
%sub.ptr.lhs.cast = ptrtoint ptr %add.ptr to i64
%sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %125
```
where `%conv.i188` first adds `%125` and then subtracts `%125` (the
result is `%sub.ptr.sub`), which can be optimized.
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
34c89eff |
| 30-Apr-2024 |
Monad <yanwqmonad@gmail.com> |
[InstCombine] Fold `trunc nuw/nsw (x xor y) to i1` to `x != y` (#90408)
Fold:
``` llvm
define i1 @src(i8 %x, i8 %y) {
%xor = xor i8 %x, %y
%r = trunc nuw/nsw i8 %xor to i1
ret i1 %r
}
define i1 @tgt(i8 %x, i8 %y) {
%r = icmp ne i8 %x, %y
ret i1 %r
}
```
Proof: https://alive2.llvm.org/ce/z/dcuHmn
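As an informal complement to the Alive2 proof, the refinement can be brute-forced over all pairs of i8 values. The Python sketch below is a model with hypothetical names: `None` stands for poison, `trunc nuw` is assumed non-poison only for inputs 0 and 1, and `trunc nsw` only for 0 and -1 (0xFF).

```python
def trunc_i8_to_i1(v, flag):
    """Model `trunc <flag> i8 v to i1`; None means poison."""
    allowed = (0, 1) if flag == "nuw" else (0, 0xFF)
    return (v & 1) if v in allowed else None


def src(x, y, flag):
    """%xor = xor i8 %x, %y ; %r = trunc <flag> i8 %xor to i1"""
    return trunc_i8_to_i1(x ^ y, flag)


def tgt(x, y):
    """%r = icmp ne i8 %x, %y"""
    return int(x != y)


def refines():
    """tgt must match src wherever src is not poison."""
    return all(
        src(x, y, flag) in (None, tgt(x, y))
        for flag in ("nuw", "nsw")
        for x in range(256)
        for y in range(256)
    )
```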
|
#
873889b7 |
| 25-Apr-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Extract logic for "emit offset and rewrite gep" (NFC)
|
#
1baa3850 |
| 18-Apr-2024 |
Nikita Popov <npopov@redhat.com> |
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.
This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.
As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.
There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
|
#
da04e4af |
| 17-Apr-2024 |
Noah Goldstein <goldstein.w.n@gmail.com> |
[InstCombine] Use `auto *` instead of `auto` in `visitSIToFP`; NFC
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
b6bd41db |
| 21-Mar-2024 |
Noah Goldstein <goldstein.w.n@gmail.com> |
[InstCombine] Add canonicalization of `sitofp` -> `uitofp nneg`
This is essentially the same as #82404 but has the `nneg` flag which allows the backend to reliably undo the transform.
Closes #88299
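The reason the flag makes the transform reversible can be shown with a small model. This Python sketch assumes the canonicalization applies when the operand is known to be non-negative, which is the property the `nneg` flag records. It uses i8 bit patterns, hypothetical function names, and `None` for poison.

```python
def sitofp_i8(bits):
    """Interpret an 8-bit pattern as signed, then convert to float."""
    signed = bits - 256 if bits & 0x80 else bits
    return float(signed)


def uitofp_nneg_i8(bits):
    """uitofp with nneg: poison (None) if the sign bit is set."""
    if bits & 0x80:
        return None
    return float(bits)


def backend_can_undo():
    """Wherever uitofp nneg is not poison, sitofp gives the same value."""
    return all(
        uitofp_nneg_i8(b) in (None, sitofp_i8(b))
        for b in range(256)
    )
```

Without the flag, a plain `uitofp` and `sitofp` disagree on every pattern with the sign bit set, so a backend could not safely switch back.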
|
#
b1094776 |
| 11-Apr-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[InstCombine] Infer nsw/nuw for trunc (#87910)
This patch adds support for inferring trunc's nsw/nuw flags.
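The conditions under which each flag is valid can be stated per value. The sketch below is a Python model with hypothetical names, not the inference code itself: the actual patch has to establish these conditions for every possible runtime value, while this model only checks one concrete bit pattern.

```python
def trunc_is_nuw(x, dst_bits):
    """nuw holds if all discarded high bits of x are zero."""
    return (x >> dst_bits) == 0


def trunc_is_nsw(x, src_bits, dst_bits):
    """nsw holds if sign-extending the truncated value gives x back."""
    low = x & ((1 << dst_bits) - 1)
    sign = low >> (dst_bits - 1)
    # Fill bits dst_bits..src_bits-1 with copies of the sign bit.
    high = ((1 << src_bits) - (1 << dst_bits)) if sign else 0
    return (low | high) == x
```

For an `i16` to `i8` truncation, `0x007F` satisfies both conditions, `0xFF80` (that is, -128) satisfies only `nsw`, and `0x0080` (that is, 128) satisfies only `nuw`.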
|
#
56b3222b |
| 29-Mar-2024 |
Monad <yanwqmonad@gmail.com> |
[InstCombine] Remove the canonicalization of `trunc` to `i1` (#84628)
Remove the canonicalization of `trunc` to `i1` according to the
suggestion of
https://github.com/llvm/llvm-project/pull/83829#issuecomment-1986801166
https://github.com/llvm/llvm-project/blob/a84e66a92d7b97f68aa3ae7d2c5839f3fb0d291d/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp#L737-L745
Alive2: https://alive2.llvm.org/ce/z/cacYVA
|