Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
4f7dc1b5 |
| 12-Jan-2025 |
Ruhung <143302514+Ruhung@users.noreply.github.com> |
[InstCombine] Fold (add (add A, 1), (sext (icmp ne A, 0))) to call umax(A, 1) (#122491)
Transform (add (add A, 1), (sext (icmp ne A, 0))) into call umax(A, 1).
Fixes #121853.
Alive2: https://alive2.llvm.org/ce/z/TweTan
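The identity behind this fold can be sanity-checked by brute force. The following is a minimal sketch, not the InstCombine implementation: it models the pattern on 8-bit values only, and the helper names (`addAddSextNe`, `umaxWithOne`, `umaxFoldHoldsForAllI8`) are hypothetical. The linked Alive2 proof remains the authoritative check.

```cpp
#include <algorithm>
#include <cstdint>

// Left-hand side of the fold, modeled on i8 with wrapping arithmetic.
// sext(i1 true) is all-ones (0xFF), sext(i1 false) is 0.
std::uint8_t addAddSextNe(std::uint8_t a) {
  std::uint8_t inc = static_cast<std::uint8_t>(a + 1);
  std::uint8_t sext = (a != 0) ? 0xFF : 0x00;
  return static_cast<std::uint8_t>(inc + sext);
}

// Right-hand side: umax(A, 1).
std::uint8_t umaxWithOne(std::uint8_t a) {
  return std::max<std::uint8_t>(a, 1);
}

// Compare both sides for every i8 value (hypothetical helper).
bool umaxFoldHoldsForAllI8() {
  for (unsigned a = 0; a < 256; ++a) {
    std::uint8_t v = static_cast<std::uint8_t>(a);
    if (addAddSextNe(v) != umaxWithOne(v))
      return false;
  }
  return true;
}
```

For A == 0 the compare is false, so the sum is 0 + 1 + 0 = 1; for any other A the sign-extended compare contributes -1 and cancels the increment, leaving A.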
|
#
69ba5657 |
| 06-Jan-2025 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[InstCombine] Handle commuted pattern for `((X s/ C1) << C2) + X` (#121737)
Closes https://github.com/llvm/llvm-project/issues/121700
|
#
a77346ba |
| 06-Jan-2025 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[IRBuilder] Refactor FMF interface (#121657)
Up to now, the only way to set specified FMF flags in IRBuilder is to
use `FastMathFlagGuard`. It makes the code ugly and hard to maintain.
This patch introduces a helper class `FMFSource` to replace the original
parameter `Instruction *FMFSource` in IRBuilder. To maximize the
compatibility, it accepts an instruction or a specified FMF.
This patch also removes the use of `FastMathFlagGuard` in some simple
cases.
Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=f87a9db8322643ccbc324e317a75b55903129b55&to=9397e712f6010be15ccf62f12740e9b4a67de2f4&stat=instructions%3Au
|
Revision tags: llvmorg-19.1.6 |
|
#
4a0d53a0 |
| 13-Dec-2024 |
Ramkumar Ramachandra <ramkumar.ramachandra@codasip.com> |
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.
This is a preparatory step in migrating the codebase over to
CmpPredicate. Since no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
|
#
c3175c50 |
| 10-Dec-2024 |
fengfeng <153487255+fengfeng09@users.noreply.github.com> |
[InstCombine] Fold `(X & C1) - (X & C2) --> X & (C1 ^ C2)` if `(C1 & C2) == C2` (#119316)
if (C1 & C2) == C2 then (X & C1) - (X & C2) --> X & (C1 ^ C2)
Alive2: https://alive2.llvm.org/ce/z/JvQU8w
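A brute-force sketch of why the precondition matters, assuming 8-bit values and hypothetical helper names: when every bit of C2 is also set in C1, the subtraction never borrows, so it simply clears the C2 bits from `X & C1`. This is an illustration only, not the code added by the patch.

```cpp
#include <cstdint>

// (X & C1) - (X & C2) on i8 with wrapping subtraction.
std::uint8_t maskSubLhs(std::uint8_t x, std::uint8_t c1, std::uint8_t c2) {
  return static_cast<std::uint8_t>((x & c1) - (x & c2));
}

// X & (C1 ^ C2).
std::uint8_t maskSubRhs(std::uint8_t x, std::uint8_t c1, std::uint8_t c2) {
  return static_cast<std::uint8_t>(x & (c1 ^ c2));
}

// Check every (X, C1, C2) triple that satisfies (C1 & C2) == C2.
bool maskSubFoldHoldsForAllI8() {
  for (unsigned c1 = 0; c1 < 256; ++c1)
    for (unsigned c2 = 0; c2 < 256; ++c2) {
      if ((c1 & c2) != c2)
        continue; // precondition not met, the fold does not apply
      for (unsigned x = 0; x < 256; ++x)
        if (maskSubLhs(x, c1, c2) != maskSubRhs(x, c1, c2))
          return false;
    }
  return true;
}
```

With disjoint masks such as C1 = 0x0F and C2 = 0xF0 the two sides differ for X = 0xFF, which is why the fold is guarded.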
|
#
66ed8fb9 |
| 04-Dec-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Fix use after free
Make sure we only access cached nowrap flags.
|
#
4a7abfe0 |
| 04-Dec-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Preserve nuw in OptimizePointerDifference
If both the geps and the subs are nuw the new sub is also nuw.
Proof: https://alive2.llvm.org/ce/z/mM8UvF
|
Revision tags: llvmorg-19.1.5 |
|
#
1a3eace8 |
| 01-Dec-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[InstCombine] Fold `umax(X, C) + -C` into `usub.sat(X, C)` (#118195)
Alive2: https://alive2.llvm.org/ce/z/oSWe5S
Closes https://github.com/llvm/llvm-project/issues/118155
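The equivalence can be checked exhaustively on a small bit width. The sketch below assumes i8 and uses hypothetical helper names; it models `usub.sat` as unsigned subtraction clamped at zero and is not taken from the patch itself.

```cpp
#include <algorithm>
#include <cstdint>

// umax(X, C) + -C on i8, with wrapping add.
std::uint8_t umaxPlusNegC(std::uint8_t x, std::uint8_t c) {
  std::uint8_t m = std::max(x, c);
  std::uint8_t negC = static_cast<std::uint8_t>(0u - c);
  return static_cast<std::uint8_t>(m + negC);
}

// usub.sat(X, C): unsigned subtraction that saturates at zero.
std::uint8_t usubSat(std::uint8_t x, std::uint8_t c) {
  return static_cast<std::uint8_t>(x > c ? x - c : 0);
}

// Compare both forms for every pair of i8 values.
bool usubSatFoldHoldsForAllI8() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 256; ++c)
      if (umaxPlusNegC(x, c) != usubSat(x, c))
        return false;
  return true;
}
```

If X >= C the maximum is X and the result is X - C; otherwise the maximum is C and the result is C - C = 0, which matches saturation.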
|
#
18abc7e0 |
| 25-Nov-2024 |
David Green <david.green@arm.com> |
[PatternMatch] Introduce m_c_Select (#114328)
This matches m_Select(m_Value(), L, R) or m_Select(m_Value(), R, L).
|
#
abac5be6 |
| 19-Nov-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Fix APInt ctor assertion
The (extended) bit width might not fit into the (non-extended) type, resulting in an incorrect truncation of the compared value.
Fix this by using m_SpecificInt(), which is both simpler and handles this correctly.
Fixes the assertion failure reported in: https://github.com/llvm/llvm-project/pull/114539#issuecomment-2485799395
|
Revision tags: llvmorg-19.1.4 |
|
#
db90673d |
| 18-Nov-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[InstCombine] Re-queue users of phi when nsw/nuw flags of add are inferred (#113933)
This patch re-queues users of phi when one of its incoming add
instructions is updated. If an add instruction is updated, the analysis
results of phis may be improved. Thus we may further fold some users of
this phi node.
See the following case:
```
define i8 @trunc_in_loop_exit_block() {
; CHECK-LABEL: @trunc_in_loop_exit_block(
; CHECK-NEXT: entry:
; CHECK-NEXT: br label [[LOOP:%.*]]
; CHECK: loop:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[PHI:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ [[IV_NEXT]], [[LOOP_LATCH]] ]
; CHECK-NEXT: [[CMP:%.*]] = icmp samesign ult i32 [[IV]], 100
; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[EXIT:%.*]]
; CHECK: loop.latch:
; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
; CHECK-NEXT: br label [[LOOP]]
; CHECK: exit:
; CHECK-NEXT: [[TRUNC:%.*]] = trunc i32 [[PHI]] to i8
; CHECK-NEXT: ret i8 [[TRUNC]]
;
entry:
br label %loop
loop:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
%phi = phi i32 [ 1, %entry ], [ %iv.next, %loop.latch ]
%cmp = icmp ult i32 %iv, 100
br i1 %cmp, label %loop.latch, label %exit
loop.latch:
%iv.next = add i32 %iv, 1
br label %loop
exit:
%trunc = trunc i32 %phi to i8
ret i8 %trunc
}
```
`%iv u< 100` -> infer `nsw/nuw` for `%iv.next = add i32 %iv, 1`
-> `%iv` is non-negative -> infer `samesign` for `%cmp = icmp ult i32
%iv, 100`.
Without re-queuing users of phi nodes, we cannot improve `%cmp` in one
iteration.
Address review comment
https://github.com/llvm/llvm-project/pull/112642#discussion_r1804712271.
This patch also fixes some non-fixpoint issues in tests.
|
#
6c1fc821 |
| 15-Nov-2024 |
Nikolay Panchenko <npanchen@modular.com> |
[InstCombine] fold `sub(zext(ptrtoint),zext(ptrtoint))` (#115369)
On a 32-bit target, if pointer arithmetic with `addrspace` is used in an i64
computation, the missed fold in InstCombine results in suboptimal
performance, unlike the same code compiled for a 64-bit target.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
fa789dff |
| 11-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4 |
|
#
a6edcea2 |
| 22-Aug-2024 |
Noah Goldstein <goldstein.w.n@gmail.com> |
[InstCombine] Simplify `(add/sub (sub/add) (sub/add))` irrespective of use-count
Added folds:
- `(add (sub X, Y), (sub Z, X))` -> `(sub Z, Y)`
- `(sub (add X, Y), (add X, Z))` -> `(sub Y, Z)`
The fold is typically handled in the `Reassociate` pass, but it fails if the inner `sub`/`add` are multi-use. Less importantly, Reassociate doesn't propagate flags correctly.
This patch adds the fold explicitly to InstCombine.
Proofs: https://alive2.llvm.org/ce/z/p6JyRP
Closes #105866
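Both added folds are plain modular-arithmetic cancellations, which a small exhaustive check makes concrete. This is a sketch on i8 with hypothetical helper names; it ignores nsw/nuw flags entirely and is not the InstCombine code.

```cpp
#include <cstdint>

// i8 add and sub with two's-complement wraparound.
std::uint8_t wrapAdd8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(a + b);
}
std::uint8_t wrapSub8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(a - b);
}

// (add (sub X, Y), (sub Z, X)) versus (sub Z, Y).
bool addOfSubsFolds(std::uint8_t x, std::uint8_t y, std::uint8_t z) {
  return wrapAdd8(wrapSub8(x, y), wrapSub8(z, x)) == wrapSub8(z, y);
}

// (sub (add X, Y), (add X, Z)) versus (sub Y, Z).
bool subOfAddsFolds(std::uint8_t x, std::uint8_t y, std::uint8_t z) {
  return wrapSub8(wrapAdd8(x, y), wrapAdd8(x, z)) == wrapSub8(y, z);
}

// Check both folds for every (X, Y, Z) triple of i8 values.
bool addSubFoldsHoldForAllI8() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      for (unsigned z = 0; z < 256; ++z)
        if (!addOfSubsFolds(x, y, z) || !subOfAddsFolds(x, y, z))
          return false;
  return true;
}
```

In both patterns X appears once with each sign, so it cancels regardless of wraparound, and no extra instruction is needed even when the inner operations have other users.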
|
#
be7d08cd |
| 21-Aug-2024 |
Volodymyr Vasylkun <vvmposeydon@gmail.com> |
[InstCombine] Fold `sext(A < B) + zext(A > B)` into `ucmp/scmp(A, B)` (#103833)
This change also covers the fold of `zext(A > B) - zext(A < B)` since it
is already being canonicalized into the aforementioned pattern.
Proof: https://alive2.llvm.org/ce/z/AgnfMn
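A short sketch of the value-level reasoning, assuming i8 operands and hypothetical helper names: `sext` of a true `i1` is -1 and `zext` of a true `i1` is 1, so the sum is exactly a three-way comparison result. Instantiating the template with an unsigned type models `ucmp`, and with a signed type models `scmp`.

```cpp
#include <cstdint>

// sext(A < B) + zext(A > B), with the extended results held in an int.
template <typename T>
int sextLtPlusZextGt(T a, T b) {
  int sextLt = (a < b) ? -1 : 0; // sext(i1 true) == -1
  int zextGt = (a > b) ? 1 : 0;  // zext(i1 true) == 1
  return sextLt + zextGt;
}

// Three-way comparison returning -1, 0 or 1.
template <typename T>
int threeWayCompare(T a, T b) {
  return (a < b) ? -1 : ((a > b) ? 1 : 0);
}

// Check the signed and unsigned variants for every pair of i8 values.
bool cmpFoldHoldsForAllI8() {
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b) {
      auto sa = static_cast<std::int8_t>(a);
      auto sb = static_cast<std::int8_t>(b);
      if (sextLtPlusZextGt(sa, sb) != threeWayCompare(sa, sb))
        return false;
    }
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b) {
      auto ua = static_cast<std::uint8_t>(a);
      auto ub = static_cast<std::uint8_t>(b);
      if (sextLtPlusZextGt(ua, ub) != threeWayCompare(ua, ub))
        return false;
    }
  return true;
}
```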
|
#
a1058776 |
| 21-Aug-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Remove some of the complexity-based canonicalization (#91185)
The idea behind this canonicalization is that it allows us to handle fewer
patterns, because we know that some will be canonicalized away. This is
indeed very useful to e.g. know that constants are always on the right.
However, this is only useful if the canonicalization is actually
reliable. This is the case for constants, but not for arguments: Moving
these to the right makes it look like the "more complex" expression is
guaranteed to be on the left, but this is not actually the case in
practice. It fails as soon as you replace the argument with another
instruction.
The end result is that it looks like things correctly work in tests,
while they actually don't. We use the "thwart complexity-based
canonicalization" trick to handle this in tests, but it's often a
challenge for new contributors to get this right, and based on the
regressions this PR originally exposed, we clearly don't get this right
in many cases.
For this reason, I think that it's better to remove this complexity
canonicalization. It will make it much easier to write tests for
commuted cases and make sure that they are handled.
|
Revision tags: llvmorg-19.1.0-rc3 |
|
#
dd9a99f2 |
| 16-Aug-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Preserve nsw in A + -B fold
This was already done for -B + A, but not for A + -B.
Proof: https://alive2.llvm.org/ce/z/F3V2yZ
|
Revision tags: llvmorg-19.1.0-rc2 |
|
#
62e9f409 |
| 29-Jul-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[PatternMatch] Use `m_SpecificCmp` matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
stockfish/movegen.ll 2541620819 2538599412 -0.12%
minetest/profiler.cpp.ll 431724935 431246500 -0.11%
abc/luckySwap.c.ll 581173720 580581935 -0.10%
abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
01ceb984 |
| 12-Jul-2024 |
Craig Topper <craig.topper@sifive.com> |
[InstCombine] Fold (zext (X +nuw C)) + -C --> zext(X) when zext has additional use. (#98533)
We have a general fold for (zext (X +nuw C2)) + C1 --> zext (X + (C2 + trunc(C1))),
but this fold is disabled if the zext has an additional use.
If the two constants cancel, we can fold the whole expression to
zext(X) without increasing the number of instructions.
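The cancellation can be illustrated with a small model, under stated assumptions: X and C are i8, the zext widens to i16, and `-C` is taken to be the negation of the zero-extended constant in the wide type. That reading of `-C`, and all helper names, are assumptions of this sketch rather than details confirmed by the commit message.

```cpp
#include <cstdint>

// zext(X + C) + -C, with the inner add done in i8 and the outer add in i16.
std::uint16_t zextAddThenCancel(std::uint8_t x, std::uint8_t c) {
  std::uint8_t narrow = static_cast<std::uint8_t>(x + c); // inner add, may wrap
  std::uint16_t wide = narrow;                            // zext i8 -> i16
  std::uint16_t negC = static_cast<std::uint16_t>(0u - c); // -C in the wide type
  return static_cast<std::uint16_t>(wide + negC);
}

// Check that the result equals zext(X) whenever the inner add does not
// wrap, i.e. whenever the nuw flag would be valid.
bool zextCancelFoldHoldsUnderNuw() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 256; ++c) {
      if (x + c > 0xFF)
        continue; // nuw would be violated, so the fold makes no claim here
      if (zextAddThenCancel(x, c) != x)
        return false;
    }
  return true;
}
```

When the inner add wraps (for example X = 200, C = 100), the widened value no longer equals zext(X) + zext(C), which is why the pattern requires `nuw`.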
|
Revision tags: llvmorg-18.1.8 |
|
#
96af1149 |
| 08-Jun-2024 |
csstormq <swust_xiaoqiangxu@163.com> |
[InstCombine] Preserve the nsw/nuw flags for (X | Op01C) + Op1C --> X + (Op01C + Op1C) (#94586)
This patch simplifies `sdiv` to `udiv` by preserving the `nsw` flag for
`(X | Op01C) + Op1C --> X + (Op01C + Op1C)` if the sum of `Op01C` and
`Op1C` will not overflow, and preserves the `nuw` flag unconditionally.
Alive2 Proofs (provided by @nikic): https://alive2.llvm.org/ce/z/nrdCZT,
https://alive2.llvm.org/ce/z/YnJHnH
|
Revision tags: llvmorg-18.1.7 |
|
#
0310f7f2 |
| 30-May-2024 |
Noah Goldstein <goldstein.w.n@gmail.com> |
[InstCombine] Fold `(add X, (sext/zext (icmp eq X, C)))`
We can convert this to a select based on the `(icmp eq X, C)`, then constant fold the addition, with the true arm being `(add C, (sext/zext 1))` and the false arm being `(add X, 0)`, e.g.
- `(select (icmp eq X, C), (add C, (sext/zext 1)), (add X, 0))`.
This is essentially a specialization of the only case that seems to actually show up from #89020
Closes #93840
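The select form described above can be compared against the original add on a small bit width. This is a minimal sketch assuming i8 values; `extendBit`, `addExtEq`, `selectForm` and `eqFoldHoldsForAllI8` are hypothetical names, not functions from InstCombine.

```cpp
#include <cstdint>

// Extend an i1 compare result to i8: sext gives 0xFF, zext gives 0x01.
std::uint8_t extendBit(bool bit, bool isSigned) {
  return static_cast<std::uint8_t>(bit ? (isSigned ? 0xFF : 0x01) : 0x00);
}

// Original pattern: add X, (sext/zext (icmp eq X, C)).
std::uint8_t addExtEq(std::uint8_t x, std::uint8_t c, bool isSigned) {
  return static_cast<std::uint8_t>(x + extendBit(x == c, isSigned));
}

// Select form: the true arm is the constant C + ext(1), the false arm is X.
std::uint8_t selectForm(std::uint8_t x, std::uint8_t c, bool isSigned) {
  std::uint8_t trueArm = static_cast<std::uint8_t>(c + extendBit(true, isSigned));
  return (x == c) ? trueArm : x;
}

// Check sext and zext variants for every (X, C) pair of i8 values.
bool eqFoldHoldsForAllI8() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 256; ++c)
      for (int s = 0; s < 2; ++s)
        if (addExtEq(x, c, s != 0) != selectForm(x, c, s != 0))
          return false;
  return true;
}
```

The true arm depends only on the constant, so it folds to C + 1 for zext or C - 1 for sext at compile time.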
|
Revision tags: llvmorg-18.1.6 |
|
#
0d335f78 |
| 09-May-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Handle more commuted cases in matchesSquareSum()
|
Revision tags: llvmorg-18.1.5 |
|
#
d26002ac |
| 26-Apr-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Fix use-after-free in OptimizePointerDifference()
EmitGEPOffset() may remove the old GEP, so be sure to cache the inbounds flag beforehand.
|
#
cbe1760f |
| 26-Apr-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs (#90017)
Currently, the OptimizePointerDifference fold does not trigger when
working on the sub of two geps where one of the geps has multiple uses,
to avoid duplicating the offset arithmetic too much.
However, there are cases where performing it would still be
clearly profitable, e.g. test_sub_ptradd_multiuse.
This patch drops the one-use restriction using the same strategy we use
in GEP comparison folds: If there are multiple uses, we rewrite the GEP
to use the expanded offset arithmetic instead (effectively
canonicalizing it into ptradd representation).
Fixes https://github.com/llvm/llvm-project/issues/88231.
|
#
cbb0477e |
| 25-Apr-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[InstCombine] Fold fneg over select (#89947)
As we fold fabs over select in
https://github.com/llvm/llvm-project/pull/86390, this patch folds fneg
over select to make sure nabs idioms are generated.
Addresses
https://github.com/llvm/llvm-project/pull/86390#discussion_r1568862289.
Alive2 for FMF propagation: https://alive2.llvm.org/ce/z/-h6Vuo
|