minmax-fold.ll - OpenGrok history log for /llvm-project/llvm/test/Transforms/InstCombine/minmax-fold.ll

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# d7fe2cf8	18-Dec-2024	tianleliu <tianle.l.liu@intel.com>	[InstCombine] Widen Sel width after Cmp to generate Max/Min intrinsics. (#118932) When Sel(Cmp) are in different integer type, From: (K and N mean width, K < N; a and b are src operands.) bN = [InstCombine] Widen Sel width after Cmp to generate Max/Min intrinsics. (#118932) When Sel(Cmp) are in different integer type, From: (K and N mean width, K < N; a and b are src operands.) bN = Ext(bK) cond = Cmp(aN, bN) aK = Trunc aN retK = Sel(cond, aK, bK) To: bN = Ext(bK) cond = Cmp(aN, bN) retN = Sel(cond, aN, bN) retK = Trunc retN Though Sel's operands width becomes larger, the benefit of making type width in Sel the same as Cmp, is for combing to max/min intrinsics, and also better performance for SIMD instructions. References of correctness: https://alive2.llvm.org/ce/z/Y4Kegm https://alive2.llvm.org/ce/z/qFtjtR Reference of generated code comparision: https://gcc.godbolt.org/z/o97svGvYM https://gcc.godbolt.org/z/59Ynj91ov show more ...
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5
# 979a0356	02-Dec-2024	Veera <32646674+veera-sivarajan@users.noreply.github.com>	[InstCombine] Fold `X Pred C2 ? X BOp C1 : C2 BOp C1` to `min/max(X, C2) BOp C1` (#116888) Fixes #82414. General Proof: https://alive2.llvm.org/ce/z/ERjNs4 Proof for Tests: https://alive2.llvm. [InstCombine] Fold `X Pred C2 ? X BOp C1 : C2 BOp C1` to `min/max(X, C2) BOp C1` (#116888) Fixes #82414. General Proof: https://alive2.llvm.org/ce/z/ERjNs4 Proof for Tests: https://alive2.llvm.org/ce/z/K-934G This PR transforms `select` instructions of the form `select (Cmp X C1) (BOp X C2) C3` to `BOp (min/max X C1) C2` iff `C3 == BOp C1 C2`. This helps in eliminating a noop loop in https://github.com/rust-lang/rust/issues/123845 but does not improve optimizations. show more ...
Revision tags: llvmorg-19.1.4
# 38fffa63	06-Nov-2024	Paul Walker <paul.walker@arm.com>	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)
Revision tags: llvmorg-19.1.3
# 583fa4f5	15-Oct-2024	Alexey Bader <alexey.bader@intel.com>	[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088) Today, InstCombine can fold fcmp+select patterns to minnum/maxnum intrinsics when the nnan and nsz flags are set. The [InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088) Today, InstCombine can fold fcmp+select patterns to minnum/maxnum intrinsics when the nnan and nsz flags are set. The ordering of the operands in both the fcmp and select instructions is important for the folding to occur. maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult} The second pattern is supposed to make the order of the operands in the select instruction irrelevant. However, the pattern matching code uses the CmpInst::getInversePredicate method to invert the comparison predicate. This method doesn't take into account the fast-math flags, which can lead missing the folding opportunity. The patch extends the pattern matching code to handle unordered fcmp instructions. This allows the folding to occur even when the select instruction has the operands in the inverse order. New maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt} The same changes are applied to the minnum intrinsic. show more ...
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4
# a1058776	21-Aug-2024	Nikita Popov <npopov@redhat.com>	[InstCombine] Remove some of the complexity-based canonicalization (#91185) The idea behind this canonicalization is that it allows us to handle less patterns, because we know that some will be can [InstCombine] Remove some of the complexity-based canonicalization (#91185) The idea behind this canonicalization is that it allows us to handle less patterns, because we know that some will be canonicalized away. This is indeed very useful to e.g. know that constants are always on the right. However, this is only useful if the canonicalization is actually reliable. This is the case for constants, but not for arguments: Moving these to the right makes it look like the "more complex" expression is guaranteed to be on the left, but this is not actually the case in practice. It fails as soon as you replace the argument with another instruction. The end result is that it looks like things correctly work in tests, while they actually don't. We use the "thwart complexity-based canonicalization" trick to handle this in tests, but it's often a challenge for new contributors to get this right, and based on the regressions this PR originally exposed, we clearly don't get this right in many cases. For this reason, I think that it's better to remove this complexity canonicalization. It will make it much easier to write tests for commuted cases and make sure that they are handled. show more ...
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5
# b8f3024a	24-Apr-2024	Andreas Jonson <andjo403@hotmail.com>	[InstCombine] Swap out range metadata to range attribute for cttz/ctlz/ctpop (#88776) Since all optimizations that use range metadata now also handle range attribute, this patch replaces writes of [InstCombine] Swap out range metadata to range attribute for cttz/ctlz/ctpop (#88776) Since all optimizations that use range metadata now also handle range attribute, this patch replaces writes of range metadata for call instructions to range attributes. show more ...
# d9a5aa8e	17-Apr-2024	Nikita Popov <npopov@redhat.com>	[PatternMatch] Do not accept undef elements in m_AllOnes() and friends (#88217) Change all the cstval_pred_ty based PatternMatch helpers (things like m_AllOnes and m_Zero) to only allow poison elem [PatternMatch] Do not accept undef elements in m_AllOnes() and friends (#88217) Change all the cstval_pred_ty based PatternMatch helpers (things like m_AllOnes and m_Zero) to only allow poison elements inside vector splats, not undef elements. Historically, we used to represent non-demanded elements in vectors using undef. Nowadays, we use poison instead. As such, I believe that support for undef in vector splats is no longer useful. At the same time, while poison splat elements are pretty much always safe to ignore, this is not generally the case for undef elements. We have existing miscompiles in our tests due to this (see the masked-merge-*.ll tests changed here) and it's easy to miss such cases in the future, now that we write tests using poison instead of undef elements. I think overall, keeping support for undef elements no longer makes sense, and we should drop it. Once this is done consistently, I think we may also consider allowing poison in m_APInt by default, as doing that change is much less risky than doing the same with undef. This change involves a substantial amount of test changes. For most tests, I've just replaced undef with poison, as I don't think there is value in retaining both. For some tests (where the distinction between undef and poison is important), I've duplicated tests. show more ...
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3
# b6bd41db	21-Mar-2024	Noah Goldstein <goldstein.w.n@gmail.com>	[InstCombine] Add canonicalization of `sitofp` -> `uitofp nneg` This is essentially the same as #82404 but has the `nneg` flag which allows the backend to reliably undo the transform. Closes #88299
# 6960ace5	20-Mar-2024	Noah Goldstein <goldstein.w.n@gmail.com>	Revert "[InstCombine] Canonicalize `(sitofp x)` -> `(uitofp x)` if `x >= 0`" This reverts commit d80d5b923c6f611590a12543bdb33e0c16044d44. It wasn't a particularly important transform to begin with Revert "[InstCombine] Canonicalize `(sitofp x)` -> `(uitofp x)` if `x >= 0`" This reverts commit d80d5b923c6f611590a12543bdb33e0c16044d44. It wasn't a particularly important transform to begin with and caused some codegen regressions on targets that prefer `sitofp` so dropping. Might re-visit along with adding `nneg` flag to `uitofp` so its easily reversable for the backend. show more ...
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# d80d5b92	20-Feb-2024	Noah Goldstein <goldstein.w.n@gmail.com>	[InstCombine] Canonicalize `(sitofp x)` -> `(uitofp x)` if `x >= 0` Just a standard canonicalization. Proofs: https://alive2.llvm.org/ce/z/9W4VFm Closes #82404
# 641d160a	25-Feb-2024	Yingwei Zheng <dtcxzyw2333@gmail.com>	[InstCombine] Fold umax(smax)/smin(umin) with non-negative constants (#82929) This patch extends `reassociateMinMaxWithConstants` to fold the following patterns: ``` umax (smax X, nneg C0), nneg [InstCombine] Fold umax(smax)/smin(umin) with non-negative constants (#82929) This patch extends `reassociateMinMaxWithConstants` to fold the following patterns: ``` umax (smax X, nneg C0), nneg C1 --> smax X, (umax C0, C1) smin (umin X, nneg C0), nneg C1 --> umin X, (smin/umin C0, C1) ``` Alive2: https://alive2.llvm.org/ce/z/wfEj-e Address the comment https://github.com/llvm/llvm-project/pull/82472#pullrequestreview-1896922897. show more ...
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5
# 5918f623	08-Nov-2023	Nikita Popov <npopov@redhat.com>	[InstCombine] Infer zext nneg flag (#71534) Use KnownBits to infer the nneg flag on zext instructions. Currently we only set nneg when converting sext -> zext, but don't set it when we have a ze [InstCombine] Infer zext nneg flag (#71534) Use KnownBits to infer the nneg flag on zext instructions. Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent. show more ...
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# cbca9ce9	31-Mar-2023	Nikita Popov <npopov@redhat.com>	[InstCombine] Remove min/max special case when folding into select Now that we canonicalize to min/max intrinsics, we no longer need to guard against this here. In fact, it seems like the issue fro [InstCombine] Remove min/max special case when folding into select Now that we canonicalize to min/max intrinsics, we no longer need to guard against this here. In fact, it seems like the issue from PR46271 was the final push for introducing the intrinsics in the first place... show more ...
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 379de123	16-Dec-2022	Nikita Popov <npopov@redhat.com>	[InstCombine] Preserve instruction name in replaceInstUsesWith() Currently InstCombine folds using the `return replaceInstUsesWith(V, Builder.CreateFoo())` pattern do not preserve the original name [InstCombine] Preserve instruction name in replaceInstUsesWith() Currently InstCombine folds using the `return replaceInstUsesWith(V, Builder.CreateFoo())` pattern do not preserve the original name of the instruction. To preserve the name, you either have to use something like `return FooInst::Create(...)` which is usually less nice, or go out of the way to preserve the name with takeName(). We often don't do that. This patch instead preserves the name in replaceInstUsesWith() when replacing a named instruction with an unnamed instruction. To be conservative, I also added a zero-use check, which is a proxy for the case where the instruction was just created, rather than an existing one reused. Possibly we could drop that part. As InstCombine tests are robust against renames this does not cause any test diffs, so I regenerated a random test to show the effects. Differential Revision: https://reviews.llvm.org/D140192 show more ...
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2
# 4ab40eca	03-Oct-2022	Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>	[test][InstCombine] Update some test cases to use opaque pointers These tests cases were converted using the script at https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34 Differential Re [test][InstCombine] Update some test cases to use opaque pointers These tests cases were converted using the script at https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34 Differential Revision: https://reviews.llvm.org/D135094 show more ...
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4
# 82190f91	06-May-2022	Nikita Popov <npopov@redhat.com>	[InstCombine] Fold icmp of select with implied condition When threading the icmp over the select, check whether the condition can be folded when taking into account the select condition.
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# e1608a9d	24-Feb-2022	Nikita Popov <npopov@redhat.com>	[InstCombine] Remove SPF min/max canonicalization Now that we canonicalize SPF min/max to intrinsics, there's no need to canonicalize the structure of the SPF min/max itself anymore. This is concept [InstCombine] Remove SPF min/max canonicalization Now that we canonicalize SPF min/max to intrinsics, there's no need to canonicalize the structure of the SPF min/max itself anymore. This is conceptually NFC, but in practice does slightly impact results due to folding order differences. show more ...
# a266af72	14-Feb-2022	Nikita Popov <npopov@redhat.com>	[InstCombine] Canonicalize SPF to min/max intrinsics Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max [InstCombine] Canonicalize SPF to min/max intrinsics Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max. Once this sticks, we can stop matching SPF min/max in various places, and can remove hacks we have for preventing infinite loops and breaking of SPF canonicalization. Differential Revision: https://reviews.llvm.org/D98152 show more ...
Revision tags: llvmorg-14.0.0-rc1
# acdc419c	04-Feb-2022	Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>	[test] Use -passes=instcombine instead of -instcombine in lots of tests. NFC Another step moving away from the deprecated syntax of specifying pass pipeline in opt. Differential Revision: https://r [test] Use -passes=instcombine instead of -instcombine in lots of tests. NFC Another step moving away from the deprecated syntax of specifying pass pipeline in opt. Differential Revision: https://reviews.llvm.org/D119081 show more ...
Revision tags: llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# e6ad9ef4	14-Dec-2021	Philip Reames <listmail@philipreames.com>	[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transform [instcombine] Canonicalize constant index type to i64 for extractelement/insertelement The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transforms are inconsistent about which types we end up with based on visit order. I'm restricting this to constants as for non-constants, we'd have to decide whether the simplicity was worth extra instructions. For constants, there are no extra instructions. We chose the canonical type as i64 arbitrarily. We might consider changing this to something else in the future if we have cause. Differential Revision: https://reviews.llvm.org/D115387 show more ...
Revision tags: llvmorg-13.0.1-rc1
# 1376301c	06-Nov-2021	Nikita Popov <nikita.ppv@gmail.com>	[InstCombine] Canonicalize range test idiom InstCombine converts range tests of the form (X > C1 && X < C2) or (X < C1 \|\| X > C2) into checks of the form (X + C3 < C4) or (X + C3 > C4). It is possib [InstCombine] Canonicalize range test idiom InstCombine converts range tests of the form (X > C1 && X < C2) or (X < C1 \|\| X > C2) into checks of the form (X + C3 < C4) or (X + C3 > C4). It is possible to express all range tests in either of these forms (with different choices of constants), but currently neither of them is considered canonical. We may have equivalent range tests using either ult or ugt. This proposes to canonicalize all range tests to use ult. An alternative would be to canonicalize to either ult or ugt depending on the specific constants involved -- e.g. in practice we currently generate ult for && style ranges and ugt for \|\| style ranges when going through the insertRangeTest() helper. In fact, the "clamp like" fold was relying on this, which is why I had to tweak it to not assume whether inversion is needed based on just the predicate. Proof: https://alive2.llvm.org/ce/z/_SP_rQ Differential Revision: https://reviews.llvm.org/D113366 show more ...
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 6db163c7	13-Aug-2021	Qiu Chaofan <qiucofan@cn.ibm.com>	Pre-commit two-way clamp tests
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 7e18cd88	20-Mar-2021	Nikita Popov <nikita.ppv@gmail.com>	[InstCombine] Whitelist non-refining folds in SimplifyWithOpReplaced This is an alternative to D98391/D98585, playing things more conservatively. If AllowRefinement == false, then we don't use InstS [InstCombine] Whitelist non-refining folds in SimplifyWithOpReplaced This is an alternative to D98391/D98585, playing things more conservatively. If AllowRefinement == false, then we don't use InstSimplify methods at all, and instead explicitly implement a small number of non-refining folds. Most cases are handled by constant folding, and I only had to add three folds to cover our unit tests / test-suite. While this may lose some optimization power, I think it is safer to approach from this direction, given how many issues this code has already caused. Differential Revision: https://reviews.llvm.org/D99027 show more ...
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 9d70dbdc	27-Dec-2020	Juneyoung Lee <aqjune@gmail.com>	[InstCombine] use poison as placeholder for undemanded elems Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic b [InstCombine] use poison as placeholder for undemanded elems Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic because undef isn’t undefined enough. Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison. This makes a few straightforward optimizations incorrect, such as: ``` ; https://bugs.llvm.org/show_bug.cgi?id=44185 define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %xv = insertelement <4 x float> %q, float %x, i32 2 %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef } ret <4 x float> %r ; %r[3] is undef } => define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %r = insertelement <4 x float> %y, float %x, i32 1 ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison } Transformation doesn't verify! ERROR: Target is more poisonous than source ``` I’d like to suggest 1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too) 2. Updating shufflevector’s semantics to return poison element if mask is undef Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay. m_Undef() matches PoisonValue as well, so existing optimizations will still fire. The only concern is hidden miscompilations that will go incorrect when poison constant is given. A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :( Instead, I’ll simply locally maintain the tests and run Alive2. If there is any bug found, I’ll report it. Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93586 show more ...
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1
# d12ec0f7	18-Jul-2020	Nikita Popov <nikita.ppv@gmail.com>	[InstCombine] Fix store merge worklist management (PR46680) Fixes https://bugs.llvm.org/show_bug.cgi?id=46680. Just like insertions through IRBuilder, InsertNewInstBefore() should be using the defe [InstCombine] Fix store merge worklist management (PR46680) Fixes https://bugs.llvm.org/show_bug.cgi?id=46680. Just like insertions through IRBuilder, InsertNewInstBefore() should be using the deferred worklist mechanism, so that processing of newly added instructions is prioritized. There's one side-effect of the worklist order change which could be classified as a regression. An add op gets pushed through a select that at the time is not a umax. We could add a reverse transform that tries to push adds in the reverse direction to restore a min/max, but that seems like a sure way of getting infinite loops... Seems like something that should best wait on min/max intrinsics. Differential Revision: https://reviews.llvm.org/D84109 show more ...
12 3