#
a4e47586 |
| 03-Jan-2025 |
Craig Topper <craig.topper@sifive.com> |
[ExpandMemCmp] Recognize canonical form of (icmp sle/sge X, 0) in getMemCmpOneBlock. (#121540)
This code recognizes special cases where the result of memcmp is
compared with 0. If the compare is sle/sge, then InstCombine
canonicalizes to (icmp slt X, 1) or (icmp sgt X, -1). We should
recognize those patterns too.
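A minimal sketch of why both spellings have to be recognized: over the integers, `x <= 0` and `x < 1` are the same predicate, as are `x >= 0` and `x > -1`. The function names and the sample list are illustrative only and are not taken from the LLVM sources.

```python
# Illustrative model only: Python ints stand in for a signed i32 memcmp result.
I32_MIN, I32_MAX = -(2**31), 2**31 - 1
SAMPLES = [I32_MIN, -2, -1, 0, 1, 2, I32_MAX]


def sle_zero(x: int) -> bool:
    """Source-level form: icmp sle X, 0."""
    return x <= 0


def slt_one(x: int) -> bool:
    """Canonical form described in the commit message: icmp slt X, 1."""
    return x < 1


def sge_zero(x: int) -> bool:
    """Source-level form: icmp sge X, 0."""
    return x >= 0


def sgt_minus_one(x: int) -> bool:
    """Canonical form described in the commit message: icmp sgt X, -1."""
    return x > -1


def forms_agree(samples=SAMPLES) -> bool:
    """Check that each source-level form matches its canonical form."""
    return all(
        sle_zero(x) == slt_one(x) and sge_zero(x) == sgt_minus_one(x)
        for x in samples
    )
```

A pattern matcher that looks only for a literal comparison against 0 would therefore miss the sle/sge cases once they have been canonicalized.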
|
#
4dfea22e |
| 03-Jan-2025 |
Craig Topper <craig.topper@sifive.com> |
[ExpandMemCmp][AArch64][PowerPC][RISCV][X86] Use llvm.ucmp instead of (sub (zext (icmp ugt)), (zext (icmp ult))). (#121530)
AArch64 and PowerPC look like improvements.
RISC-V is neutral.
X86 trades a dependency breaking xor before a seta for a movsx after a
sbbb. Depending on how the result is used, this movsx might go away.
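The two formulations compute the same three-way result, which the sketch below models with non-negative Python integers standing in for unsigned values. The helper names `old_pattern` and `ucmp` are illustrative; this models the semantics only and is not LLVM code.

```python
# Illustrative model only: non-negative Python ints stand in for unsigned values.
def old_pattern(a: int, b: int) -> int:
    """Model of (sub (zext (icmp ugt a, b)), (zext (icmp ult a, b)))."""
    ugt = int(a > b)  # zext of the i1 "unsigned greater than" result
    ult = int(a < b)  # zext of the i1 "unsigned less than" result
    return ugt - ult


def ucmp(a: int, b: int) -> int:
    """Model of an unsigned three-way compare: -1, 0, or 1."""
    if a < b:
        return -1
    if a > b:
        return 1
    return 0
```

Because the results agree for every input pair, the choice between the two forms only affects how well each backend can lower them.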
|
#
72db3f98 |
| 03-Jan-2025 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[RISCV] Allow tail memcmp expansion (#121460)
This optimization was introduced by #70469.
Like AArch64, we allow tail expansions for 3 on RV32 and 3/5/6 on RV64.
This can simplify the comparison and reduce the number of blocks.
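As a rough model of the effect for a 3-byte compare, the sketch below assumes that tail expansion treats the three bytes as one zero-extended big-endian value compared once, instead of a 2-byte compare followed by a 1-byte compare. That reading is an assumption about the mechanism, and the function names are hypothetical; neither is taken from the LLVM sources.

```python
# Illustrative model only; the mechanism modeled here is an assumption.
def memcmp3_two_blocks(a: bytes, b: bytes) -> int:
    """Compare 3 bytes as a 2-byte chunk, then a 1-byte chunk (two blocks)."""
    for lo, hi in ((0, 2), (2, 3)):
        x = int.from_bytes(a[lo:hi], "big")
        y = int.from_bytes(b[lo:hi], "big")
        if x != y:
            return -1 if x < y else 1
    return 0


def memcmp3_tail_expanded(a: bytes, b: bytes) -> int:
    """Compare 3 bytes as a single 24-bit big-endian value (one block)."""
    x = int.from_bytes(a[:3], "big")
    y = int.from_bytes(b[:3], "big")
    return (x > y) - (x < y)
```

Both functions return the same sign for any pair of 3-byte inputs; the second needs only one comparison, which is the kind of simplification the commit message describes.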
|
#
9122c523 |
| 15-Nov-2024 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)
This is based on other targets like PPC/AArch64 and some experiments.
This PR will only enable bidirectional scheduling and tracking register pressure.
Disclaimer: I haven't tested it on many cores; maybe some of these options should become features. I believe downstreams must have tried this before, so feedback is welcome.
|
#
7a5b040e |
| 06-Nov-2024 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[RISCV] Add initial support of memcmp expansion
There are two passes that have dependency on the implementation of `TargetTransformInfo::enableMemCmpExpansion` : `MergeICmps` and `ExpandMemCmp`.
This PR adds the initial implementation of `enableMemCmpExpansion` so that we can have some basic benefits from these two passes.
We currently don't enable expansion when unaligned access is not supported, because the `ExpandMemCmp` pass has some issues with unaligned loads and stores. We should fix these issues and enable the expansion later.
Vector case hasn't been tested as we don't generate inlined vector instructions for memcmp currently.
Reviewers: preames, arcbbb, topperc, asb, dtcxzyw
Reviewed By: topperc, preames
Pull Request: https://github.com/llvm/llvm-project/pull/107548
|
#
5adb5c05 |
| 06-Nov-2024 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[RISCV] Add tests for memcmp expansion
We add tests for the following cases:
* Length = 0, 1, 2, 3, 4, 5, 6, 7, 8, 15, 16, 31, 32, 63, 64, 127, 128, runtime.
* Comparisons against zero.
* RUN lines for scalar/vector w/ or w/o strict align.
* Optimize for size.
Reviewers: topperc, preames
Reviewed By: topperc, preames
Pull Request: https://github.com/llvm/llvm-project/pull/107824
|