#
84f7fb62 |
| 16-Jan-2024 |
Alex Bradbury <asb@igalia.com> |
[MachineScheduler] Add option to control reordering for store/load clustering (#75338)
Reordering based on the sort order of the MemOpInfo array was disabled
in <https://reviews.llvm.org/D72706>. H
[MachineScheduler] Add option to control reordering for store/load clustering (#75338)
Reordering based on the sort order of the MemOpInfo array was disabled
in <https://reviews.llvm.org/D72706>. However, it's not clear this is
desirable for al targets. It also makes it more difficult to compare the
incremental benefit of enabling load clustering in the selectiondag
scheduler as well was the machinescheduler, as the sdag scheduler does
seem to allow this reordering.
This patch adds a parameter that can control the behaviour on a
per-target basis.
Split out from #73789.
show more ...
|
#
ea85345e |
| 08-Dec-2023 |
Wang Pengcheng <wangpengcheng.pp@bytedance.com> |
[RISCV][NFC] Use raw_svector_ostream to construct key of SubtargetMap (#72964)
To simplify some code.
|
#
efc32f5e |
| 08-Dec-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use Triple::isRISCV64(). NFC
|
#
d0a39e61 |
| 01-Dec-2023 |
Piyou Chen <piyou.chen@sifive.com> |
[RISCV] default enable splitting regalloc between RVV and other (#72950)
This patch make riscv-split-regalloc as true by default.
It will not affect the codegen result if it vector register allo
[RISCV] default enable splitting regalloc between RVV and other (#72950)
This patch make riscv-split-regalloc as true by default.
It will not affect the codegen result if it vector register allocation
doesn't exist. If there is the vector register allocation, it may affect
the non-rvv register LiveInterval's segment/weight. It will make the
allocation in a different order.
show more ...
|
#
85c9c168 |
| 29-Nov-2023 |
Alex Bradbury <asb@igalia.com> |
[RISCV] Support load clustering in the MachineScheduler (off by default) (#73754)
This adds minimal support for load clustering, but disables it by
default. The intent is to iterate on the precise
[RISCV] Support load clustering in the MachineScheduler (off by default) (#73754)
This adds minimal support for load clustering, but disables it by
default. The intent is to iterate on the precise heuristic and the
question of turning this on by default in a separate PR. Although
previous discussion indicates hope that the MachineScheduler would
replace most uses of the SelectionDAG scheduler, it does seem most
targets aren't using MachineScheduler load clustering right now:
PPC+AArch64 seem to just use it to help with paired load/store formation
and although AMDGPU uses it for general clustering it also implements
ShouldScheduleLoadsNear for the SelectionDAG scheduler's clustering.
show more ...
|
Revision tags: llvmorg-17.0.6 |
|
#
ac4868ea |
| 16-Nov-2023 |
Piyou Chen <piyou.chen@sifive.com> |
[RISCV] Split regalloc between RVV and other (#72096)
Enable this flow by -riscv-split-regalloc=1 (default disable), and could
designate specific allocator to RVV by
-riscv-rvv-regalloc=<fast|basi
[RISCV] Split regalloc between RVV and other (#72096)
Enable this flow by -riscv-split-regalloc=1 (default disable), and could
designate specific allocator to RVV by
-riscv-rvv-regalloc=<fast|basic|greedy>
It uses the RegClass filter function to decide which regclass need to be
processed.
This patch is pre-requirement for supporting PostRA vsetvl insertion
pass.
show more ...
|
Revision tags: llvmorg-17.0.5 |
|
#
9bb69c1d |
| 10-Nov-2023 |
Wang Pengcheng <wangpengcheng.pp@bytedance.com> |
[RISCV] Enable LoopDataPrefetch pass (#66201)
So that we can benefit from data prefetch when `Zicbop` extension is supported.
Tune information for data prefetching are added in `RISCVTuneInfo`.
|
#
014390d9 |
| 02-Nov-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Implement cross basic block VXRM write insertion. (#70382)
This adds a new pass to insert VXRM writes for vector instructions. With
the goal of avoiding redundant writes.
The pass does 2
[RISCV] Implement cross basic block VXRM write insertion. (#70382)
This adds a new pass to insert VXRM writes for vector instructions. With
the goal of avoiding redundant writes.
The pass does 2 dataflow algorithms. The first is a forward data flow to
calculate where a VXRM value is available. The second is a backwards
dataflow to determine where a VXRM value is anticipated.
Finally, we use the results of these two dataflows to insert VXRM writes
where a value is anticipated, but not available.
The pass does not split critical edges so we aren't always able to
eliminate all redundancy.
The pass will only insert vxrm writes on paths that always require it.
show more ...
|
Revision tags: llvmorg-17.0.4 |
|
#
72e6c1c7 |
| 30-Oct-2023 |
Luke Lau <luke@igalia.com> |
[RISCV] Begin moving post-isel vector peepholes to a MF pass (#70342)
We currently have three postprocess peephole optimisations for vector
pseudos:
1) Masked pseudo with all ones mask -> unmask
[RISCV] Begin moving post-isel vector peepholes to a MF pass (#70342)
We currently have three postprocess peephole optimisations for vector
pseudos:
1) Masked pseudo with all ones mask -> unmasked pseudo
2) Merge vmerge pseudo into operand pseudo's mask
3) vmerge pseudo with all ones mask -> vmv.v.v pseudo
This patch aims to move these peepholes out of SelectionDAG and into a
separate RISCVFoldMasks MachineFunction pass.
There are a few motivations for doing this:
* The current SelectionDAG implementation operates on MachineSDNodes,
which are essentially MachineInstrs but require a bunch of logic to
reason about chain and glue operands. The RISCVII::has*Op helper
functions also don't exactly line up with the SDNode operands. Mutating
these pseudos and their operands in place becomes a good bit easier at
the MachineInstr level. For example, we would no longer need to check
for cycles in the DAG during performCombineVMergeAndVOps.
* Although it's further down the line, moving this code out of
SelectionDAG allows it to be reused by GlobalISel later on.
* In performCombineVMergeAndVOps, it may be possible to commute the
operands to enable folding in more cases (see
test/CodeGen/RISCV/rvv/vmadd-vp.ll). There is existing machinery to
commute operands in TII::commuteInstruction, but it's implemented on
MachineInstrs.
The pass runs straight after ISel, before any of the other machine SSA
optimization passes run. This is so that dead-mi-elimination can mop up
any vmsets that are no longer used (but if preferred we could try and
erase them from inside RISCVFoldMasks itself). This also means that
these peepholes are no longer run at codegen -O0, so this patch isn't
strictly NFC.
Only the performVMergeToVMv peephole is refactored in this patch, the
remaining two would be implemented later. And as noted by @preames, it
should be possible to move doPeepholeSExtW out of SelectionDAG as well.
show more ...
|
#
109aa586 |
| 26-Oct-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add an experimental pseudoinstruction to represent a rematerializable constant materialization sequence. (#69983)
Rematerialization during register allocation is currently limited to a
sing
[RISCV] Add an experimental pseudoinstruction to represent a rematerializable constant materialization sequence. (#69983)
Rematerialization during register allocation is currently limited to a
single instruction with no inputs.
This patch introduces a pseudoinstruction that represents the
materialization of a constant. I've started with a sequence of 2
instructions for now, which covers at least the common LUI+ADDI(W) case.
This instruction will be expanded into real instructions immediately
after register allocation using a new pass. This gives the post-RA
scheduler a chance to separate the 2 instructions to improve ILP.
I believe this matches the approach used by AArch64.
Unfortunately, this loses some CSE opportunies when an LUI value is used
by multiple constants with different LSBs.
This feature is off by default and a new backend command line option is
added to enable it for testing.
This avoids the spill and reloads reported in #69586.
show more ...
|
#
f4231bf4 |
| 19-Oct-2023 |
Wang Pengcheng <137158460+wangpc-pp@users.noreply.github.com> |
[RISCV] Replace PostRAScheduler with PostMachineScheduler (#68696)
Just like what other targets have done.
And this will make DAG mutations like MacroFusion take effect.
|
Revision tags: llvmorg-17.0.3 |
|
#
45636ecf |
| 07-Oct-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add sink-and-fold support for RISC-V. (#67602)
This uses the recently introduced sink-and-fold support in MachineSink.
https://reviews.llvm.org/D152828
This enables folding ADDI into
[RISCV] Add sink-and-fold support for RISC-V. (#67602)
This uses the recently introduced sink-and-fold support in MachineSink.
https://reviews.llvm.org/D152828
This enables folding ADDI into load/store addresses.
Enabling by default will be a separate PR.
show more ...
|
Revision tags: llvmorg-17.0.2 |
|
#
8e87dc10 |
| 22-Sep-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV][GISel] Add a post legalizer combiner and enable a couple comb… (#67053)
…ines.
We have an existing test that shows benefit from redundant_and and
identity combines so use them as a start
[RISCV][GISel] Add a post legalizer combiner and enable a couple comb… (#67053)
…ines.
We have an existing test that shows benefit from redundant_and and
identity combines so use them as a starting point.
show more ...
|
#
93fde2ea |
| 19-Sep-2023 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[RISCV] Add a pass to rewrite rd to x0 for non-computational instrs whose return values are unused
When AMOs are used to implement parallel reduction operations, typically the return value would be
[RISCV] Add a pass to rewrite rd to x0 for non-computational instrs whose return values are unused
When AMOs are used to implement parallel reduction operations, typically the return value would be discarded. This patch adds a peephole pass `RISCVDeadRegisterDefinitions`. It rewrites `rd` to `x0` when `rd` is marked as dead. It may improve the register allocation and reduce pipeline hazards on CPUs without register renaming and OOO. Comparison with GCC: https://godbolt.org/z/bKaxnEcec
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D158759
show more ...
|
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0 |
|
#
8677aaa1 |
| 07-Sep-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV][GISel] Add initial pre-legalizer combiners copying from AArch64.
|
#
0a1aa6cd |
| 14-Sep-2023 |
Arthur Eubanks <aeubanks@google.com> |
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future chang
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
show more ...
|
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
a63bd7e9 |
| 14-Aug-2023 |
Philip Reames <preames@rivosinc.com> |
[RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands
In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structur
[RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands
In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295), I greatly increased the number of IMPLICIT_DEF operands to our vector instructions. This has turned out to have an unexpected negative impact because MachineCSE does not CSE IMPLICIT_DEFs, and thus does not CSE any instruction with an IMPLICIT_DEF operand. SelectionDAG *does* CSE the same case, but that only covers the same block case, not the cross block case. This lead to the performance regression reported in https://github.com/llvm/llvm-project/issues/64282.
This change is a slightly ugly hack to side step the issue. Instead of fixing the root cause (lack of CSE for IMPLICIT_DEF) or undoing the operand changes, we leave the extra operand in place, and use NoReg in place of IMPLICIT_DEF. I then convert back to IMPLICIT_DEF just before register allocation so that ProcessImplicitDefs and TwoAddressInstructions can do the normal transforms to Undef tied registers.
We may end up backporting this into the 17.x release branch. Given how late in the release cycle this is landing, that's much less likely now, but still a possibility.
Differential Revision: https://reviews.llvm.org/D156909
show more ...
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
c0221e00 |
| 07-Jul-2023 |
WuXinlong <821408745@qq.com> |
[RISCV] Add a pass to combine `cm.pop` and `ret` insts
`RISCVPushPopOptimizer.cpp` combine `cm.pop` and `ret` to generates `cm.popretz` or `cm.popret` .
Reviewed By: craig.topper
Differential Revi
[RISCV] Add a pass to combine `cm.pop` and `ret` insts
`RISCVPushPopOptimizer.cpp` combine `cm.pop` and `ret` to generates `cm.popretz` or `cm.popret` .
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D150416
show more ...
|
#
83835e22 |
| 23-Jun-2023 |
Sami Tolvanen <samitolvanen@google.com> |
[RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits "kcfi" operand bundles to indirect call instructions. Similarly to the target-speci
[RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits "kcfi" operand bundles to indirect call instructions. Similarly to the target-specific lowering added in D119296, implement KCFI operand bundle lowering for RISC-V.
This patch disables the generic KCFI pass for RISC-V in Clang, and adds the KCFI machine function pass in `RISCVPassConfig::addPreSched` to emit target-specific `KCFI_CHECK` pseudo instructions before calls that have KCFI operand bundles. The machine function pass also bundles the instructions to ensure we emit the checks immediately before the calls, which is not possible with the generic pass.
`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a contiguous code sequence that traps if the expected hash in the operand bundle doesn't match the hash before the target function address. This patch emits an `ebreak` instruction for error handling to match the Linux kernel's `BUG()` implementation. Just like for X86, we also emit trap locations to a `.kcfi_traps` section to support error handling, as we cannot embed additional information to the trap instruction itself.
Relands commit 62fa708ceb027713b386c7e0efda994f8bdc27e2 with fixed tests.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148385
show more ...
|
#
e809ebeb |
| 23-Jun-2023 |
Sami Tolvanen <samitolvanen@google.com> |
Revert "[RISCV] Implement KCFI operand bundle lowering"
This reverts commit 62fa708ceb027713b386c7e0efda994f8bdc27e2.
Reverting to investigate -verify-machineinstrs errors in MIR tests.
|
#
62fa708c |
| 23-Jun-2023 |
Sami Tolvanen <samitolvanen@google.com> |
[RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits "kcfi" operand bundles to indirect call instructions. Similarly to the target-speci
[RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits "kcfi" operand bundles to indirect call instructions. Similarly to the target-specific lowering added in D119296, implement KCFI operand bundle lowering for RISC-V.
This patch disables the generic KCFI pass for RISC-V in Clang, and adds the KCFI machine function pass in `RISCVPassConfig::addPreSched` to emit target-specific `KCFI_CHECK` pseudo instructions before calls that have KCFI operand bundles. The machine function pass also bundles the instructions to ensure we emit the checks immediately before the calls, which is not possible with the generic pass.
`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a contiguous code sequence that traps if the expected hash in the operand bundle doesn't match the hash before the target function address. This patch emits an `ebreak` instruction for error handling to match the Linux kernel's `BUG()` implementation. Just like for X86, we also emit trap locations to a `.kcfi_traps` section to support error handling, as we cannot embed additional information to the trap instruction itself.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148385
show more ...
|
#
c9e08fa6 |
| 21-Jun-2023 |
WuXinlong <821408745@qq.com> |
[RISCV] Add a pass to merge moving parameter registers instructions for Zcmp
This patch adds a pass to generate `cm.mvsa01` & `cm.mva01s`.
RISCVMoveOptimizer.cpp which combines two mv inst into one
[RISCV] Add a pass to merge moving parameter registers instructions for Zcmp
This patch adds a pass to generate `cm.mvsa01` & `cm.mva01s`.
RISCVMoveOptimizer.cpp which combines two mv inst into one cm.mva01s or cm.mva01s.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D150415
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
7c836512 |
| 23-May-2023 |
eopXD <yueh.ting.chen@gmail.com> |
[2/3][RISCV][POC] Model vxrm in LLVM intrinsics and machine instructions for RVV fixed-point instructions
Depends on D151395.
This is the 2nd patch of the patch-set. For the cover letter of the pat
[2/3][RISCV][POC] Model vxrm in LLVM intrinsics and machine instructions for RVV fixed-point instructions
Depends on D151395.
This is the 2nd patch of the patch-set. For the cover letter of the patch-set, please checkout D151395. This patch originates from D121376.
This commit models vxrm by adding an immediate operand into intrinsics and machine instructions of RVV fixed-point instruction `vaadd`, `vaaddu`, `vasub`, and `vasubu`. This commit only covers intrinsics of the four instructions, the proceeding patches of the patch-set will do the same to other RVV fixed-point instructions.
The current naiive approach is to have a write to vxrm inserted before every fixed-point instruction. This is done by the new added pass `RISCVInsertReadWriteCSR`. The reason to name the pass in a more general term is because we will also model rounding mode for the RVV floating- point instructions. The approach will be improved in the future, implementing partial redundancy elimination algorithms to it.
The original LLVM intrinsics and machine instructions, take `vaadd` as an example, does not model the rounding mode is not removed in this patch. That is, `int.riscv.vaadd.*` co-exists with `int.riscv.vaadd.rm.*` after this patch. The next patch will add C intrinsics of vaadd with an additional operand that models the control of the rounding mode, in this patch, `int.riscv.vaadd.rm.*` will replace `int.riscv.vaadd.*`.
Authored-by: ShihPo Hung <shihpo.hung@sifive.com> Co-Authored-by: eop Chen <eop.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D151396
show more ...
|
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
13fe6733 |
| 01-May-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Move NTLH hint emission into RISCVAsmPrinter.cpp.
Rather than having a separate pass to add the hint instructions, emit them directly into the streamer during asm printing.
Reviewed By: BeM
[RISCV] Move NTLH hint emission into RISCVAsmPrinter.cpp.
Rather than having a separate pass to add the hint instructions, emit them directly into the streamer during asm printing.
Reviewed By: BeMg, kito-cheng
Differential Revision: https://reviews.llvm.org/D149511
show more ...
|
Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
8d7c865c |
| 06-Feb-2023 |
Piyou Chen <piyou.chen@sifive.com> |
[RISCV] Support __builtin_nontemporal_load/store by MachineMemOperand
Differential Revision: https://reviews.llvm.org/D143361
|