#
da2551f3 |
| 15-Dec-2020 |
Nico Weber <thakis@chromium.org> |
Revert "[Debugify] Support checking Machine IR debug info"
This reverts commit c4d2d4337d50bed3cafd564daece1a197005b22b. Necessary to revert 2a5675f11d3bc803a245c0e.
|
#
c4d2d433 |
| 15-Dec-2020 |
Xiang1 Zhang <xiang1.zhang@intel.com> |
[Debugify] Support checking Machine IR debug info
Add a mir-check-debug pass to check MIR-level debug info.
At the IR level, LLVM currently has debugify + check-debugify to generate and check debug IR. Much like the IR-level debugify pass, mir-debugify inserts sequentially increasing line locations into each MachineInstr in a Module, but there is no equivalent MIR-level check-debugify pass. This patch adds one as "mir-check-debug".
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D91595
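The generate-then-check idea described in this commit message can be illustrated with a toy model. This is only a sketch: the `Instr` class and both function names are hypothetical stand-ins, not LLVM APIs, and the real passes work on MachineInstr objects and report more than this.

```python
from typing import List, Optional, Tuple


class Instr:
    """Hypothetical stand-in for a machine instruction with a debug line."""

    def __init__(self, opcode: str, line: Optional[int] = None):
        self.opcode = opcode
        self.line = line


def debugify(instrs: List[Instr]) -> int:
    """Attach sequentially increasing line numbers (1, 2, 3, ...)."""
    for line, instr in enumerate(instrs, start=1):
        instr.line = line
    return len(instrs)


def check_debugify(instrs: List[Instr], num_lines: int) -> Tuple[List[int], List[str]]:
    """Report line numbers that disappeared and instructions with no location."""
    seen = {instr.line for instr in instrs if instr.line is not None}
    missing_lines = sorted(set(range(1, num_lines + 1)) - seen)
    without_location = [instr.opcode for instr in instrs if instr.line is None]
    return missing_lines, without_location
```

In this model, a transformation that drops a location between the two calls shows up both as a missing line number and as an instruction without a location.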
|
#
fc0f4010 |
| 15-Dec-2020 |
Xiang1 Zhang <xiang1.zhang@intel.com> |
Revert "[Debugify] Support checking Machine IR debug info"
This reverts commit 57a3d9ec4a8c1422f07264bed9f12a4ea416707e.
|
#
57a3d9ec |
| 15-Dec-2020 |
Xiang1 Zhang <xiang1.zhang@intel.com> |
[Debugify] Support checking Machine IR debug info
Add a mir-check-debug pass to check MIR-level debug info.
At the IR level, LLVM currently has debugify + check-debugify to generate and check debug IR. Much like the IR-level debugify pass, mir-debugify inserts sequentially increasing line locations into each MachineInstr in a Module, but there is no equivalent MIR-level check-debugify pass. This patch adds one as "mir-check-debug".
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D95195
|
#
29356e32 |
| 07-Dec-2020 |
Anna Thomas <anna@azul.com> |
[ScalarizeMaskedMemIntrin] Add new PM support
This patch adds new PM support for the pass, so the pass can now be used during middle-end transforms. The old pass is renamed to ScalarizeMaskedMemIntrinLegacyPass.
Reviewed-By: skatkov, aeubanks
Differential Revision: https://reviews.llvm.org/D92743
|
#
24d4291c |
| 02-Dec-2020 |
Hongtao Yu <hoy@fb.com> |
[CSSPGO] Pseudo probes for function calls.
An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work.
One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place, leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution.
With the requirement of not inflating `!dbg` metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions, and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe, and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The `!dbg` metadata of an inlined instruction is in the form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a callsite probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst.
To avoid collision with the baseline AutoFDO in various places that handle DWARF discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reserved for pseudo probe use.
Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`.
Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D91756
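The split of the 32-bit discriminator into a 3-bit marker and a 29-bit payload, as described in this commit message, can be sketched as follows. Only that split comes from the message. The marker value, the field widths, and the function names are hypothetical choices made for illustration and are not taken from the LLVM sources.

```python
from typing import Tuple

MARKER_BITS = 3
MARKER = 0b111   # assumption: all three low bits set marks a pseudo probe
ID_BITS = 16     # hypothetical width of the probe ID field
TYPE_BITS = 3    # hypothetical width of the probe type field


def pack_probe(probe_id: int, probe_type: int) -> int:
    """Encode a probe ID and type into a 32-bit discriminator value."""
    if not 0 <= probe_id < (1 << ID_BITS):
        raise ValueError("probe_id out of range")
    if not 0 <= probe_type < (1 << TYPE_BITS):
        raise ValueError("probe_type out of range")
    payload = probe_id | (probe_type << ID_BITS)
    return (payload << MARKER_BITS) | MARKER


def is_probe(discriminator: int) -> bool:
    """True when the low marker bits identify a pseudo probe discriminator."""
    return (discriminator & MARKER) == MARKER


def unpack_probe(discriminator: int) -> Tuple[int, int]:
    """Recover (probe_id, probe_type) from an encoded discriminator."""
    if not is_probe(discriminator):
        raise ValueError("not a pseudo probe discriminator")
    payload = discriminator >> MARKER_BITS
    probe_id = payload & ((1 << ID_BITS) - 1)
    probe_type = (payload >> ID_BITS) & ((1 << TYPE_BITS) - 1)
    return probe_id, probe_type
```

Code that only inspects the low bits can tell the two kinds of discriminator apart without consulting the `-pseudo-probe-for-profiling` switch, which is the property the message relies on.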
|
#
a65d8c5d |
| 02-Dec-2020 |
jasonliu <jasonliu.development@gmail.com> |
[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX
Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences are:
1. AIX does not have CFI instructions.
2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality routine, because the interoperability is not there yet on AIX.
3. AIX does not use eh_frame sections. Instead, it uses an eh_info section (compact unwind section) to store the information about the personality routine and the LSDA data address.
Reviewed By: daltenty, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D91455
|
#
6042c25b |
| 08-Oct-2020 |
Amara Emerson <amara@apple.com> |
[GlobalISel] Add translation support for vector reduction intrinsics.
In order to prevent the ExpandReductions pass from expanding some intrinsics before they get to codegen, I had to add a -disable-expand-reductions flag for testing purposes.
Differential Revision: https://reviews.llvm.org/D89028
|
#
3ae07b2a |
| 21-Sep-2020 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
TargetPassConfig.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.
|
#
ad99e34c |
| 12-Sep-2020 |
Yuanfang Chen <yuanfang.chen@sony.com> |
Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline"
This reverts commit 31ecf8d29d81d196374a562c6d2bd2c25a62861e. This reverts commit 3fdaa8602a086a3fca5f0fc8527536ac659079d0.
There is a layering violation for Target->CodeGen.
|
#
31ecf8d2 |
| 11-Sep-2020 |
Yuanfang Chen <yuanfang.chen@sony.com> |
[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline
Following up on D67687. Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html
`CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig`, with the following differences:
- Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`)
- `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline, whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and a fit with the overall NPM value semantics design.
- `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D83608
|
#
94faadac |
| 05-Aug-2020 |
Snehasish Kumar <snehasishk@google.com> |
[llvm][CodeGen] Machine Function Splitter
We introduce a codegen optimization pass which splits functions into hot and cold parts. This pass leverages the basic block sections feature recently introduced in LLVM from the Propeller project. The pass targets functions with profile coverage, identifies cold blocks and moves them to a separate section. The linker groups all cold blocks across functions together, decreasing fragmentation and improving icache and itlb utilization.
We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017.
For clang bootstrap we observe a mean 2.33% runtime improvement with a ~32% reduction in itlb and stlb misses. Additionally, L1 icache misses reduced by 9.5% while L2 instruction misses reduced by 20%.
For SPECInt we report the change in IntRate for the C/C++ benchmarks. All benchmarks apart from mcf and x264 improve, on average by 0.6% with the max for deepsjeng at 1.6%.
Benchmark          % Change
500.perlbench_r     0.78
502.gcc_r           0.82
505.mcf_r          -0.30
520.omnetpp_r       0.18
523.xalancbmk_r     0.37
525.x264_r         -0.46
531.deepsjeng_r     1.61
541.leela_r         0.83
557.xz_r            0.15
Differential Revision: https://reviews.llvm.org/D85368
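The hot/cold partition described in this commit message can be pictured with a small model. This is a sketch under stated assumptions, not the pass itself: the `split_function` name, the block representation, the rule that the entry block stays hot, and the zero-count threshold are all hypothetical.

```python
from typing import List, Optional, Tuple

Block = Tuple[str, Optional[int]]  # (block name, profile count or None)


def split_function(blocks: List[Block], cold_threshold: int = 0) -> Tuple[List[str], List[str]]:
    """Partition a function's blocks into (hot, cold) lists of block names."""
    # Functions without any profile coverage are left untouched.
    if all(count is None for _, count in blocks):
        return [name for name, _ in blocks], []

    hot: List[str] = []
    cold: List[str] = []
    for index, (name, count) in enumerate(blocks):
        # Assumption: the entry block always stays in the hot part.
        is_cold = index != 0 and count is not None and count <= cold_threshold
        (cold if is_cold else hot).append(name)
    return hot, cold
```

For example, `split_function([("entry", 100), ("loop", 5000), ("error", 0), ("exit", 100)])` keeps `entry`, `loop`, and `exit` together and places `error` in the cold list, which a linker could then group with cold blocks from other functions.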
|
#
8d943a92 |
| 06-Aug-2020 |
Snehasish Kumar <snehasishk@google.com> |
[NFC] Rename BBSectionsPrepare -> BasicBlockSections.
Rename the BBSectionsPrepare pass as suggested by the review comment in https://reviews.llvm.org/D85368.
Differential Revision: https://reviews.llvm.org/D85380
|
#
dc619f3d |
| 23-Jul-2020 |
Evgeny Leviant <eleviant@accesssoftek.com> |
[CodeGen][TargetPassConfig] Add unreachable-mbb-elimination pass explicitly
Differential revision: https://reviews.llvm.org/D84228
|
#
589c646a |
| 20-Jul-2020 |
Yuanfang Chen <yuanfang.chen@sony.com> |
[llc] (almost) remove `--print-machineinstrs`
Its effect can be achieved with `-stop-after`, `-print-after`, and `-print-after-all`. However, a few tests need to print MIR after ISel, which cannot be done with `-print-after`/`-stop-after` because the isel pass does not have a command-line name. For that reason, `--print-machineinstrs` is downgraded to `--print-after-isel` in this patch. `--print-after-isel` could be removed after we switch to the new pass manager, since the isel pass would then have a command-line name to use with `print-after` or equivalent switches.
The motivation of this patch is to reduce the tests' dependency on a would-be-deprecated feature.
Reviewed By: arsenm, dsanders
Differential Revision: https://reviews.llvm.org/D83275
|
#
24089928 |
| 18-Jul-2020 |
Evgeny Leviant <v.evgeny.leviant@ntd.nintendo.com> |
[CodeGen][TargetPassConfig] Add TargetTransformInfo pass correctly
The patch adds the tti pass directly, enforcing its execution with a correctly set TargetTransformInfo.
Differential revision: https://reviews.llvm.org/D84047
|
#
1e495e10 |
| 06-Jul-2020 |
Yuanfang Chen <yuanfang.chen@sony.com> |
[NFC] change getLimitedCodeGenPipelineReason to static function
|
#
54b64572 |
| 27-May-2020 |
Juneyoung Lee <aqjune@gmail.com> |
[TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR
Summary: This patch adds CanonicalizeFreezeInLoops before LSR. Relevant patch: https://reviews.llvm.org/D77523
Reviewers: spatel, efriedma, jdoerfert, fhahn, nikic, reames, xbolva00
Reviewed By: nikic
Subscribers: xbolva00, nikic, lebedev.ri, hiraditya, llvm-commits, sanwou01, nlopes
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77524
|
#
2833c46f |
| 21-May-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[DwarfEHPrepare] Don't prune unreachable resumes at optnone
Disable pruning of unreachable resumes in the DwarfEHPrepare pass at optnone. While I expect the pruning itself to be essentially free, this does require a dominator tree calculation, that is not used for anything else. Saving this DT construction makes for a 0.4% O0 compile-time improvement.
Differential Revision: https://reviews.llvm.org/D80400
|
#
0c6bba71 |
| 19-May-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[TargetPassConfig] Don't add alias analysis at optnone
When performing codegen at optnone, don't add alias analysis to the pipeline. We don't need it, but it causes an unnecessary dominator tree calculation.
I've also moved the module verifier call to the top so that a bunch of disabled-at-optnone passes group more nicely.
Differential Revision: https://reviews.llvm.org/D80378
|
#
46a52ff9 |
| 22-Apr-2020 |
Eli Friedman <efriedma@quicinc.com> |
[TargetPassConfig] Run MachineVerifier after more passes.
We were disabling verification for no reason in a bunch of places; just turn it on.
At this point, there are two key places where we don't run verification: during register allocation, and after addPreEmitPass. Regalloc probably isn't worth messing with; it has its own invariants, and verifying afterwards is probably good enough. For after addPreEmitPass, it's probably worth investigating improvements.
|
#
f71350f0 |
| 08-Apr-2020 |
Daniel Sanders <daniel_l_sanders@apple.com> |
Add -debugify-and-strip-all to add debug info before a pass and remove it after
Summary: This allows us to test each backend pass under the presence of debug info using pre-existing tests. The tests should not fail as a result of this so long as it's true that debug info does not affect CodeGen.
In practice, a few tests are sensitive to this:
* Tests that check the pass structure (e.g. O0-pipeline.ll)
* Tests that check --debug output. Specifically instruction dumps containing MMO's (e.g. prelegalizercombiner-extends.ll)
* Tests that contain debugify metadata as mir-strip-debug will remove it (e.g. fastisel-debugvalue-undef.ll)
* Tests with partial debug info (e.g. patchable-function-entry-empty.mir had debug info but no !llvm.dbg.cu)
* Tests that check optimization remarks overly strictly (e.g. prologue-epilogue-remarks.mir)
* Tests that would inject the pass in an unsafe region (e.g. seqpairspill.mir would inject between register alloc and virt reg rewriter)
In all cases, the checks can either be updated or --debugify-and-strip-all-safe=0 can be used to avoid being affected by something like llvm-lit -Dllc='llc --debugify-and-strip-all-safe'
I tested this without the lost debug locations verifier to confirm that AArch64 behaviour is unaffected (with the fixes in this patch) and with it to confirm it finds the problems without the additional RUN lines we had before.
Depends on D77886, D77887, D77747
Reviewers: aprantl, vsk, bogner
Subscribers: qcolombet, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77888
|
#
c162bc2a |
| 08-Apr-2020 |
Daniel Sanders <daniel_l_sanders@apple.com> |
Make TargetPassConfig and llc add pre/post passes the same way. NFC
Summary: At the moment, any changes we make to the passes that can be injected before/after others (e.g. -verify-machineinstrs and -print-after-all) have to be duplicated in both TargetPassConfig (for normal execution, -start-before/-stop-before/etc.) and llc (for -run-pass). Unify this pass injection into addMachinePrePass/addMachinePostPass, which both TargetPassConfig and llc can use.
Reviewers: vsk, aprantl, bogner
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77887
|
#
4275eb13 |
| 09-Apr-2020 |
Serguei Katkov <serguei.katkov@azul.com> |
Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values.
The change introduces the usage of physical registers for non-gc deopt values. This requires runtime support to know how to take a value from a register. By default the usage is off and can be switched on by an option.
The change also introduces an additional fix-up patch which forces the spilling of caller-saved registers (clobbered after the call) and rewrites the statepoint to use spill slots instead of caller-saved registers.
Reviewers: reames, danstrushin
Reviewed By: dantrushin
Subscribers: mgorny, hiraditya, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D77797
|
#
44f0d7f1 |
| 09-Apr-2020 |
Serguei Katkov <serguei.katkov@azul.com> |
Revert "[Codegen/Statepoint] Allow usage of registers for non gc deopt values."
This reverts commit a0275705bb5aa938119c3e7c8bc957a823450b17.
It causes buildbot failures building LLVM with BUILD_SHARED_LIBS due to a linker error.
|