#
66e0498d |
| 29-Jan-2025 |
David Green <david.green@arm.com> |
[GlobalISel] Do not run verifier after ResetMachineFunctionPass (#124799)
After we fall back from GlobalISel to SDAG, the verifier gets called,
which calls getReservedRegs which uses SIMachineFunct
[GlobalISel] Do not run verifier after ResetMachineFunctionPass (#124799)
After we fall back from GlobalISel to SDAG, the verifier gets called,
which calls getReservedRegs which uses SIMachineFunctionInfo::usesAGPRs
which caches the result of UsesAGPRs. Because we have just fallen-back
the function is empty and it incorrectly gets cached to false. This
patch makes sure we don't try to run the verifier whilst the function is
empty.
show more ...
|
#
3feb7244 |
| 29-Jan-2025 |
Mingming Liu <mingmingl@google.com> |
[AsmPrinter][ELF] Support profile-guided section prefix for jump tables' (read-only) data sections (#122215)
https://github.com/llvm/llvm-project/pull/122183 adds a codegen pass to infer machine jum
[AsmPrinter][ELF] Support profile-guided section prefix for jump tables' (read-only) data sections (#122215)
https://github.com/llvm/llvm-project/pull/122183 adds a codegen pass to infer machine jump table entry's hotness from the MBB hotness. This is a follow-up PR to produce `.hot` and or `.unlikely` section prefix for jump table's (read-only) data sections in the relocatable `.o` files.
When this patch is enabled, linker will see {`.rodata`, `.rodata.hot`, `.rodata.unlikely`} in input sections. It can map `.rodata.hot` and `.rodata` in the input sections to `.rodata.hot` in the executable, and map `.rodata.unlikely` into `.rodata` with a pending extension to `--keep-text-section-prefix` like https://github.com/llvm/llvm-project/commit/059e7cbb66a30ce35f3ee43197eed1a106b50c5b, or with a linker script.
1. To partition hot and jump tables, the AsmPrinter pass slices a function's jump table indices into two groups, one for hot and the other for cold jump tables. It then emits hot jump tables into a `.hot`-prefixed data section and cold ones into a `.unlikely`-prefixed data section, retaining the relative order of `LJT<N>` labels within each group.
2. [ELF only] To have data sections with _dynamic_ names (e.g., `.rodata.hot[.func]`), we implement `TargetLoweringObjectFile::getSectionForJumpTable` method that accepts a `MachineJumpTableEntry` parameter, and update `selectELFSectionForGlobal` to generate `.hot` or `.unlikely` based on MJTE's hotness. - The dynamic JT section name doesn't depend on `-ffunction-section=true` or `-funique-section-names=true`, even though it leverages the similar underlying mechanism to have a MCSection with on-demand name as `-ffunction-section` does.
3. The new code path is off by default. - Typically, `TargetOptions` conveys clang or LLVM tools' options to code generation passes. To follow the pattern, add option `EnableStaticDataPartitioning` bit in `TargetOptions` and make it readable through `TargetMachine`. - To enable the new code path in tools like `llc`, `partition-static-data-sections` option is introduced in `CodeGen/CommandFlags.h/cpp`. - A subsequent patch ([draft](https://github.com/llvm/llvm-project/commit/8f36a1374365862b3ca9be5615dd38f02a318c45)) will add a clang option to enable the new code path.
---------
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
show more ...
|
Revision tags: llvmorg-21-init |
|
#
5a81a559 |
| 27-Jan-2025 |
David Green <david.green@arm.com> |
[GISel] Explicitly disable BF16 tablegen patterns. (#124113)
We currently have an issue where bf16 patters can be used to match fp16
types, as GISel does not know about the difference between the t
[GISel] Explicitly disable BF16 tablegen patterns. (#124113)
We currently have an issue where bf16 patters can be used to match fp16
types, as GISel does not know about the difference between the two. This
patch explicitly disables them to make sure that they are never used.
The opposite can also happen too, where fp16 patterns are used for
operators that should be bf16. So this also changes any operations with
bf16 types to now cause a fallback to SDAG.
The pass setup for GISel has been slightly adjusted to make sure that a
verify pass does not get added between AMD-SDAG and SIFixSGPRCopiesPass,
which otherwise can cause verifier issues when falling back.
show more ...
|
#
de209fa1 |
| 23-Jan-2025 |
Mingming Liu <mingmingl@google.com> |
[CodeGen] Introduce Static Data Splitter pass (#122183)
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition static data sections.
This patch introdu
[CodeGen] Introduce Static Data Splitter pass (#122183)
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition static data sections.
This patch introduces a codegen pass. This patch produces jump table hotness in the in-memory states (machine jump table info and entries). Target-lowering and asm-printer consume the states and produce `.hot` section suffix. The follow up PR https://github.com/llvm/llvm-project/pull/122215 implements such changes.
---------
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
show more ...
|
Revision tags: llvmorg-19.1.7 |
|
#
9efa7d7a |
| 30-Dec-2024 |
Fangrui Song <i@maskray.me> |
Remove -print-lsr-output in favor of --stop-after=loop-reduce
Pull Request: https://github.com/llvm/llvm-project/pull/121305
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
68f7b075 |
| 23-Nov-2024 |
Rahman Lavaee <rahmanl@google.com> |
[BasicBlockSections] Allow mixing of -basic-block-sections with MFS. (#117076)
This PR allows mixing `-basic-block-sections` with
`-enable-machine-function-splitter`. The strategy is to let
`-basi
[BasicBlockSections] Allow mixing of -basic-block-sections with MFS. (#117076)
This PR allows mixing `-basic-block-sections` with
`-enable-machine-function-splitter`. The strategy is to let
`-basic-block-sections` take precedence over functions with profiles.
show more ...
|
Revision tags: llvmorg-19.1.4 |
|
#
3f9d02aa |
| 18-Nov-2024 |
Akshat Oke <Akshat.Oke@amd.com> |
[CodeGen][NewPM] Port PeepholeOptimizer to NPM (#116326)
With this, all machine SSA optimization passes are available in the new codegen pipeline.
|
#
bb3f5e1f |
| 14-Nov-2024 |
Matin Raayai <30674652+matinraayai@users.noreply.github.com> |
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.
cc @arsenm @aeubanks
show more ...
|
#
d23c5c2d |
| 14-Nov-2024 |
Kyungwoo Lee <kyulee@meta.com> |
[CGData] Global Merge Functions (#112671)
This implements a global function merging pass. Unlike traditional
function merging passes that use IR comparators, this pass employs a
structurally stabl
[CGData] Global Merge Functions (#112671)
This implements a global function merging pass. Unlike traditional
function merging passes that use IR comparators, this pass employs a
structurally stable hash to identify similar functions while ignoring
certain constant operands. These ignored constants are tracked and
encoded into a stable function summary. When merging, instead of
explicitly folding similar functions and their call sites, we form a
merging instance by supplying different parameters via thunks. The
actual size reduction occurs when identically created merging instances
are folded by the linker.
Currently, this pass is wired to a pre-codegen pass, enabled by the
`-enable-global-merge-func` flag.
In a local merging mode, the analysis and merging steps occur
sequentially within a module:
- `analyze`: Collects stable function hashes and tracks locations of
ignored constant operands.
- `finalize`: Identifies merge candidates with matching hashes and
computes the set of parameters that point to different constants.
- `merge`: Uses the stable function map to optimistically create a
merged function.
We can enable a global merging mode similar to the global function
outliner
(https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/),
which will perform the above steps separately.
- `-codegen-data-generate`: During the first round of code generation,
we analyze local merging instances and publish their summaries.
- Offline using `llvm-cgdata` or at link-time, we can finalize all these
merging summaries that are combined to determine parameters.
- `-codegen-data-use`: During the second round of code generation, we
optimistically create merging instances within each module, and finally,
the linker folds identically created merging instances.
Depends on #112664
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
show more ...
|
#
d2aff182 |
| 07-Nov-2024 |
abhishek-kaushik22 <abhishek.kaushik@intel.com> |
Revert "TLS loads opimization (hoist)" (#114740)
This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4.
Based on the discussions in #112772, this pass is not needed after the
introduction
Revert "TLS loads opimization (hoist)" (#114740)
This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4.
Based on the discussions in #112772, this pass is not needed after the
introduction of `llvm.threadlocal.address` intrinsic.
Fixes https://github.com/llvm/llvm-project/issues/112771.
show more ...
|
#
44d0e952 |
| 30-Oct-2024 |
Akshat Oke <Akshat.Oke@amd.com> |
[CodeGen][NewPM] Port TailDuplicate pass to NPM (#113293)
|
Revision tags: llvmorg-19.1.3 |
|
#
c4c60c0d |
| 23-Oct-2024 |
Akshat Oke <Akshat.Oke@amd.com> |
[CodeGen][NewPM] Port OptimizePHIs to NPM (#113433)
|
#
488d3924 |
| 16-Oct-2024 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (#108508)
|
Revision tags: llvmorg-19.1.2 |
|
#
cd6c2b80 |
| 14-Oct-2024 |
Akshat Oke <76596238+optimisan@users.noreply.github.com> |
[NewPM][CodeGen] Port StackColoring to NPM (#111812)
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
6c143a86 |
| 04-Sep-2024 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[CodeGen][NewPM] Port MachineCSE pass to new pass manager. (#106605)
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
3d08ade7 |
| 29-Aug-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:
https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850
This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).
Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
show more ...
|
Revision tags: llvmorg-19.1.0-rc3 |
|
#
27a62ec7 |
| 18-Aug-2024 |
Philip Reames <preames@rivosinc.com> |
[LSR] Split the -lsr-term-fold transformation into it's own pass (#104234)
This transformation doesn't actually use any of the internal state of
LSR and recomputes all information from SCEV. Split
[LSR] Split the -lsr-term-fold transformation into it's own pass (#104234)
This transformation doesn't actually use any of the internal state of
LSR and recomputes all information from SCEV. Splitting it out makes
it easier to test.
Note that long term I would like to write a version of this transform
which *is* integrated with LSR's solver, but if that happens, we'll
just delete the extra pass.
Integration wise, I switched from using TTI to using a pass configuration
variable. This seems slightly more idiomatic, and means we don't run
the extra logic on any target other than RISCV.
show more ...
|
#
b4edfc19 |
| 13-Aug-2024 |
Jay Foad <jay.foad@amd.com> |
[LTO] Run ObjCARCContractPass according to the callgraph (#103034)
This matches other IR codegen passes and avoids a Dominator Tree
Construction in AMDGPU O2/O3 builds.
|
#
74e4694b |
| 09-Aug-2024 |
Peter Rong <peterrong96@gmail.com> |
[LTO] enable `ObjCARCContractPass` only on optimized build (#101114)
\#92331 tried to make `ObjCARCContractPass` by default, but it caused a
regression on O0 builds and was reverted.
This patch t
[LTO] enable `ObjCARCContractPass` only on optimized build (#101114)
\#92331 tried to make `ObjCARCContractPass` by default, but it caused a
regression on O0 builds and was reverted.
This patch trys to bring that back by:
1. reverts the
[revert](https://github.com/llvm/llvm-project/commit/1579e9ca9ce17364963861517fecf13b00fe4d8a).
2. `createObjCARCContractPass` only on optimized builds.
Tests are updated to refelect the changes. Specifically, all `O0` tests
should not include `ObjCARCContractPass`
Signed-off-by: Peter Rong <PeterRong@meta.com>
show more ...
|
#
fa92d51f |
| 06-Aug-2024 |
Alexis Engelke <engelke@in.tum.de> |
[VP] Merge ExpandVP pass into PreISelIntrinsicLowering (#101652)
Similar to #97727; avoid an extra pass over the entire IR by performing
the lowering as part of the pre-isel-intrinsic-lowering pass.
|
Revision tags: llvmorg-19.1.0-rc2 |
|
#
b5fc083d |
| 01-Aug-2024 |
Alexis Engelke <engelke@in.tum.de> |
[CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering (#97727)
Currently, the LowerConstantIntrinsics pass does an RPO traversal of
every function... only to find that many functions don't
[CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering (#97727)
Currently, the LowerConstantIntrinsics pass does an RPO traversal of
every function... only to find that many functions don't have constant
intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is
already a pre-isel intrinsic lowering pass, which iterates over
intrinsic declarations and lowers all users. Call
lowerConstantIntrinsics from this pass to avoid the extra iteration over
the entire IR and the RPO traversal.
show more ...
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
cab81dd0 |
| 31-May-2024 |
Egor Pasko <pasko@chromium.org> |
[EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines (#92171)
Move EntryExitInstrumenter(PostInlining=true) to as late as possible and
EntryExitInstrumenter(PostInlining=fal
[EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines (#92171)
Move EntryExitInstrumenter(PostInlining=true) to as late as possible and
EntryExitInstrumenter(PostInlining=false) to an early pre-inlining stage
(but skip for ThinLTO post-link).
This should fix the issues reported in
https://github.com/rust-lang/rust/issues/92109 and
https://github.com/llvm/llvm-project/issues/52853. These are caused
by https://reviews.llvm.org/D97608.
show more ...
|
#
1579e9ca |
| 24-May-2024 |
Nikita Popov <nikita.ppv@gmail.com> |
Revert "Run ObjCContractPass in Default Codegen Pipeline (#92331)"
This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2. This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7.
Causes
Revert "Run ObjCContractPass in Default Codegen Pipeline (#92331)"
This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2. This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7.
Causes major compile-time regressions for unoptimized builds.
show more ...
|
#
8cc8e5d6 |
| 23-May-2024 |
Nuri Amari <nuri.amari99@gmail.com> |
Run ObjCContractPass in Default Codegen Pipeline (#92331)
Prior to this patch, when using -fthinlto-index= the ObjCARCContractPass isn't run prior to CodeGen, and instruction selection fails on IR c
Run ObjCContractPass in Default Codegen Pipeline (#92331)
Prior to this patch, when using -fthinlto-index= the ObjCARCContractPass isn't run prior to CodeGen, and instruction selection fails on IR containing arc intrinsics. This patch is motivated by that usecase.
The pass was previously added in various places codegen is performed. This patch adds the pass to the default codegen pipepline, makes sure it bails immediately if no arc intrinsics are found, and removes the adhoc scheduling of the pass.
Co-authored-by: Nuri Amari <nuriamari@fb.com>
show more ...
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2 |
|
#
bd6eb548 |
| 08-Mar-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][CodeGen] Teach SelectionDAG how to expand FREM to a vector math call. (#83859)
This removes, at least when a vector library is available, a failure
case for scalable vectors. Doing so means
[LLVM][CodeGen] Teach SelectionDAG how to expand FREM to a vector math call. (#83859)
This removes, at least when a vector library is available, a failure
case for scalable vectors. Doing so means we can confidently cost vector
FREM instructions without making an assumption that later passes will
transform the IR before it gets to the code generator.
NOTE: Whilst only FREM has been implemented the same mechanism
can be used for the other libm related ISD nodes.
show more ...
|