#
e6aea4a5 |
| 17-Oct-2022 |
Dmitry Vyukov <dvyukov@google.com> |
Use-after-return sanitizer binary metadata
Currently per-function metadata consists of: (start-pc, size, features)
This adds a new UAR feature. When it is set, each record carries an additional element: (start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
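A minimal sketch of how a consumer might read such variable-length records, assuming they arrive as a flat sequence of 64-bit words. The names `FunctionRecord`, `kFeatureUAR` and `decodeRecord`, the word-based layout, and the bit value are all hypothetical; the commit message only states which elements a record contains.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical feature bit; the real value is not given in the commit message.
constexpr uint64_t kFeatureUAR = 1u << 1;

struct FunctionRecord {
  uint64_t StartPC;
  uint64_t Size;
  uint64_t Features;
  std::optional<uint64_t> StackArgsSize; // present only when the UAR bit is set
};

// Decodes one record starting at Words[Pos] and advances Pos past it.
// Returns std::nullopt (leaving Pos unchanged) if the input is too short.
std::optional<FunctionRecord> decodeRecord(const std::vector<uint64_t> &Words,
                                           size_t &Pos) {
  if (Pos > Words.size() || Words.size() - Pos < 3)
    return std::nullopt;
  FunctionRecord R{Words[Pos], Words[Pos + 1], Words[Pos + 2], std::nullopt};
  size_t Next = Pos + 3;
  if (R.Features & kFeatureUAR) {
    // The fourth element exists only for functions with the UAR feature.
    if (Next >= Words.size())
      return std::nullopt;
    R.StackArgsSize = Words[Next++];
  }
  Pos = Next;
  return R;
}
```

The point of the sketch is that the features word has to be read before the record length is known, since the fourth element is conditional.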
|
#
dbb11309 |
| 29-Nov-2022 |
Kazu Hirata <kazu@google.com> |
Revert "Use-after-return sanitizer binary metadata"
This reverts commit a1255dc467f7ce57a966efa76bbbb4ee91d9115a.
This patch results in:
llvm/lib/CodeGen/SanitizerBinaryMetadata.cpp:57:17: error: no member named 'size' in 'llvm::MDTuple'
|
#
a1255dc4 |
| 17-Oct-2022 |
Dmitry Vyukov <dvyukov@google.com> |
Use-after-return sanitizer binary metadata
Currently per-function metadata consists of: (start-pc, size, features)
This adds a new UAR feature. When it is set, each record carries an additional element: (start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
|
#
d6f0ab47 |
| 26-Nov-2022 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Use std::optional in TargetPassConfig.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
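As a rough illustration of what this kind of migration looks like at a call site, the sketch below uses only `std::optional`; the comment maps the older `llvm::Optional` accessor spellings named in the thread title to their standard equivalents. The helpers `describeOverride` and `effectiveSetting` are hypothetical and are not taken from TargetPassConfig.cpp.

```cpp
#include <optional>
#include <string>

// llvm::Optional spelling        std::optional spelling
//   X.hasValue()            ->     X.has_value()
//   X.getValue()            ->     X.value() or *X
//   X.getValueOr(D)         ->     X.value_or(D)

// Hypothetical helper: reports whether a setting was explicitly overridden.
std::string describeOverride(const std::optional<bool> &Override, bool Default) {
  if (!Override.has_value())
    return Default ? "default-on" : "default-off";
  return *Override ? "forced-on" : "forced-off";
}

// Hypothetical helper: resolves an optional override against a default.
bool effectiveSetting(const std::optional<bool> &Override, bool Default) {
  return Override.value_or(Default);
}
```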
|
#
2090e85f |
| 19-Jul-2022 |
Matthias Gehre <matthias.gehre@xilinx.com> |
[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64
This adds the ExpandLargeDivRem pass to the default pass pipeline. The limit at which it expands div/rem instructions is configured via a new TargetTransformInfo hook (default: no expansion). The X86, Arm and AArch64 backends implement this hook to expand div/rem instructions with more than 128 bits.
Differential Revision: https://reviews.llvm.org/D130076
|
#
de9d80c1 |
| 08-Aug-2022 |
Fangrui Song <i@maskray.me> |
[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
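For reference, a small sketch of the standard attribute in use. The function `weight` is a hypothetical example, not code from the LLVM tree; it only shows where the attribute goes relative to the next `case` label.

```cpp
// Hypothetical example: 'a' deliberately falls through into 'b'.
int weight(char Kind) {
  int W = 0;
  switch (Kind) {
  case 'a':
    W += 10;
    [[fallthrough]]; // previously spelled LLVM_FALLTHROUGH;
  case 'b':
    W += 1;
    break;
  default:
    break;
  }
  return W;
}
```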
|
#
5cb09798 |
| 25-Jun-2022 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86][AMX] Split greedy RA for tile register
When we fill the shape into the tile configuration memory, the shape is taken from the AMX pseudo instruction. However, the register holding the shape may be split or spilled by greedy RA. This causes the shape to be written to the config memory after ldtilecfg is executed, so the shape configuration would be wrong. This patch splits tile register allocation from greedy register allocation, so that after tile registers are allocated the shape registers are still virtual registers. The shape register may only be redefined or multi-defined by the phi elimination pass and the two-address pass, which does not affect tile register configuration.
Differential Revision: https://reviews.llvm.org/D128584
|
#
95a13425 |
| 05-Jun-2022 |
Fangrui Song <i@maskray.me> |
Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options
|
#
08cc0585 |
| 27-May-2022 |
Rahman Lavaee <rahmanl@google.com> |
Reland "[Propeller] Promote functions with propeller profiles to .text.hot."
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
The major change here is using `addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests.
Differential Revision: https://reviews.llvm.org/D126518
|
#
3aa24932 |
| 27-May-2022 |
Rahman Lavaee <rahmanl@google.com> |
Revert "[Propeller] Promote functions with propeller profiles to .text.hot."
This reverts commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
|
#
4d8d2580 |
| 24-May-2022 |
Rahman Lavaee <rahmanl@google.com> |
[Propeller] Promote functions with propeller profiles to .text.hot.
Today, text section prefixes (none, .unlikely, .hot, and .unknown) are determined based on the PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, with `-Wl,-keep-text-section-prefix=true`, Propeller cannot enforce a global section ordering, as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely).
This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`.
The new implementation refactors the parsing of the basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier, in `CodeGenPrepare`, in order to set the function's text prefix. `BasicBlockSectionsProfileReader` will be used by both the `BasicBlockSections` pass and `CodeGenPrepare`.
Differential Revision: https://reviews.llvm.org/D122930
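The promotion rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: `PropellerProfile`, `chooseSectionPrefix`, and the `GuidedPrefix` parameter (standing in for the `--bbsections-guided-section-prefix` flag) are hypothetical names, and the real pass works on LLVM IR functions rather than strings.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the set of functions listed in the
// basic-block-sections profile.
using PropellerProfile = std::set<std::string>;

// Returns the text section prefix for Func. PGOPrefix is whatever the
// PGO-based logic would have chosen ("" for none, ".unlikely", ".hot", ...).
std::string chooseSectionPrefix(const std::string &Func,
                                const std::string &PGOPrefix,
                                const PropellerProfile &Profile,
                                bool GuidedPrefix = true) {
  // A function with a Propeller profile is promoted regardless of PGO.
  if (GuidedPrefix && Profile.count(Func) != 0)
    return ".hot";
  return PGOPrefix;
}
```

With the guided prefix disabled, the function simply returns the PGO-derived prefix, which corresponds to the behaviour before this patch.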
|
#
ca7c307d |
| 13-May-2022 |
Sotiris Apostolakis <apostolakis@google.com> |
[SelectOpti][1/5] Setup new select-optimize pass
This is the first commit for the cmov-vs-branch optimization pass. The goal is to develop a new profile-guided and target-independent cost/benefit analysis for selecting conditional moves over branches when optimizing for performance.
Initially, this new pass is expected to be enabled only for instrumentation-based PGO.
RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D120230
|
#
b4ad28da |
| 11-Apr-2022 |
Momchil Velikov <momchil.velikov@arm.com> |
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by the linear (non-CFG aware) nature of the unwind tables.
Unlike the `CFIInstrInserter` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustment and save locations of SVE registers.
This pass takes advantage of the constraints that LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`):
* there is a single basic block, containing the function prologue
* possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions) are not split across two or more blocks.
* prologue and epilogue blocks are outside of any loops
Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states:
- "has a call frame", if the function has executed the prologue, and has not executed any epilogue
- "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue
These properties can be computed for each basic block by a single RPO traversal.
From the point of view of the unwind tables, the "has/does not have call frame" state at the beginning of each block is determined by the state at the end of the previous block, in layout order.
Where these states differ, we insert compensating CFI instructions, which come in two flavours:
- CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be:
  ```
  .cfi_def_cfa <sp>, 0
  .cfi_same_value <rN>
  .cfi_same_value <rN-1>
  ...
  ```
  where `<rN>` are the callee-saved registers.
- CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence:
  ```
  .cfi_restore_state
  .cfi_remember_state
  ```
  In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue.
Reviewed By: MaskRay, danielkiss, chill
Differential Revision: https://reviews.llvm.org/D114545
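The layout-order comparison described in this message can be sketched as follows. This is a simplified model, not the pass itself: `BlockState`, `Fixup` and `planFixups` are hypothetical names, the per-block entry/exit states are assumed to have been computed already (by the RPO traversal mentioned above), and the unwind table is assumed to start in the "no call frame" state.

```cpp
#include <cstddef>
#include <vector>

// Which compensating CFI sequence, if any, goes at the start of a block.
enum class Fixup {
  None,           // table state already matches the block's entry state
  ResetToInitial, // target hook: e.g. .cfi_def_cfa / .cfi_same_value ...
  RestoreProlog   // .cfi_restore_state followed by .cfi_remember_state
};

// Hypothetical per-block summary, assumed to be precomputed.
struct BlockState {
  bool HasFrameOnEntry;
  bool HasFrameOnExit;
};

// Walks blocks in layout order and compares each block's entry state with
// the state left by the previous block in layout order.
std::vector<Fixup> planFixups(const std::vector<BlockState> &Layout) {
  std::vector<Fixup> Plan;
  bool TableHasFrame = false; // state the unwind table describes so far
  for (const BlockState &B : Layout) {
    if (B.HasFrameOnEntry == TableHasFrame)
      Plan.push_back(Fixup::None);
    else if (B.HasFrameOnEntry)
      Plan.push_back(Fixup::RestoreProlog);
    else
      Plan.push_back(Fixup::ResetToInitial);
    TableHasFrame = B.HasFrameOnExit;
  }
  return Plan;
}
```

For example, a body block laid out after an epilogue block needs `RestoreProlog`, because the unwind table was left in the "no call frame" state even though the block executes with a frame.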
|
#
0320115c |
| 05-Apr-2022 |
Muhammad Omair Javaid <omair.javaid@linaro.org> |
Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.
This commit had failing tests with clang crashing across various AArch64/Linux buildbots.
https://lab.llvm.org/buildbot/#/builders/179/builds/3346
Differential Revision: https://reviews.llvm.org/D114545
|
#
980c3e6d |
| 04-Apr-2022 |
Momchil Velikov <momchil.velikov@arm.com> |
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by the linear (non-CFG aware) nature of the unwind tables.
Unlike the `CFIInstrInserter` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustment and save locations of SVE registers.
This pass takes advantage of the constraints that LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`):
* there is a single basic block, containing the function prologue
* possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions) are not split across two or more blocks.
* prologue and epilogue blocks are outside of any loops
Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states:
- "has a call frame", if the function has executed the prologue, and has not executed any epilogue
- "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue
These properties can be computed for each basic block by a single RPO traversal.
In order to accommodate backends which do not generate unwind info in epilogues we compute an additional property "strong no call frame on entry" which is set for the entry point of the function and for every block reachable from the entry along a path that does not execute the prologue. If this property holds, it takes precedence over the "has a call frame" property.
From the point of view of the unwind tables, the "has/does not have call frame" state at the beginning of each block is determined by the state at the end of the previous block, in layout order.
Where these states differ, we insert compensating CFI instructions, which come in two flavours:
- CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be:
  ```
  .cfi_def_cfa <sp>, 0
  .cfi_same_value <rN>
  .cfi_same_value <rN-1>
  ...
  ```
  where `<rN>` are the callee-saved registers.
- CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence:
  ```
  .cfi_restore_state
  .cfi_remember_state
  ```
  In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue.
Reviewed By: MaskRay, danielkiss, chill
Differential Revision: https://reviews.llvm.org/D114545
|
#
64902d33 |
| 24-Mar-2022 |
Julian Lettner <julian.lettner@apple.com> |
Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
For MachO, lower `@llvm.global_dtors` into `@llvm.global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.
Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.
Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future.
Differential Revision: https://reviews.llvm.org/D121736
|
#
581dc3c7 |
| 23-Mar-2022 |
Zequan Wu <zequanwu@google.com> |
Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
This reverts commit 22570bac694396514fff18dec926558951643fa6.
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
#
22570bac |
| 09-Mar-2022 |
Julian Lettner <julian.lettner@apple.com> |
Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm.global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.
Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.
Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future.
Differential Revision: https://reviews.llvm.org/D121736
|
#
ac64d0d2 |
| 16-Mar-2022 |
Shengchen Kan <shengchen.kan@intel.com> |
[NFC][CodeGen] Remove redundant if clause in TargetPassConfig::addPass
|
#
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
|
#
7262eacd |
| 15-Mar-2022 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
Many of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
|
#
9c542a5a |
| 09-Mar-2022 |
Julian Lettner <julian.lettner@apple.com> |
Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm.global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.
Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.
Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future.
Differential Revision: https://reviews.llvm.org/D121327
|
#
a278250b |
| 10-Mar-2022 |
Nico Weber <thakis@chromium.org> |
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
|
#
7f230fee |
| 07-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
after: 1061034926 before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
#
c3101432 |
| 02-Mar-2022 |
Xiang1 Zhang <xiang1.zhang@intel.com> |
TLS loads optimization (hoist)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120000
|