#
fb34d531 |
| 03-Jun-2022 |
Benjamin Kramer <benny.kra@googlemail.com> |
Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a simple operation I opted to always directly lower it to integer arithmetic. The other way round is more complicated if you want to preserve IEEE semantics, so it's handled by a new __truncsfbf2 compiler-rt builtin.
This is of course very bare bones, but sufficient to get a semi-softened fadd on x86.
Possible future improvements:
- Targets with bf16 conversion instructions can now make fp_to_bf16 legal
- The software conversion to bf16 can be replaced by a trivial implementation under fast math.
Differential Revision: https://reviews.llvm.org/D126953
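The asymmetry described in this commit message can be illustrated with a small standalone sketch. It assumes bf16 values are held as raw 16-bit patterns and that `float` is IEEE-754 binary32; the function names `bf16_to_f32` and `f32_to_bf16` are hypothetical, and the narrowing routine is only one plausible round-to-nearest-even scheme, not the actual `__truncsfbf2` code from compiler-rt.

```cpp
#include <cstdint>
#include <cstring>

// Widening: a bf16 value is the upper 16 bits of an f32, so a shift is enough.
// (Hypothetical helper name; assumes IEEE-754 binary32 floats.)
float bf16_to_f32(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Narrowing: round to nearest, ties to even, and keep NaNs as NaNs.
// This is an illustrative sketch, not the compiler-rt implementation.
std::uint16_t f32_to_bf16(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
    // NaN: set the quiet bit so truncating the payload cannot yield infinity.
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  }
  // Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
  std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<std::uint16_t>((bits + rounding_bias) >> 16);
}
```

For instance, `bf16_to_f32(0x3F80)` gives `1.0f`, and a value exactly halfway between two bf16 neighbours is rounded to the one whose lowest mantissa bit is zero.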
|
#
a92ed167 |
| 02-Jun-2022 |
Hendrik Greving <hgreving@google.com> |
[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.
Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be removed once targets set this explicitly.
Adjusts 11 lit tests to reflect slightly different behavior during DAG combine.
Differential Revision: https://reviews.llvm.org/D125247
|
#
e9d05cc7 |
| 01-Jun-2022 |
Hendrik Greving <hgreving@google.com> |
Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4."
This reverts commit 430ac5c3029c52e391e584c6d4447e6e361fae99.
Due to failures in Clang tests.
Differential Revision: https://reviews.llvm.org/D125247
|
#
430ac5c3 |
| 06-May-2022 |
Hendrik Greving <hgreving@google.com> |
[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.
Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be removed once targets set this explicitly.
Adjusts 11 lit tests to reflect slightly different behavior during DAG combine.
Differential Revision: https://reviews.llvm.org/D125247
|
#
cd19af74 |
| 03-May-2022 |
Matthias Braun <matze@braunis.de> |
Avoid 8 and 16bit switch conditions on x86
This adds a `TargetLoweringBase::getSwitchConditionType` callback to give targets a chance to control the type used in `CodeGenPrepare::optimizeSwitchInst`.
Implement callback for X86 to avoid i8 and i16 types where possible as they often incur extra zero-extensions.
This is NFC for non-X86 targets.
Differential Revision: https://reviews.llvm.org/D124894
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
170a9031 |
| 10-Oct-2021 |
Serge Pavlov <sepavloff@gmail.com> |
Intrinsic for checking floating point class
This change introduces a new intrinsic, `llvm.is.fpclass`, which checks if the provided floating-point number belongs to any of the specified value classes. The intrinsic implements the checks made by C standard library functions `isnan`, `isinf`, `isfinite`, `isnormal`, `issubnormal`, `issignaling` and corresponding IEEE-754 operations.
The primary motivation for this intrinsic is the support of strict FP mode. In this mode using compare instructions or other FP operations is not possible, because if the value is a signaling NaN, floating-point exception `Invalid` is raised, but the aforementioned functions must never raise exceptions.
Currently there are two solutions for this problem, both are implemented partially. One of them is using integer operations to implement the check. It was implemented in https://reviews.llvm.org/D95948 for `isnan`. It solves the problem of exceptions, but offers one solution for all targets, although some can do the check in more efficient way.
The other, implemented in https://reviews.llvm.org/D96568, introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects a target specific code into IR to implement `isnan` and some other functions. It is convenient for targets that have dedicated instruction to determine FP data class. However using target-specific intrinsic complicates analysis and can prevent some optimizations.
A special intrinsic for value class checks allows representing data class tests with enough flexibility. During IR transformations it represents the check in target-independent way and saves it from undesired transformations. In the instruction selector it allows efficient lowering depending on the used target and mode.
This implementation is an extended variant of `llvm.isnan` introduced in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic support. Target-specific treatment will be implemented in separate patches.
Differential Revision: https://reviews.llvm.org/D112025
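The "integer operations" approach mentioned in this message can be sketched outside the compiler as follows. This is only an illustration of the idea, assuming IEEE-754 binary32 `float`; all function names here are hypothetical and it is not the lowering LLVM emits for `llvm.is.fpclass`. Because the checks inspect the bit pattern with integer compares only, they raise no floating-point exception even for a signaling NaN.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for building and inspecting raw binary32 patterns.
float float_from_bits(std::uint32_t bits) {
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

std::uint32_t abs_bits(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits & 0x7FFFFFFFu;  // clear the sign bit
}

// NaN: exponent all ones and a non-zero mantissa.
bool is_nan_bits(float value) { return abs_bits(value) > 0x7F800000u; }

// Infinity: exponent all ones and a zero mantissa.
bool is_inf_bits(float value) { return abs_bits(value) == 0x7F800000u; }

// Subnormal: exponent zero and a non-zero mantissa.
bool is_subnormal_bits(float value) {
  std::uint32_t a = abs_bits(value);
  return a != 0u && a < 0x00800000u;
}
```

A single class test such as `is_nan_bits(x) || is_inf_bits(x)` corresponds to asking whether a value belongs to any of several classes at once, which is the kind of query the message describes.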
|
#
e90110e6 |
| 12-Apr-2022 |
Shao-Ce SUN <sunshaoce@iscas.ac.cn> |
[NFC][CodeGen] Use ArrayRef in TargetLowering functions
This patch is similar to D122557, adding an `ArrayRef` version for `setOperationAction`, `setLoadExtAction`, `setCondCodeAction`, `setLibcallName`.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123467
|
#
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
|
#
a278250b |
| 10-Mar-2022 |
Nico Weber <thakis@chromium.org> |
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
|
#
7f230fee |
| 07-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
after: 1061034926 before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
#
7b85f0f3 |
| 02-Mar-2022 |
Paul Robinson <Paul.Robinson@sony.com> |
[PS4] isPS4 and isPS4CPU are not meaningfully different
|
#
1df8efae |
| 19-Feb-2022 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG][X86] Support f16 in getReciprocalOpName.
If the "reciprocal-estimates" attribute is present and it doesn't contain "all", "none", or "default", we previously crashed on f16 operations.
This patch adds an 'h' suffix to prevent the crash.
I've added simple tests that just enable the estimate for all vec-sqrt and one test case that explicitly tests the new 'h' suffix to override the default steps.
There may be some frontend change needed too, but I haven't checked that yet.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D120158
|
#
0d59a54c |
| 18-Feb-2022 |
Craig Topper <craig.topper@sifive.com> |
Revert "[SelectionDAG][X86] Support f16 in getReciprocalOpName."
This reverts commit 86b5e256628ae49193ad9962626a73bafeda2883.
This wasn't supposed to be committed yet.
|
#
86b5e256 |
| 18-Feb-2022 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG][X86] Support f16 in getReciprocalOpName.
If the "reciprocal-estimates" attribute is present and it doesn't contain "all", "none", or "default", we previously crashed on f16 operations.
This patch adds an 'h' suffix to prevent the crash.
I've added simple tests that just enable the estimate for all vec-sqrt and one test case that explicitly tests the new 'h' suffix to override the default steps.
There may be some frontend change needed too, but I haven't checked that yet.
Differential Revision: https://reviews.llvm.org/D120158
|
#
4072e362 |
| 11-Feb-2022 |
David Green <david.green@arm.com> |
[ISel] Port AArch64 HADD and RHADD to ISel
This ports the aarch64 combines for HADD and RHADD over to DAG combine, so that they can be used in more architectures (notably MVE in a followup patch). They are renamed to AVGFLOOR and AVGCEIL in the process, to avoid confusion with instructions such as X86 hadd. The code was also rewritten slightly to remove the AArch64 idiosyncrasies.
The general pattern for an AVGFLOORS is
  %xe = sext i8 %x to i32
  %ye = sext i8 %y to i32
  %a = add i32 %xe, %ye
  %r = lshr i32 %a, 1
  %t = trunc i32 %r to i8
An AVGFLOORU is the equivalent pattern with zext. Because of the truncate, lshr and ashr give the same result here, as the top bits are not demanded. An AVGCEIL also rounds up, so it includes an extra add of 1.
Differential Revision: https://reviews.llvm.org/D106237
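The extend, add, shift, truncate pattern above can be mirrored in plain C++ as a sanity check. This is a sketch under stated assumptions: the function names are hypothetical, the arithmetic is done on unsigned 32-bit values so the shift is a logical one (like `lshr`), and the final narrowing to `int8_t` relies on modular conversion, which is guaranteed since C++20 and is what mainstream compilers did before that.

```cpp
#include <cstdint>

// Signed floor average: sext both inputs, add, logical shift right, truncate.
// (Hypothetical name; mirrors the AVGFLOORS pattern, not LLVM's own code.)
std::int8_t avg_floor_s(std::int8_t x, std::int8_t y) {
  std::uint32_t sum = static_cast<std::uint32_t>(static_cast<std::int32_t>(x)) +
                      static_cast<std::uint32_t>(static_cast<std::int32_t>(y));
  // The bit shifted in at the top is dropped by the truncate below,
  // which is why lshr and ashr are interchangeable in the pattern.
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(sum >> 1));
}

// Signed ceiling average: same pattern with an extra add of 1 before the shift.
std::int8_t avg_ceil_s(std::int8_t x, std::int8_t y) {
  std::uint32_t sum = static_cast<std::uint32_t>(static_cast<std::int32_t>(x)) +
                      static_cast<std::uint32_t>(static_cast<std::int32_t>(y)) + 1u;
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(sum >> 1));
}

// Unsigned floor average: the same pattern with zero extension.
std::uint8_t avg_floor_u(std::uint8_t x, std::uint8_t y) {
  std::uint32_t sum = static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(y);
  return static_cast<std::uint8_t>(sum >> 1);
}
```

Widening to 32 bits before the add is what keeps the intermediate sum from overflowing; for example, averaging 127 with 127 yields 127 rather than wrapping.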
|
#
f15014ff |
| 26-Jan-2022 |
Benjamin Kramer <benny.kra@googlemail.com> |
Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef8206320769ad31422a803a0d6de6077fd231d2.
- It conflicts with the existing llvm::size in STLExtras, which will now never be called.
- Calling it without llvm:: breaks C++17 compat
|
#
ef820632 |
| 26-Jan-2022 |
serge-sans-paille <sguelton@redhat.com> |
Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a consequence, move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).
|
#
fffd663c |
| 05-Jan-2022 |
David Green <david.green@arm.com> |
[CodeGen] Initialize MaxBytesForAlignment in TargetLoweringBase::TargetLoweringBase.
This appears to be missing from D114590, causing sanitizer errors.
|
#
73d92faa |
| 01-Dec-2021 |
Nicholas Guy <nicholas.guy@arm.com> |
[CodeGen] Emit alignment "Max Skip" operand
The current AsmPrinter has support to emit the "Max Skip" operand (the 3rd of .p2align), however has no support for it to actually be specified. Adding MaxBytesForAlignment to MachineBasicBlock provides this capability on a per-block basis. Leaving the value as default (0) causes no observable differences in behaviour.
Differential Revision: https://reviews.llvm.org/D114590
|
#
26bd534a |
| 17-Dec-2021 |
Kazu Hirata <kazu@google.com> |
[llvm] Use none_of instead of !any_of (NFC)
|
#
8d77555b |
| 12-Nov-2021 |
David Sherwood <david.sherwood@arm.com> |
[Analysis] Ensure getTypeLegalizationCost returns a simple VT for TypeScalarizeScalableVector
When getTypeConversion returns TypeScalarizeScalableVector we were sometimes returning a non-simple type from getTypeLegalizationCost. However, many callers depend upon this being a simple type and will crash if not. This patch changes getTypeLegalizationCost to ensure that we always return a sensible simple VT. If the vector type contains unusual integer types, e.g. <vscale x 2 x i3>, then we just set the type to MVT::i64 as a reasonable default.
A test has been added here that demonstrates the vectoriser can correctly calculate the cost of vectorising a "zext i3 to i64" instruction with a VF=vscale x 1:
Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll
Differential Revision: https://reviews.llvm.org/D113777
|
#
1cb9f37a |
| 05-Nov-2021 |
Alfredo Dal'Ava Junior <alfredo.junior@eldorado.org.br> |
[FreeBSD] Do not mark __stack_chk_guard as dso_local
This symbol is defined in libc.so so it is definitely not DSO-Local. Marking it as such causes problems on some platforms (such as PowerPC).
Differential revision: https://reviews.llvm.org/D109090
|
#
d51e3a21 |
| 25-Oct-2021 |
Craig Topper <craig.topper@sifive.com> |
[LegalizeTypes][TargetLowering] Merge getShiftAmountTyForConstant into TargetLowering::getShiftAmountTy.
getShiftAmountTyForConstant is a special helper that changes the shift amount to i32 if the type chosen by TargetLowering::getShiftAmountTy can't represent all possible values. This is needed to satisfy an assert in SelectionDAG::getNode.
It requires additional consideration to know when this helper should be used. I'm not sure that we are always using it when we should.
This patch merges the getShiftAmountTyForConstant handling into TargetLowering::getShiftAmountTy so we don't need to think about it anymore.
Technically this may slightly increase compile times since the majority of callers of getShiftAmountTy won't need this. Hopefully, this isn't an issue in practice.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112469
|
#
3649fb14 |
| 09-Oct-2021 |
Dávid Bolvanský <david.bolvansky@gmail.com> |
Fixed some errors detected by PVS Studio
|
#
a0a49351 |
| 08-Oct-2021 |
Arthur Eubanks <aeubanks@google.com> |
Make more places that use alignment use uint64_t
Followup to D110451.
|