#
e15d67cf |
| 10-Jul-2024 |
Daniel Kiss <daniel.kiss@arm.com> |
[Clang][ARM][AArch64] Alway emit protection attributes for functions. (#82819)
So far branch protection, sign return address, guarded control stack
attributes are
only emitted as module flags to i
[Clang][ARM][AArch64] Alway emit protection attributes for functions. (#82819)
So far branch protection, sign return address, guarded control stack
attributes are
only emitted as module flags to indicate the functions need to be
generated with
those features.
The problem is in case of an LTO build the module flags are merged with
the `min`
rule which means if one of the module is not build with sign return
address then the features
will be turned off for all functions. Due to the functions take the
branch-protection and
sign-return-address features from the module flags. The
sign-return-address is
function level option therefore it is expected functions from files that
is
compiled with -mbranch-protection=pac-ret to be protected.
The inliner might inline functions with different set of flags as it
doesn't consider
the module flags.
This patch adds the attributes to all functions and drops the checking
of the module flags
for the code generation.
Module flag is still used for generating the ELF markers.
Also drops the "true"/"false" values from the
branch-protection-enforcement,
branch-protection-pauth-lr, guarded-control-stack attributes as presence
of the
attribute means it is on absence means off and no other option.
show more ...
|
#
a48305e0 |
| 05-Jul-2024 |
Shengchen Kan <shengchen.kan@intel.com> |
[X86][CodeGen] Convert masked.load/store to CLOAD/CSTORE node only when vector size = 1
This fixes the crash when building llvm-test-suite with avx512f + cf.
|
#
c60b9307 |
| 05-Jul-2024 |
Shengchen Kan <shengchen.kan@intel.com> |
Revert "[X86][CodeGen] Convert masked.load/store to CLOAD/CSTORE node only when vector size = 1"
This reverts commit 74984dee51307779a3eab10a8cd6102be37e1081.
It caused AArch64 test sve-nontemporal
Revert "[X86][CodeGen] Convert masked.load/store to CLOAD/CSTORE node only when vector size = 1"
This reverts commit 74984dee51307779a3eab10a8cd6102be37e1081.
It caused AArch64 test sve-nontemporal-masked-ldst.ll to fail.
show more ...
|
#
74984dee |
| 05-Jul-2024 |
Shengchen Kan <shengchen.kan@intel.com> |
[X86][CodeGen] Convert masked.load/store to CLOAD/CSTORE node only when vector size = 1
This fixes the crash when building llvm-test-suite with avx512f + cf.
|
#
6222c8f0 |
| 04-Jul-2024 |
Nicholas Guy <67685292+NickGuy-Arm@users.noreply.github.com> |
[IR][LangRef] Add partial reduction add intrinsic (#94499)
Adds the llvm.experimental.partial.reduce.add.* overloaded intrinsic,
this intrinsic represents add reductions that result in a narrower
[IR][LangRef] Add partial reduction add intrinsic (#94499)
Adds the llvm.experimental.partial.reduce.add.* overloaded intrinsic,
this intrinsic represents add reductions that result in a narrower
vector.
show more ...
|
#
58fd3bea |
| 02-Jul-2024 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Use range-based for loops (NFC) (#97467)
|
#
23db37c5 |
| 02-Jul-2024 |
Igor Kudrin <ikudrin@accesssoftek.com> |
[CodeGen] Do not emit TRAP for `unreachable` after `@llvm.trap` (#94570)
With `--trap-unreachable`, `clang` can emit double `TRAP` instructions
for code that contains a call to `__builtin_trap()`:
[CodeGen] Do not emit TRAP for `unreachable` after `@llvm.trap` (#94570)
With `--trap-unreachable`, `clang` can emit double `TRAP` instructions
for code that contains a call to `__builtin_trap()`:
```
> cat test.c
void test() { __builtin_trap(); }
> clang test.c --target=x86_64 -mllvm --trap-unreachable -O1 -S -o -
...
test:
...
ud2
ud2
...
```
`SimplifyCFGPass` inserts `unreachable` after a call to a `noreturn`
function, and later this instruction causes `TRAP/G_TRAP` to be emitted
in `SelectionDAGBuilder::visitUnreachable()` or
`IRTranslator::translateUnreachable()` if
`TargetOptions.TrapUnreachable` is set.
The patch checks the instruction before `unreachable` and avoids
inserting an additional trap.
show more ...
|
#
1488fb41 |
| 28-Jun-2024 |
Daniil Kovalev <dkovalev@accesssoftek.com> |
[PAC][AArch64] Lower ptrauth constants in code (#96879)
This re-applies #94241 after fixing buildbot failure, see
https://lab.llvm.org/buildbot/#/builders/51/builds/570
According to standard, `c
[PAC][AArch64] Lower ptrauth constants in code (#96879)
This re-applies #94241 after fixing buildbot failure, see
https://lab.llvm.org/buildbot/#/builders/51/builds/570
According to standard, `constexpr` variables and `const` variables
initialized with constant expressions can be used in lambdas w/o
capturing - see https://en.cppreference.com/w/cpp/language/lambda.
However, MSVC used on buildkite seems to ignore that rule and does not
allow using such uncaptured variables in lambdas: we have "error C3493:
'Mask16' cannot be implicitly captured because no default capture mode
has been specified" - see
https://buildkite.com/llvm-project/github-pull-requests/builds/73238
Explicitly capturing such a variable, however, makes buildbot fail with
"error: lambda capture 'Mask16' is not required to be captured for this
use [-Werror,-Wunused-lambda-capture]" - see
https://lab.llvm.org/buildbot/#/builders/51/builds/570.
Fix both cases by using `0xffff` value directly instead of giving a name
to it.
Original PR description below.
Depends on #94240.
Define the following pseudos for lowering ptrauth constants in code:
- non-`extern_weak`:
- no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
PAC;
- GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
during relocation resolving instead of a GOT slot.
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
show more ...
|
#
15fc801c |
| 27-Jun-2024 |
Shengchen Kan <shengchen.kan@intel.com> |
[X86][CodeGen] Support hoisting load/store with conditional faulting (#96720)
1. Add TTI interface for conditional load/store.
2. Mark 1 x i16/i32/i64 masked load/store legal so that it's not
l
[X86][CodeGen] Support hoisting load/store with conditional faulting (#96720)
1. Add TTI interface for conditional load/store.
2. Mark 1 x i16/i32/i64 masked load/store legal so that it's not
legalized in pass scalarize-masked-mem-intrin.
3. Visit 1 x i16/i32/i64 masked load/store to build a target-specific
CLOAD/CSTORE node to avoid error in
`DAGTypeLegalizer::ScalarizeVectorResult`.
4. Combine DAG to simplify the nodes for CLOAD/CSTORE.
5. Lower CLOAD/CSTORE to CFCMOV by pattern match.
This is CodeGen part of #95515
show more ...
|
#
99251f5a |
| 27-Jun-2024 |
Daniil Kovalev <dkovalev@accesssoftek.com> |
Revert "[PAC][AArch64] Lower ptrauth constants in code (#94241)" (#96865)
This reverts #94241.
See buildbot failure
https://lab.llvm.org/buildbot/#/builders/51/builds/570
|
#
b5cc19e5 |
| 27-Jun-2024 |
Daniil Kovalev <dkovalev@accesssoftek.com> |
[PAC][AArch64] Lower ptrauth constants in code (#94241)
Depends on #94240.
Define the following pseudos for lowering ptrauth constants in code:
- non-`extern_weak`:
- no GOT load needed: `M
[PAC][AArch64] Lower ptrauth constants in code (#94241)
Depends on #94240.
Define the following pseudos for lowering ptrauth constants in code:
- non-`extern_weak`:
- no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
PAC;
- GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
during relocation resolving instead of a GOT slot.
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
show more ...
|
#
f2f18459 |
| 21-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
Revert "Intrinsic: introduce minimumnum and maximumnum (#93841)"
As far as I can tell, this pull request was not approved, and did not go through an RFC on discourse.
This reverts commit 8988148003
Revert "Intrinsic: introduce minimumnum and maximumnum (#93841)"
As far as I can tell, this pull request was not approved, and did not go through an RFC on discourse.
This reverts commit 89881480030f48f83af668175b70a9798edca2fb. This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.
show more ...
|
#
89881480 |
| 21-Jun-2024 |
YunQiang Su <syq@debian.org> |
Intrinsic: introduce minimumnum and maximumnum (#93841)
Currently, on different platform, the behaivor of llvm.minnum is
different if one operand is sNaN:
When we compare sNaN vs NUM:
ARM/AAr
Intrinsic: introduce minimumnum and maximumnum (#93841)
Currently, on different platform, the behaivor of llvm.minnum is
different if one operand is sNaN:
When we compare sNaN vs NUM:
ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN.
RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86:
Returns NUM but not same with IEEE754-2019's minimumNumber as
+0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.
So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow
IEEE754-2019's minimumNumber/maximumNumber.
Half-fix: #93033
show more ...
|
#
3c8f3b91 |
| 19-Jun-2024 |
Heejin Ahn <aheejin@gmail.com> |
[WebAssembly] Treat 'rethrow' as terminator in custom isel (#95967)
`rethrow` instruction is a terminator, but when when its DAG is built in
`SelectionDAGBuilder` in a custom routine, it was NOT tr
[WebAssembly] Treat 'rethrow' as terminator in custom isel (#95967)
`rethrow` instruction is a terminator, but when when its DAG is built in
`SelectionDAGBuilder` in a custom routine, it was NOT treated as such.
```ll
rethrow: ; preds = %catch.start
invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ]
to label %unreachable unwind label %ehcleanup
ehcleanup: ; preds = %rethrow, %catch.dispatch
%tmp = phi i32 [ 10, %catch.dispatch ], [ 20, %rethrow ]
...
```
In this bitcode, because of the `phi`, a `CONST_I32` will be created in
the `rethrow` BB. Without this patch, the DAG for the `rethrow` BB looks
like this:
```
t0: ch,glue = EntryToken
t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
t5: ch = llvm.wasm.rethrow t0, TargetConstant:i32<12161>
t6: ch = TokenFactor t3, t5
t8: ch = br t6, BasicBlock:ch<unreachable 0x562532e43c50>
```
Note that `CopyToReg` and `llvm.wasm.rethrow` don't have dependence so
either can come first in the selected code, which can result in the code
like
```mir
bb.3.rethrow:
RETHROW 0, implicit-def dead $arguments
%9:i32 = CONST_I32 20, implicit-def dead $arguments
BR %bb.6, implicit-def dead $arguments
```
After this patch, `llvm.wasm.rethrow` is treated as a terminator, and
the DAG will look like
```
t0: ch,glue = EntryToken
t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
t5: ch = llvm.wasm.rethrow t3, TargetConstant:i32<12161>
t7: ch = br t5, BasicBlock:ch<unreachable 0x5555e3d32c70>
```
Note that now `rethrow` takes a token from `CopyToReg`, so `rethrow` has
to come after `CopyToReg`. And the resulting code will be
```mir
bb.3.rethrow:
%9:i32 = CONST_I32 20, implicit-def dead $arguments
RETHROW 0, implicit-def dead $arguments
BR %bb.6, implicit-def dead $arguments
```
I'm not very familiar with the internals of `getRoot` vs.
`getControlRoot`, but other terminator instructions seem to use the
latter, and using it for `rethrow` too worked.
show more ...
|
#
62b5196b |
| 18-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Fix asserting on invalid inline asm constraints (#95935)
|
#
995835fe |
| 17-Jun-2024 |
Poseydon42 <vvmposeydon@gmail.com> |
[SelectionDAG] Add support for the 3-way comparison intrinsics [US]CMP (#91871)
This PR adds initial support for the `scmp`/`ucmp` 3-way comparison
intrinsics in the SelectionDAG. Some of the expan
[SelectionDAG] Add support for the 3-way comparison intrinsics [US]CMP (#91871)
This PR adds initial support for the `scmp`/`ucmp` 3-way comparison
intrinsics in the SelectionDAG. Some of the expansions/lowerings
are not optimal yet.
show more ...
|
#
85a7bba7 |
| 16-Jun-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Cleanup MC/DC intrinsics for #82448 (#95496)
3rd arg of `tvbitmap.update` was made unused. Remove 3rd arg.
Sweep `condbitmap.update`, since it is no longer used.
|
Revision tags: llvmorg-18.1.8 |
|
#
ae71609e |
| 14-Jun-2024 |
Andreas Jonson <andjo403@hotmail.com> |
[SDAG] Lower range attribute to AssertZext (#95450)
Add support for range attributes on calls, in addition to range metadata.
|
#
785dc76c |
| 14-Jun-2024 |
Jon Roelofs <jonathan_roelofs@apple.com> |
[llvm][SelectionDAG] Fix up chains in lowerInvokeable. rdar://113994760 (#94004)
lowerInvokeable wasn't updating the returned chain after emitting the
lowerEndEH, which caused SwiftErrorVal-handlin
[llvm][SelectionDAG] Fix up chains in lowerInvokeable. rdar://113994760 (#94004)
lowerInvokeable wasn't updating the returned chain after emitting the
lowerEndEH, which caused SwiftErrorVal-handling code to re-set the DAG
root, and thus accidentally skip the EH_LABEL node it was supposed to
have addeed. After fixing that, a few places needed to be adjusted that
assume the specific shape of the returned DAG.
Fixes: #64826
Fixes: rdar://113994760
show more ...
|
#
d4a01549 |
| 13-Jun-2024 |
Jay Foad <jay.foad@amd.com> |
[llvm-project] Fix typo "seperate" (#95373)
|
#
0605e984 |
| 07-Jun-2024 |
Quentin Colombet <quentin.colombet@gmail.com> |
[SDISel][Builder] Fix the instantiation of <1 x bfloat|half> (#94591)
Prior to this change, `SelectionDAGBuilder` was producing `SDNode`s of
the form: `f32 = extract_vector_elt <1 x bfloat|half>, i
[SDISel][Builder] Fix the instantiation of <1 x bfloat|half> (#94591)
Prior to this change, `SelectionDAGBuilder` was producing `SDNode`s of
the form: `f32 = extract_vector_elt <1 x bfloat|half>, i32 0` when
lowering phis of `<1 x bfloat|half>` and running on a target that
promotes this type to `f32` (like some x86 or AMDGPU targets.)
This construct is invalid since this type of node only allows type
extensions for integer types.
It went unotice because the `extract_vector_elt` node is later broken
down in `bitcast` followed by `bf16_to_fp|fp_extend`. However, when the
argument of the phi is a constant we were crashing because the existing
code would try to constant fold this `extract_vector_elt` into a
any_ext.
This patch fixes this by using a proper decomposition for `<1 x
bfloat|half>`:
```
bfloat|half = bitcast <1 x blfoat|half>
float = fp_extend bfloat|half
```
This change should be NFC for the non-constant-folding cases and fix the
SDISel crashes (reported in
https://github.com/llvm/llvm-project/issues/94449) for the folding
cases.
Note: The change on the arm test is a missing fp16 to f32 constant folding
exposed by this patch. I'll push a separate improvement for that.
show more ...
|
#
ea32197d |
| 06-Jun-2024 |
Orlando Cazalet-Hyams <orlando.hyams@sony.com> |
[DebugInfo][SelectionDAG] Fix position of salvaged 'dangling' DBG_VALUEs (#94458)
`SelectionDAGBuilder::handleDebugValue` has a parameter `Order` which
represents the insert-at position for the new
[DebugInfo][SelectionDAG] Fix position of salvaged 'dangling' DBG_VALUEs (#94458)
`SelectionDAGBuilder::handleDebugValue` has a parameter `Order` which
represents the insert-at position for the new DBG_VALUE. Prior to this patch
`SelectionDAGBuilder::SDNodeOrder` is used instead of the `Order` parameter.
The only code-paths where `Order != SDNodeOrder` are the two calls calls to
`handleDebugValue` from `salvageUnresolvedDbgValue`.
`salvageUnresolvedDbgValue` is called from `resolveOrClearDbgInfo` and
`dropDanglingDebugInfo`. The former is called after SelectionDAG completes one
block.
Some dbg.values can't be lowered to DBG_VALUEs right away. These get recorded
as 'dangling' - their order-number is saved - and get salvaged later through
`dropDanglingDebugInfo`, or if we've still got dangling debug info once the
whole block has been emitted, through `resolveOrClearDbgInfo`. Their saved
order-number is passed to `handleDebugValue`.
Prior to this patch, DBG_VALUEs inserted using these functions are inserted at
the "current" `SDNodeOrder` rather than the intended position that is passed to
the function.
Fix and add test.
show more ...
|
Revision tags: llvmorg-18.1.7 |
|
#
1d874335 |
| 05-Jun-2024 |
Farzon Lotfi <1802579+farzonl@users.noreply.github.com> |
[x86] Add tan intrinsic part 4 (#90503)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://disc
[x86] Add tan intrinsic part 4 (#90503)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
Much of this change was following how G_FSIN and G_FCOS were used.
Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan
intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to
`ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for
`FTAN` and `STRICT_FTAN`
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall
mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map
`G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`,
`strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN`
and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a
vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to
the list of floating point math operations also associate `G_FTAN` with
the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math
operation common behaviors.
- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp - List the function
expansion operations for `FTAN` and `STRICT_FTAN`. Also define both
opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN`
and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define
`SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN`
as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define
`FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an
intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` Map
`LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map
`Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for
`Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan`
and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` -Define a Tan128 libcall and
ISD::FTAN as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for
tan intrinsic
resolves https://github.com/llvm/llvm-project/issues/70082
show more ...
|
#
deab451e |
| 04-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-const
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
show more ...
|
#
0b4af3a5 |
| 03-Jun-2024 |
Jon Roelofs <jonathan_roelofs@apple.com> |
[llvm][SelectionDAG] Relax llvm.ptrmask's size check on arm64_32 (#94125)
Since pointers in memory, as well as the index type are both 32 bits,
but in registers pointers are 64 bits, the mask gener
[llvm][SelectionDAG] Relax llvm.ptrmask's size check on arm64_32 (#94125)
Since pointers in memory, as well as the index type are both 32 bits,
but in registers pointers are 64 bits, the mask generated by
llvm.ptrmask needs to be zero-extended.
Fixes: #94075
Fixes: rdar://125263567
show more ...
|