#
2fad6e69 |
| 31-Aug-2023 |
Nick Desaulniers <ndesaulniers@google.com> |
[InlineAsm] wrap Kind in enum class NFC
Should add some minor type safety to the use of this information, since there's quite a bit of metadata being laundered through an `unsigned`.
I'm looking to potentially add more bitfields to that `unsigned`, but I find InlineAsm's big ol' bag of enum values and usage of `unsigned` confusing, type-unsafe, and un-ergonomic. These can probably be better abstracted.
I think the lack of static_cast outside of InlineAsm indicates the prior code smell fixed here.
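The type-safety argument above can be illustrated with a small sketch. All names, values, and the bit layout here are hypothetical and are not taken from InlineAsm; the point is only that an `enum class` forces an explicit `static_cast` wherever the kind is packed into or unpacked from a raw integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operand kinds; the values are illustrative, not LLVM's.
enum class Kind : uint8_t { RegUse = 1, RegDef = 2, Imm = 5, Mem = 6 };

// Pack a kind (low 3 bits) and an operand count (next 13 bits) into one word.
// The layout is an assumption made for this sketch.
constexpr uint32_t packFlag(Kind K, unsigned NumOps) {
  return static_cast<uint32_t>(K) | (NumOps << 3);
}

// Unpacking is the only other place where a cast is needed.
constexpr Kind getKind(uint32_t Flag) { return static_cast<Kind>(Flag & 7); }
constexpr unsigned getNumOps(uint32_t Flag) { return (Flag >> 3) & 0x1fff; }

// Callers compare against named enumerators instead of bare integers.
constexpr bool isMemKind(uint32_t Flag) { return getKind(Flag) == Kind::Mem; }
```

With this shape, code outside the packing helpers never needs a cast, and passing an arbitrary `unsigned` where a `Kind` is expected fails to compile.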
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D159242
|
#
1b12427c |
| 25-Aug-2023 |
LiaoChunyu <chunyu@iscas.ac.cn> |
[VP][RISCV] Add vp.is.fpclass and RISC-V support
There is no vp.fpclass after FCLASS_VL (D151176); this patch adds support for it.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D152993
|
#
35f4ef1f |
| 23-Aug-2023 |
Felipe de Azevedo Piovezan <fpiovezan@apple.com> |
[SelectionDAG][DebugInfo] Handle entry_value dbg.value DIExprs earlier
When SelectionDAG converts dbg.value intrinsics, it first ensures we have already generated code for the value operand of the intrinsic. The rationale is that if we haven't needed to generate code for this value, a debug value should not be what causes its generation.
For example, if the first use of the physical register of an argument is a dbg.value, we are going to hit this code path. However, this is irrelevant for entry value expressions: by definition we are not interested in the _current_ value of the physical register, but rather in its value at the start of the function. To deal with this, this patch changes lowering to handle this case as early as possible.
Differential Revision: https://reviews.llvm.org/D158649
|
#
6862f0fa |
| 24-Aug-2023 |
Serge Pavlov <sepavloff@gmail.com> |
[FPEnv] Intrinsics for access to FP control modes
The change introduces intrinsics 'get_fpmode', 'set_fpmode' and 'reset_fpmode'. They manage all target dynamic floating-point control modes, which include, for instance, rounding direction, precision, treatment of denormals and so on. The intrinsics do the same operations as the C library functions 'fegetmode' and 'fesetmode'. By default they are lowered to calls to these functions.
Two main use cases are supported by this implementation.
1. Local modification of the control modes. In this case the code usually has a pattern (in pseudocode):
    saved_modes = get_fpmode()
    set_fpmode(<new_modes>)
    ...
    <do operations under the new modes>
    ...
    set_fpmode(saved_modes)
In the case when it is known that the current FP environment is default, the code may be shorter:
    set_fpmode(<new_modes>)
    ...
    <do operations under the new modes>
    ...
    reset_fpmode()
Such patterns appear not only in user code but also in implementations of various FP controlling pragmas. In particular, the implementation of `#pragma STDC FENV_ROUND` requires similar code if the target does not support static rounding mode.
2. Portable control of FP modes. Usually FP control modes are set by writing to some control register. Different targets lay out this register differently, and the way the register is accessed may also differ. Using a set of target-specific definitions for the control register bits together with these intrinsic functions provides a sufficiently portable way to handle control modes across a wide range of hardware.
This change defines only the LLVM intrinsic functions, which implement the access required for the aforementioned use cases.
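The save/modify/restore pattern from use case 1 can be sketched at the C++ level. This is only an analogy: it uses the standard `std::fegetround`/`std::fesetround` calls, which cover just the rounding direction (one of the control modes), rather than the intrinsics themselves or `fegetmode`/`fesetmode`. The class and function names are hypothetical.

```cpp
#include <cassert>
#include <cfenv>

// RAII guard: save the current rounding direction, install a new one,
// and restore the saved value when the scope ends.
class ScopedRounding {
  int Saved;

public:
  explicit ScopedRounding(int NewMode) : Saved(std::fegetround()) {
    std::fesetround(NewMode);
  }
  ~ScopedRounding() { std::fesetround(Saved); }
  ScopedRounding(const ScopedRounding &) = delete;
  ScopedRounding &operator=(const ScopedRounding &) = delete;
};

// Returns the rounding mode observed while the guard is active.
inline int modeInsideScope(int NewMode) {
  ScopedRounding Guard(NewMode);
  return std::fegetround();
}
```

Tying the restore to a destructor keeps the saved mode from leaking past an early return, which is the same property the `get_fpmode`/`set_fpmode` pairing is meant to provide.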
Differential Revision: https://reviews.llvm.org/D82525
|
#
56606520 |
| 14-Aug-2023 |
Paul Walker <paul.walker@arm.com> |
[SelectionDAG] Use TypeSize variant of ComputeValueVTs to compute correct offsets for scalable aggregate types.
Differential Revision: https://reviews.llvm.org/D157872
|
#
9deee6bf |
| 11-Aug-2023 |
Nikita Popov <npopov@redhat.com> |
[SDAG] Don't transfer !range metadata without !noundef to SDAG (PR64589)
D141386 changed the semantics of !range metadata to return poison on violation. If !range is combined with !noundef, violation is immediate UB instead, matching the old semantics.
In theory, these IR semantics should also carry over into SDAG. In practice, DAGCombine has at least one key transform that is invalid in the presence of poison, namely the conversion of logical and/or to bitwise and/or (https://github.com/llvm/llvm-project/blob/c7b537bf0923df05254f9fa4722b298eb8f4790d/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L11252). Ideally, we would fix this transform, but this will require substantial work to avoid codegen regressions.
In the meantime, avoid transferring !range metadata without !noundef, effectively restoring the old !range metadata semantics on the SDAG layer.
Fixes https://github.com/llvm/llvm-project/issues/64589.
Differential Revision: https://reviews.llvm.org/D157685
|
#
a91a4d93 |
| 11-Aug-2023 |
Paul Walker <paul.walker@arm.com> |
[NFC][SelectionDAGBuilder] Use getObjectPtrOffset in place of discrete nodes.
Some prep work to make aggregate loads and stores TypeSize aware.
|
#
b7e6e568 |
| 07-Aug-2023 |
Paul Walker <paul.walker@arm.com> |
[SelectionDAG] Fix problematic call to EVT::changeVectorElementType().
The function changeVectorElementType assumes MVT input types will result in MVT output types. There's no guarantee this is possible during early code generation, so this patch converts an instance used during initial DAG construction to instead explicitly create a new EVT.
NOTE: I could have added more MVTs, but that seemed unscalable as you can either have MVTs with 100% element count coverage or 100% bitwidth coverage, but not both.
Differential Revision: https://reviews.llvm.org/D157392
|
#
4ce7c4a9 |
| 01-Aug-2023 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
[llvm] Drop some typed pointer handling/bitcasts
Differential Revision: https://reviews.llvm.org/D157016
|
#
60b98363 |
| 31-Jul-2023 |
Simon Tatham <simon.tatham@arm.com> |
Retain all jump table range checks when using BTI.
This modifies the switch-statement generation in SelectionDAGBuilder, specifically the part that generates case clusters of type CC_JumpTable.
A table-based branch of any kind is at risk of being a JOP gadget, if it doesn't range-check the offset into the table. For some types of table branch, such as Arm TBB/TBH, the impact of this is limited because the value loaded from the table is a relative offset of limited size; for others, such as a MOV PC,Rn computed branch into a table of further branch instructions, the gadget is fully general.
When compiling for branch-target enforcement via Arm's BTI system, many of these table branch idioms use branch instructions of types that do not require a BTI instruction at the branch destination. This avoids the need to put a BTI at the start of each case handler, reducing the number of available gadgets //with// BTIs (i.e. ones which could be used by a JOP attack in spite of the BTI system). But without a range check, the use of a non-BTI-requiring branch also opens up a larger range of followup gadgets for an attacker's use.
A defence against this is to avoid optimising away the range check on the table offset, even if the compiler believes that no out-of-range value should be able to reach the table branch. (Rationale: that may be true for values generated legitimately by the program, but not those generated maliciously by attackers who have already corrupted the control flow.)
The effect of keeping the range check and branching to an unreachable block is that no actual code is generated at that block, so it will typically point at the end of the function. That may still cause some kind of unpredictable code execution (such as executing data as code, or falling through to the next function in the code section), but even if so, there will only be //one// possible invalid branch target, rather than giving an attacker the choice of many possibilities.
This defence is enabled only when branch target enforcement is in use. Without branch target enforcement, the range check is easily bypassed anyway, by branching in to a location just after it. But with enforcement, the attacker will have to enter the jump table dispatcher at the initial BTI and then go through the range check. (Or, if they don't, it's because they //already// have a general BTI-bypassing gadget.)
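The idea of keeping the range check in front of a table branch can be sketched in source-level terms. This is an analogy in C++, not the SelectionDAGBuilder code; the handlers, the table, and the fallback value are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical handlers standing in for the bodies of switch cases.
inline int handleA() { return 10; }
inline int handleB() { return 20; }
inline int handleC() { return 30; }

using Handler = int (*)();
constexpr std::array<Handler, 3> Table = {handleA, handleB, handleC};

// Dispatch through the table, keeping the bounds check even when legitimate
// callers are believed never to pass an out-of-range index.
inline int dispatch(std::size_t Index) {
  if (Index >= Table.size())
    return -1; // one fixed fallback instead of an arbitrary table read
  return Table[Index]();
}
```

An out-of-range index reaches exactly one destination here, which mirrors the argument that a retained check leaves an attacker with a single invalid target rather than a choice of many.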
Reviewed By: MaskRay, chill
Differential Revision: https://reviews.llvm.org/D155485
|
#
432338a6 |
| 11-Jul-2023 |
Amara Emerson <amara@apple.com> |
Don't assert on a non-pointer value being used for a "p" inline asm constraint.
GCC and existing codebases allow integral values to be used with this constraint. A recent change in this area, D133914, started causing asserts. Removing the assert is enough, as the rest of the code works fine.
rdar://109675485
Differential Revision: https://reviews.llvm.org/D155023
|
#
de79233b |
| 12-Jul-2023 |
Marco Elver <elver@google.com> |
[X86] Complete preservation of !pcsections in X86ISelLowering
https://reviews.llvm.org/D130883 introduced MIMetadata to simplify metadata propagation (DebugLoc and PCSections).
However, we're currently still permitting implicit conversion of DebugLoc to MIMetadata, to allow for a gradual transition and let the old code work as-is.
This manifests as lost !pcsections metadata for X86-specific lowerings, for example 128-bit atomics.
Fix the situation for X86ISelLowering by converting all BuildMI() calls to use an explicitly constructed MIMetadata.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D154986
|
#
003b58f6 |
| 27-Apr-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Add llvm.frexp intrinsic
Add an intrinsic which returns the two pieces as multiple return values. Alternatively could introduce a pair of intrinsics to separately return the fractional and exponent parts.
AMDGPU has native instructions to return the two halves, but could use some generic legalization and optimization handling. For example, we should be able to handle legalization of f16 on older targets, and for bf16. Additionally antique targets need a hardware workaround which would be better handled in the backend rather than in library code where it is now.
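As a source-level analogy for "returning the two pieces as multiple return values", the sketch below wraps the standard `std::frexp` (which returns the exponent through a pointer) so that both results come back as one value. The wrapper name is hypothetical and this is not the intrinsic's lowering.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Return both results of frexp together: {fraction, exponent},
// where X == fraction * 2^exponent and |fraction| is in [0.5, 1) for finite nonzero X.
inline std::pair<double, int> frexpPair(double X) {
  int Exp = 0;
  double Frac = std::frexp(X, &Exp);
  return {Frac, Exp};
}
```

For example, `frexpPair(8.0)` splits 8 into a fraction of 0.5 and an exponent of 4, and `std::ldexp` on the pair reconstructs the input.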
|
#
4afa2ab7 |
| 26-Jun-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV][SelectionDAGBuilder] Fix an implicit scalable TypeSize to fixed size conversion in getUniformBase.
If the index needs to be scaled by a scalable size, just give up.
Fixes #63459
Reviewed By: frasercrmck, RKSimon
Differential Revision: https://reviews.llvm.org/D153601
|
#
d22a236a |
| 24-Jun-2023 |
Youngsuk Kim <youngsuk.kim@hpe.com> |
[llvm] Replace use of Type::getPointerTo() (NFC)
Partial progress towards replacing in-tree uses of `Type::getPointerTo()`.
If `getPointerTo()` is used solely to support an unnecessary bitcast, remove the bitcast.
Reviewed By: barannikov88, nikic
Differential Revision: https://reviews.llvm.org/D153307
|
#
f9fd0062 |
| 23-Jun-2023 |
Fangrui Song <i@maskray.me> |
[XRay][AArch64] Support __xray_customevent/__xray_typedevent
`__xray_customevent` and `__xray_typedevent` are built-in functions in Clang. With -fxray-instrument, they are lowered to intrinsics llvm.xray.customevent and llvm.xray.typedevent, respectively. These intrinsics are then lowered to TargetOpcode::{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL}. The target is responsible for generating a code sequence that calls either `__xray_CustomEvent` (with 2 arguments) or `__xray_TypedEvent` (with 3 arguments).
Before patching, the code sequence is prefixed by a branch instruction that skips the rest of the code sequence. After patching (compiler-rt/lib/xray/xray_AArch64.cpp), the branch instruction becomes a NOP and the function call takes effect.
This patch implements the lowering process for {PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL} and implements the runtime.
```
// Lowering of PATCHABLE_EVENT_CALL
.Lxray_sled_N:
  b #24
  stp x0, x1, [sp, #-16]!
  x0 = reg of op0
  x1 = reg of op1
  bl __xray_CustomEvent
  ldp x0, x1, [sp], #16
```
As a result, two updated tests in compiler-rt/test/xray/TestCases/Posix/ now pass on AArch64.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D153320
|
#
81ec494c |
| 21-Jun-2023 |
Nikita Popov <npopov@redhat.com> |
[SDAGBuilder] Handle multi-part arguments in argument copy elision (PR63430)
When eliding an argument copy, we need to update the chain to ensure the argument reads are performed before later writes. However, the code doing this only handled this for the first part of the argument. If the argument had multiple parts, the chains of the later parts were dropped. Make sure we preserve all chains.
Fixes https://github.com/llvm/llvm-project/issues/63430.
|
#
43ad2e9c |
| 20-Jun-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[DAG] Add getExtOrTrunc helper. NFC.
Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.
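The shape of such a helper can be sketched on plain integers. The functions below are hypothetical stand-ins that model a value held in the low `SrcBits` of a 64-bit word; they are not the SelectionDAG methods, which operate on SDValues and value types.

```cpp
#include <cassert>
#include <cstdint>

// Mask selecting the low `Bits` bits (Bits in 1..64).
inline uint64_t lowMask(unsigned Bits) {
  return Bits >= 64 ? ~0ULL : ((1ULL << Bits) - 1);
}

// Zero-extend or truncate a SrcBits-wide value to DstBits.
inline uint64_t zextOrTrunc(uint64_t V, unsigned SrcBits, unsigned DstBits) {
  return V & lowMask(SrcBits) & lowMask(DstBits);
}

// Sign-extend or truncate a SrcBits-wide value to DstBits (SrcBits in 1..64).
inline uint64_t sextOrTrunc(uint64_t V, unsigned SrcBits, unsigned DstBits) {
  uint64_t Low = V & lowMask(SrcBits);
  uint64_t SignBit = 1ULL << (SrcBits - 1);
  uint64_t Extended = (Low ^ SignBit) - SignBit; // sign-extend to 64 bits
  return Extended & lowMask(DstBits);
}

// The wrapper: one entry point, with signedness chosen by a flag.
inline uint64_t extOrTrunc(bool IsSigned, uint64_t V, unsigned SrcBits,
                           unsigned DstBits) {
  return IsSigned ? sextOrTrunc(V, SrcBits, DstBits)
                  : zextOrTrunc(V, SrcBits, DstBits);
}
```

Call sites that already carry an `IsSigned` flag can then make one call instead of branching between the two variants.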
|
#
cdcbef1b |
| 12-Jun-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Fix typo in GET_FPENV legality check
This made GET_FPENV unusable since the DAG builder would always emit the mem version.
|
#
26bfbec5 |
| 09-Jun-2023 |
Anna Thomas <anna@azul.com> |
[Intrinsic] Introduce reduction intrinsics for minimum/maximum
This patch introduces reduction intrinsics for floating-point minimum and maximum, which have the same semantics (for NaN and signed zero) as llvm.minimum and llvm.maximum.
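The NaN and signed-zero behaviour referred to here (NaN propagates, and -0.0 orders below +0.0) can be modelled with a scalar sketch. The function names are hypothetical, and the left-to-right reduction order is an assumption of this sketch, not a statement about how the intrinsic is lowered.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Pairwise minimum in the style of llvm.minimum: any NaN input yields NaN,
// and -0.0 is treated as smaller than +0.0.
inline double minimumLike(double A, double B) {
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<double>::quiet_NaN();
  if (A == B) // includes the +0.0 / -0.0 pair, which compare equal
    return std::signbit(A) ? A : B;
  return A < B ? A : B;
}

// Fold the pairwise operation over a non-empty vector.
inline double reduceMinimum(const std::vector<double> &V) {
  assert(!V.empty() && "empty input is not modelled in this sketch");
  double Acc = V.front();
  for (std::size_t I = 1; I < V.size(); ++I)
    Acc = minimumLike(Acc, V[I]);
  return Acc;
}
```

This differs from a reduction built on `std::fmin`, which would ignore a NaN element instead of propagating it.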
Reviewed-By: nikic
Differential Revision: https://reviews.llvm.org/D152370
|
#
8d1edae9 |
| 13-Jun-2023 |
Serge Pavlov <sepavloff@gmail.com> |
Use SelectionDAGBuilder::getRoot instead of SelectionDAG::getRoot
|
#
7634905a |
| 09-Jun-2023 |
Phoebe Wang <phoebe.wang@intel.com> |
[X86][BF16] Share FP16 vector ABI with BF16
The ABI of BF16 is identical to FP16 rather than i16.
Fixes #62997
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D151710
|
#
eece6ba2 |
| 27-Apr-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics
AMDGPU has native instructions and target intrinsics for this, but these really should be subject to legalization and generic optimizations. This will enable legalization of f16->f32 on targets without f16 support.
Implement a somewhat horrible inline expansion for targets without libcall support. This could be better if we could introduce control flow (GlobalISel version not yet implemented). Support for strictfp legalization is less complete but works for the simple cases.
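To give a feel for what an inline (libcall-free) expansion of ldexp involves, here is a deliberately simplified sketch. It is not the expansion from this patch: it assumes IEEE-754 binary64 `double`, only accepts exponents for which 2^N is itself a normal number, and ignores the denormal and overflow handling that makes a real expansion complicated. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Compute X * 2^N by building 2^N directly from its bit pattern.
// Only valid for -1022 <= N <= 1023, where 2^N is a normal double.
inline double ldexpSmall(double X, int N) {
  assert(N >= -1022 && N <= 1023 && "scale factor must be a normal double");
  // Biased exponent in bits 52..62, zero mantissa, positive sign.
  uint64_t Bits = static_cast<uint64_t>(N + 1023) << 52;
  double Scale;
  std::memcpy(&Scale, &Bits, sizeof(Scale));
  return X * Scale;
}
```

A full expansion has to split large or very negative exponents into several multiplications so that intermediate scale factors stay representable, which is where most of the complexity comes from.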
|
#
eecaeb6f |
| 05-Jun-2023 |
Serge Pavlov <sepavloff@gmail.com> |
[FPEnv] Intrinsics for access to FP environment
The change implements intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'. They are used to read the floating-point environment, set it, or reset it to some default state. They perform the same actions as the C library functions 'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls to these functions.
The new intrinsics specify the FP environment as a value of integer type, which is convenient for most targets, where the FP state is the content of some register. Some targets, however, use long representations. On X86 the size of the FP environment is 256 bits, and even half of this size is not a legal integer type. To facilitate legalization in such cases, two sets of DAG nodes are used. Nodes GET_FPENV and SET_FPENV are used when the FP environment may be represented by a legal integer type. Nodes GET_FPENV_MEM and SET_FPENV_MEM treat the FP environment as a region in memory, much like `fesetenv` and `fegetenv` do. They are used when the target has a long representation for the floating-point state.
Differential Revision: https://reviews.llvm.org/D71742
|
#
09515f2c |
| 01-Jun-2023 |
Dávid Bolvanský <david.bolvansky@gmail.com> |
[SDAG] Preserve unpredictable metadata, teach X86CmovConversion to respect this metadata
Sometimes a developer would like to have more control over cmov vs branch. We have unpredictable metadata in LLVM IR, but currently it is ignored by the X86 backend. Propagate this metadata and avoid cmov->branch conversion in X86CmovConversion for cmov with this metadata.
Example:
```
int MaxIndex(int n, int *a) {
  int t = 0;
  for (int i = 1; i < n; i++) {
    // cmov is converted to branch by X86CmovConversion
    if (a[i] > a[t])
      t = i;
  }
  return t;
}

int MaxIndex2(int n, int *a) {
  int t = 0;
  for (int i = 1; i < n; i++) {
    // cmov is preserved
    if (__builtin_unpredictable(a[i] > a[t]))
      t = i;
  }
  return t;
}
```
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D118118
|