Revision tags: llvmorg-21-init |
|
#
f95a8bde |
| 27-Jan-2025 |
Momchil Velikov <momchil.velikov@arm.com> |
[AArch64] Refactor implementation of FP8 types (NFC) (#123604)
- The FP8 scalar type (`__mfp8`) was described as a vector type
- The FP8 vector types were described/assumed to have integer element
[AArch64] Refactor implementation of FP8 types (NFC) (#123604)
- The FP8 scalar type (`__mfp8`) was described as a vector type
- The FP8 vector types were described/assumed to have integer element
type (the element type ought to be `__mfp8`)
- Add support for `m` type specifier (denoting `__mfp8`) in
`DecodeTypeFromStr` and create builtin function prototypes using that
specifier, instead of `int8_t`
show more ...
|
#
d028eaae |
| 21-Jan-2025 |
Jonathan Thackray <jonathan.thackray@arm.com> |
[AArch64] Update SVE untyped intrinsics to have FP8 variants (#123585)
Update the following intrinsics to have FP8 variants:
``` c
svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx);
svui
[AArch64] Update SVE untyped intrinsics to have FP8 variants (#123585)
Update the following intrinsics to have FP8 variants:
``` c
svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx);
svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm);
svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm);
svint8_t svtbxq[_s8](svint8_t fallback, svint8_t zn, svuint8_t zm);
svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm);
```
show more ...
|
Revision tags: llvmorg-19.1.7 |
|
#
16e45b8f |
| 13-Jan-2025 |
Momchil Velikov <momchil.velikov@arm.com> |
[AArch64] Implement FP8 SVE/SME reinterpret intrinsics (#121063)
|
#
21b531ea |
| 07-Jan-2025 |
Nicholas Guy <nicholas.guy@arm.com> |
[clang][llvm][aarch64] Add aarch64_sme_in_streaming_mode intrinsic (#120265)
Replacing the extant streaming mode function call with an intrinsic
allows us to make further optimisations around it. F
[clang][llvm][aarch64] Add aarch64_sme_in_streaming_mode intrinsic (#120265)
Replacing the extant streaming mode function call with an intrinsic
allows us to make further optimisations around it. For example, if it's
called within a function that has a known streaming mode, we can remove
the dead code, and avoid the redundant conditional branch.
show more ...
|
#
db84ae3a |
| 19-Dec-2024 |
SpencerAbson <Spencer.Abson@arm.com> |
[Clang][AArch64] Add signed index/offset variants of sve2p1 qword stores (#120549)
This patch adds signed offset/index variants to the SVE2p1 quadword
store intrinsics, in accordance with
https://
[Clang][AArch64] Add signed index/offset variants of sve2p1 qword stores (#120549)
This patch adds signed offset/index variants to the SVE2p1 quadword
store intrinsics, in accordance with
https://github.com/ARM-software/acle/pull/359.
show more ...
|
Revision tags: llvmorg-19.1.6 |
|
#
c2172431 |
| 13-Dec-2024 |
Momchil Velikov <momchil.velikov@arm.com> |
[AArch64] Implements FP8 SVE intrinsics for dot-product (#118125)
This patch adds the following intrinsics:
* 8-bit floating-point dot product to single-precision.
// Only if (__ARM_FEATURE_SV
[AArch64] Implements FP8 SVE intrinsics for dot-product (#118125)
This patch adds the following intrinsics:
* 8-bit floating-point dot product to single-precision.
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) ||
__ARM_FEATURE_SSVE_FP8DOT4
svfloat32_t svdot[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn,
svmfloat8_t zm, fpm_t fpm);
svfloat32_t svdot[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn,
mfloat8_t zm, fpm_t fpm);
* 8-bit floating-point indexed dot product to single-precision.
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) ||
__ARM_FEATURE_SSVE_FP8DOT4
svfloat32_t svdot_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn,
svmfloat8_t zm,
uint64_t imm0_3, fpm_t fpm);
* 8-bit floating-point dot product to half-precision.
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) ||
__ARM_FEATURE_SSVE_FP8DOT2
svfloat16_t svdot[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn,
svmfloat8_t zm, fpm_t fpm);
svfloat16_t svdot[_n_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn,
mfloat8_t zm, fpm_t fpm);
* 8-bit floating-point indexed dot product to half-precision.
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) ||
__ARM_FEATURE_SSVE_FP8DOT2
svfloat16_t svdot_lane[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn,
svmfloat8_t zm,
uint64_t imm0_7, fpm_t fpm);
show more ...
|
Revision tags: llvmorg-19.1.5 |
|
#
ac7fe426 |
| 02-Dec-2024 |
SpencerAbson <Spencer.Abson@arm.com> |
[Clang][AArch64]Refactor typespec handling in SveEmitter.cpp (#117717)
- Switch to an enumerated type approach, which is less error-prone as we
continue to add new types. This is similar to NeonEmi
[Clang][AArch64]Refactor typespec handling in SveEmitter.cpp (#117717)
- Switch to an enumerated type approach, which is less error-prone as we
continue to add new types. This is similar to NeonEmitter.
- Fix existing faulty typespec modifiers
show more ...
|
#
e4ee970c |
| 28-Nov-2024 |
SpencerAbson <Spencer.Abson@arm.com> |
[AArch64] Implement intrinsics for F1CVTL/F2CVTL and BF1CVTL/BF2CVTL (#116959)
This patch implements the following intrinsics:
8-bit floating-point convert to deinterleaved half-precision or
BFl
[AArch64] Implement intrinsics for F1CVTL/F2CVTL and BF1CVTL/BF2CVTL (#116959)
This patch implements the following intrinsics:
8-bit floating-point convert to deinterleaved half-precision or
BFloat16.
``` c
// Variant is also available for: _bf16[_mf8]_x2
svfloat16x2_t svcvtl1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
svfloat16x2_t svcvtl2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
```
Defined in https://github.com/ARM-software/acle/pull/323
Co-authored-by: Caroline Concatto caroline.concatto@arm.com
Co-authored-by: Marian Lukac marian.lukac@arm.com
show more ...
|
Revision tags: llvmorg-19.1.4 |
|
#
63aa8cf6 |
| 17-Nov-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[NFC][Clang][TableGen] Fix file header comments (#116491)
|
#
a8a1e903 |
| 14-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[TableGen] Remove unused includes (NFC) (#116168)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3 |
|
#
508fd966 |
| 18-Oct-2024 |
CarolineConcatto <caroline.concatto@arm.com> |
[CLANG][AArch64]Add SVE tuple types for mfloat8_t (#112687)
This patch adds scalable tuple types vectors for MFloat_8 type,
according to the ACLE[1].
[1] https://github.com/ARM-software/acle.git
|
#
cb43021e |
| 17-Oct-2024 |
CarolineConcatto <caroline.concatto@arm.com> |
[CLANG]Add Scalable vectors for mfloat8_t (#101644)
This patch adds these new vector sizes for sve:
svmfloat8_t
According to the ARM ACLE PR#323[1].
[1] ARM-software/acle#323
|
Revision tags: llvmorg-19.1.2 |
|
#
f22e6d59 |
| 08-Oct-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
[Clang][AArch64] Fix checkArmStreamingBuiltin for 'sve-b16b16' (#109420)
The implementation made the assumption that any feature starting with
"sve" meant that this was an SVE feature. This is not
[Clang][AArch64] Fix checkArmStreamingBuiltin for 'sve-b16b16' (#109420)
The implementation made the assumption that any feature starting with
"sve" meant that this was an SVE feature. This is not the case for
"sve-b16b16", as this is a feature that applies to both SVE and SME.
This meant that:
```
__attribute__((target("+sme2,+sve2,+sve-b16b16")))
svbfloat16_t foo(svbfloat16_t a, svbfloat16_t b, svbfloat16_t c)
__arm_streaming {
return svclamp_bf16(a, b, c);
}
```
would result in an incorrect diagnostic saying that `svclamp_bf16` could
only be used in non-streaming functions.
show more ...
|
#
a140931b |
| 01-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[TableGen] Change `getValueAsListOfDefs` to return const pointer vector (#110713)
Change `getValueAsListOfDefs` to return a vector of const Record
pointer, and remove `getValueAsListOfConstDefs` th
[TableGen] Change `getValueAsListOfDefs` to return const pointer vector (#110713)
Change `getValueAsListOfDefs` to return a vector of const Record
pointer, and remove `getValueAsListOfConstDefs` that was added as a
transition aid.
This is a part of effort to have better const correctness in TableGen
backends:
https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
show more ...
|
#
e9dbdb20 |
| 01-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[Clang][TableGen] Change NeonEmitter to use const Record * (#110597)
This is a part of effort to have better const correctness in TableGen
backends:
https://discourse.llvm.org/t/psa-planned-ch
[Clang][TableGen] Change NeonEmitter to use const Record * (#110597)
This is a part of effort to have better const correctness in TableGen
backends:
https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
show more ...
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
0e948bfd |
| 16-Sep-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[NFC][clang][TableGen] Remove redundant llvm:: namespace qualifier (#108627)
Remove llvm:: from .cpp files, and add "using namespace llvm" if needed.
|
#
711278e2 |
| 13-Sep-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[clang][TableGen] Change SVE Emitter to use const RecordKeeper (#108503)
Change SVE Emitter to use const RecordKeeper.
This is a part of effort to have better const correctness in TableGen
backe
[clang][TableGen] Change SVE Emitter to use const RecordKeeper (#108503)
Change SVE Emitter to use const RecordKeeper.
This is a part of effort to have better const correctness in TableGen
backends:
https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
show more ...
|
#
1f70fcef |
| 06-Sep-2024 |
SpencerAbson <Spencer.Abson@arm.com> |
[Clang][AArch64] Add customisable immediate range checking to NEON (#100278)
This patch moves NEON immediate argument specification and checking to
the system currently shared by both SVE and SME.
[Clang][AArch64] Add customisable immediate range checking to NEON (#100278)
This patch moves NEON immediate argument specification and checking to
the system currently shared by both SVE and SME.
In its current form, the TableGen definition of a NEON intrinsic cannot
control how its immediate arguments are range-checked, this information
must be inferred from the name of the intrinsic by NeonEmitter, which
also assumes that any NEON instruction will only ever receive a single
immediate argument. For SVE/SME instrinsics, this information is more
conveniently supplied in the TableGen definition.
As a result, for each immediate argument, NEON instructions must define
- The index of the immediate argument to be checked
- The type of immediate range check to be performed,
(e.g., ImmCheckShiftRight)
- The index of the argument whose type defines the context
of this immediate check (base type, vector size).
- **Difference from SVE/SME** If this definition generates a polymorphic
NEON builtin, the base type defined by this argument is overwritten by
that of the type code supplied to the overloaded builtin call. This
third argument is omitted in some cases due to this.
Here is an example for
[`vfma_laneq`](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vfma_laneq)
- The immediate is supplied in argument 3
- The immediate is used as an index into the lanes of argument 2
- So we must perform an immediate check on argument 3, based on the type
information of argument 2.
- `ImmCheck<3, ImmCheckLaneIndex, 2>`
During this work, we discovered that the existing immediate
range-checking system was largely untested, which made it difficult to
make reliable progress. Missing tests have been added to verify this
implementation against all intrinsics which take constrained immediate
arguments. All test immediate range checking tests for NEON intrinsics
are moved to a dedicated directory
`clang/test/Sema/aarch64-neon-immediate-ranges/`.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
1fa7f05b |
| 05-Aug-2024 |
Kazu Hirata <kazu@google.com> |
[clang] Construct SmallVector with ArrayRef (NFC) (#101898)
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
09c0337a |
| 24-Jun-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
[Clang][SveEmitter] Split up TargetGuard into SVE and SME component. (#96482)
One reason to want to split this up is to simplify the code added in
#93802, where it checks the SME streaming-mode req
[Clang][SveEmitter] Split up TargetGuard into SVE and SME component. (#96482)
One reason to want to split this up is to simplify the code added in
#93802, where it checks the SME streaming-mode requirements for a
builtin by checking for the absence of SVE. If the target guards are
separate, we can generate a table and make the Sema code to verify the
runtime mode simpler.
Another reason is to avoid an issue with a check in SveEmitter.cpp where
it ensures that the 'VerifyRuntimeMode' is set correctly for functions
that have both SVE and SME target guards:
if (!Def->isFlagSet(VerifyRuntimeMode) &&
Def->getGuard().contains("sve") &&
Def->getGuard().contains("sme"))
llvm_unreachable("Missing VerifyRuntimeMode flag");
However, if we ever add a new feature with "sme" in the name, even
though it is unrelated to FEAT_SME, then this code no longer works.
Note that the arm_sve.td and arm_sme.td files could do with a bit of
restructuring after this but it seems better to follow that up in an NFC
patch.
show more ...
|
#
b39f523a |
| 21-Jun-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
[Clang][AArch64] Expose compatible SVE intrinsics with only +sme (#95787)
This allows code with SVE intrinsics to be compiled with +sme,+nosve,
assuming the encompassing function is in the correct
[Clang][AArch64] Expose compatible SVE intrinsics with only +sme (#95787)
This allows code with SVE intrinsics to be compiled with +sme,+nosve,
assuming the encompassing function is in the correct mode (see #93802)
show more ...
|
#
1644a31a |
| 16-Jun-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
[Clang][AArch64] Generalise streaming mode checks for builtins. (#93802)
PR #76975 added 'IsStreamingOrSVE2p1' to emit a diagnostic when a builtin marked
with 'IsStreamingOrSVE2p1' is used in a non
[Clang][AArch64] Generalise streaming mode checks for builtins. (#93802)
PR #76975 added 'IsStreamingOrSVE2p1' to emit a diagnostic when a builtin marked
with 'IsStreamingOrSVE2p1' is used in a non-streaming function that is not
compiled with `+sve2p1`.
The problem is a bit more complex than only this case. For example, we've marked
lots of builtins with 'IsStreamingCompatible', meaning it can be used in either
streaming, streaming-compatible or non-streaming functions. But the code in
SemaChecking, doesn't check the appropriate target guards. This issue becomes
relevant when SVE builtins are only available in streaming mode, e.g. when
compiling for SME without SVE.
If we were to add the appropriate target guards, we'd have to add many more
combinations, e.g.:
IsStreamingSMEOrSVE
IsStreamingSME2OrSVE2
IsStreamingSMEOrSVE2p1
IsStreamingSME2OrSVE2p1
etc.
To avoid having to add more combinations (and avoid having to add more in the
future for new extensions), we use a single 'IsSVEOrStreamingSVE' flag for all
builtins that are available in streaming mode for the appropriate SME flags, or
in non-streaming mode for the appropriate SVE flags, or both. The code in
SemaChecking will then verify for which mode (or both) the builtin would be
defined, given the target features of the function/compilation unit.
For example:
'svclamp' is enabled under FEAT_SVE2p1 and FEAT_SME2
* When we compile for SVE2p1 and SME (but not SME2), the builtin is undefined
behaviour when called from a streaming function.
* When we compile for SME2 and SVE2 (but not SVE2p1), the builtin is undefined
behaviour when called from a non-streaming function.
* When we compile for _both_ SVE2p1 and SME2, the builtin can be used in either
mode (non-streaming, streaming or streaming-compatible)
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
f81da756 |
| 23-May-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
[Clang][AArch64] Use __clang_arm_builtin_alias for overloaded svreinterpret's (#92427)
The intrinsics are currently defined as:
```
__aio __attribute__((target("sve")))
svint8_t svreinterpr
[Clang][AArch64] Use __clang_arm_builtin_alias for overloaded svreinterpret's (#92427)
The intrinsics are currently defined as:
```
__aio __attribute__((target("sve")))
svint8_t svreinterpret_s8(svuint8_t op) __arm_streaming_compatible {
return __builtin_sve_reinterpret_s8_u8(op);
}
```
which doesn't work when calling it from an __arm_streaming function when
only +sme is available. By defining it in the same way as we've defined
all the other intrinsics, we can leave it to the code in SemaChecking to
verify that either +sve or +sme is available.
This PR also fixes the target guards for the svreinterpret_c and
svreinterpret_b intrinsics, that convert between svcount_t and svbool_t,
as these are available both in SME2 and SVE2p1.
show more ...
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
451cad3a |
| 02-Apr-2024 |
aniplcc <aniplccode@gmail.com> |
[clang] Prefer logical && over & for boolean operations (#87276)
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4 |
|
#
3c90fce4 |
| 23-Feb-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
[Clang][AArch64] Add missing prototypes for streaming-compatible routines (#82649)
|