SveEmitter.cpp - OpenGrok history log for /llvm-project/clang/utils/TableGen/SveEmitter.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init
# f95a8bde	27-Jan-2025	Momchil Velikov <momchil.velikov@arm.com>	[AArch64] Refactor implementation of FP8 types (NFC) (#123604) - The FP8 scalar type (`__mfp8`) was described as a vector type - The FP8 vector types were described/assumed to have integer element type (the element type ought to be `__mfp8`) - Add support for `m` type specifier (denoting `__mfp8`) in `DecodeTypeFromStr` and create builtin function prototypes using that specifier, instead of `int8_t`
# d028eaae	21-Jan-2025	Jonathan Thackray <jonathan.thackray@arm.com>	[AArch64] Update SVE untyped intrinsics to have FP8 variants (#123585) Update the following intrinsics to have FP8 variants: ``` c svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx); svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm); svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm); svint8_t svtbxq[_s8](svint8_t fallback, svint8_t zn, svuint8_t zm); svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm); svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm); svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm); svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm); ```
Revision tags: llvmorg-19.1.7
# 16e45b8f	13-Jan-2025	Momchil Velikov <momchil.velikov@arm.com>	[AArch64] Implement FP8 SVE/SME reinterpret intrinsics (#121063)
# 21b531ea	07-Jan-2025	Nicholas Guy <nicholas.guy@arm.com>	[clang][llvm][aarch64] Add aarch64_sme_in_streaming_mode intrinsic (#120265) Replacing the extant streaming mode function call with an intrinsic allows us to make further optimisations around it. For example, if it's called within a function that has a known streaming mode, we can remove the dead code, and avoid the redundant conditional branch.
# db84ae3a	19-Dec-2024	SpencerAbson <Spencer.Abson@arm.com>	[Clang][AArch64] Add signed index/offset variants of sve2p1 qword stores (#120549) This patch adds signed offset/index variants to the SVE2p1 quadword store intrinsics, in accordance with https://github.com/ARM-software/acle/pull/359.
Revision tags: llvmorg-19.1.6
# c2172431	13-Dec-2024	Momchil Velikov <momchil.velikov@arm.com>	[AArch64] Implements FP8 SVE intrinsics for dot-product (#118125) This patch adds the following intrinsics: * 8-bit floating-point dot product to single-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) || __ARM_FEATURE_SSVE_FP8DOT4 svfloat32_t svdot[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat32_t svdot[_n_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point indexed dot product to single-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT4) || __ARM_FEATURE_SSVE_FP8DOT4 svfloat32_t svdot_lane[_f32_mf8]_fpm(svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_3, fpm_t fpm); * 8-bit floating-point dot product to half-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) || __ARM_FEATURE_SSVE_FP8DOT2 svfloat16_t svdot[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm); svfloat16_t svdot[_n_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, mfloat8_t zm, fpm_t fpm); * 8-bit floating-point indexed dot product to half-precision. // Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_FP8DOT2) || __ARM_FEATURE_SSVE_FP8DOT2 svfloat16_t svdot_lane[_f16_mf8]_fpm(svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm0_7, fpm_t fpm);
Revision tags: llvmorg-19.1.5
# ac7fe426	02-Dec-2024	SpencerAbson <Spencer.Abson@arm.com>	[Clang][AArch64]Refactor typespec handling in SveEmitter.cpp (#117717) - Switch to an enumerated type approach, which is less error-prone as we continue to add new types. This is similar to NeonEmitter. - Fix existing faulty typespec modifiers
# e4ee970c	28-Nov-2024	SpencerAbson <Spencer.Abson@arm.com>	[AArch64] Implement intrinsics for F1CVTL/F2CVTL and BF1CVTL/BF2CVTL (#116959) This patch implements the following intrinsics: 8-bit floating-point convert to deinterleaved half-precision or BFloat16. ``` c // Variant is also available for: _bf16[_mf8]_x2 svfloat16x2_t svcvtl1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming; svfloat16x2_t svcvtl2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming; ``` Defined in https://github.com/ARM-software/acle/pull/323 Co-authored-by: Caroline Concatto caroline.concatto@arm.com Co-authored-by: Marian Lukac marian.lukac@arm.com
Revision tags: llvmorg-19.1.4
# 63aa8cf6	17-Nov-2024	Rahul Joshi <rjoshi@nvidia.com>	[NFC][Clang][TableGen] Fix file header comments (#116491)
# a8a1e903	14-Nov-2024	Kazu Hirata <kazu@google.com>	[TableGen] Remove unused includes (NFC) (#116168) Identified with misc-include-cleaner.
Revision tags: llvmorg-19.1.3
# 508fd966	18-Oct-2024	CarolineConcatto <caroline.concatto@arm.com>	[CLANG][AArch64]Add SVE tuple types for mfloat8_t (#112687) This patch adds scalable tuple types vectors for MFloat_8 type, according to the ACLE[1]. [1] https://github.com/ARM-software/acle.git
# cb43021e	17-Oct-2024	CarolineConcatto <caroline.concatto@arm.com>	[CLANG]Add Scalable vectors for mfloat8_t (#101644) This patch adds these new vector sizes for sve: svmfloat8_t According to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323
Revision tags: llvmorg-19.1.2
# f22e6d59	08-Oct-2024	Sander de Smalen <sander.desmalen@arm.com>	[Clang][AArch64] Fix checkArmStreamingBuiltin for 'sve-b16b16' (#109420) The implementation made the assumption that any feature starting with "sve" meant that this was an SVE feature. This is not the case for "sve-b16b16", as this is a feature that applies to both SVE and SME. This meant that: ``` __attribute__((target("+sme2,+sve2,+sve-b16b16"))) svbfloat16_t foo(svbfloat16_t a, svbfloat16_t b, svbfloat16_t c) __arm_streaming { return svclamp_bf16(a, b, c); } ``` would result in an incorrect diagnostic saying that `svclamp_bf16` could only be used in non-streaming functions.
# a140931b	01-Oct-2024	Rahul Joshi <rjoshi@nvidia.com>	[TableGen] Change `getValueAsListOfDefs` to return const pointer vector (#110713) Change `getValueAsListOfDefs` to return a vector of const Record pointer, and remove `getValueAsListOfConstDefs` that was added as a transition aid. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
# e9dbdb20	01-Oct-2024	Rahul Joshi <rjoshi@nvidia.com>	[Clang][TableGen] Change NeonEmitter to use const Record * (#110597) This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# 0e948bfd	16-Sep-2024	Rahul Joshi <rjoshi@nvidia.com>	[NFC][clang][TableGen] Remove redundant llvm:: namespace qualifier (#108627) Remove llvm:: from .cpp files, and add "using namespace llvm" if needed.
# 711278e2	13-Sep-2024	Rahul Joshi <rjoshi@nvidia.com>	[clang][TableGen] Change SVE Emitter to use const RecordKeeper (#108503) Change SVE Emitter to use const RecordKeeper. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
# 1f70fcef	06-Sep-2024	SpencerAbson <Spencer.Abson@arm.com>	[Clang][AArch64] Add customisable immediate range checking to NEON (#100278) This patch moves NEON immediate argument specification and checking to the system currently shared by both SVE and SME. In its current form, the TableGen definition of a NEON intrinsic cannot control how its immediate arguments are range-checked, this information must be inferred from the name of the intrinsic by NeonEmitter, which also assumes that any NEON instruction will only ever receive a single immediate argument. For SVE/SME instrinsics, this information is more conveniently supplied in the TableGen definition. As a result, for each immediate argument, NEON instructions must define - The index of the immediate argument to be checked - The type of immediate range check to be performed, (e.g., ImmCheckShiftRight) - The index of the argument whose type defines the context of this immediate check (base type, vector size). - Difference from SVE/SME If this definition generates a polymorphic NEON builtin, the base type defined by this argument is overwritten by that of the type code supplied to the overloaded builtin call. This third argument is omitted in some cases due to this. Here is an example for [`vfma_laneq`](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vfma_laneq) - The immediate is supplied in argument 3 - The immediate is used as an index into the lanes of argument 2 - So we must perform an immediate check on argument 3, based on the type information of argument 2. - `ImmCheck<3, ImmCheckLaneIndex, 2>` During this work, we discovered that the existing immediate range-checking system was largely untested, which made it difficult to make reliable progress. Missing tests have been added to verify this implementation against all intrinsics which take constrained immediate arguments. All test immediate range checking tests for NEON intrinsics are moved to a dedicated directory `clang/test/Sema/aarch64-neon-immediate-ranges/`.
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2
# 1fa7f05b	05-Aug-2024	Kazu Hirata <kazu@google.com>	[clang] Construct SmallVector with ArrayRef (NFC) (#101898)
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init
# 09c0337a	24-Jun-2024	Sander de Smalen <sander.desmalen@arm.com>	[Clang][SveEmitter] Split up TargetGuard into SVE and SME component. (#96482) One reason to want to split this up is to simplify the code added in #93802, where it checks the SME streaming-mode requirements for a builtin by checking for the absence of SVE. If the target guards are separate, we can generate a table and make the Sema code to verify the runtime mode simpler. Another reason is to avoid an issue with a check in SveEmitter.cpp where it ensures that the 'VerifyRuntimeMode' is set correctly for functions that have both SVE and SME target guards: if (!Def->isFlagSet(VerifyRuntimeMode) && Def->getGuard().contains("sve") && Def->getGuard().contains("sme")) llvm_unreachable("Missing VerifyRuntimeMode flag"); However, if we ever add a new feature with "sme" in the name, even though it is unrelated to FEAT_SME, then this code no longer works. Note that the arm_sve.td and arm_sme.td files could do with a bit of restructuring after this but it seems better to follow that up in an NFC patch.
# b39f523a	21-Jun-2024	Sander de Smalen <sander.desmalen@arm.com>	[Clang][AArch64] Expose compatible SVE intrinsics with only +sme (#95787) This allows code with SVE intrinsics to be compiled with +sme,+nosve, assuming the encompassing function is in the correct mode (see #93802)
# 1644a31a	16-Jun-2024	Sander de Smalen <sander.desmalen@arm.com>	[Clang][AArch64] Generalise streaming mode checks for builtins. (#93802) PR #76975 added 'IsStreamingOrSVE2p1' to emit a diagnostic when a builtin marked with 'IsStreamingOrSVE2p1' is used in a non-streaming function that is not compiled with `+sve2p1`. The problem is a bit more complex than only this case. For example, we've marked lots of builtins with 'IsStreamingCompatible', meaning it can be used in either streaming, streaming-compatible or non-streaming functions. But the code in SemaChecking, doesn't check the appropriate target guards. This issue becomes relevant when SVE builtins are only available in streaming mode, e.g. when compiling for SME without SVE. If we were to add the appropriate target guards, we'd have to add many more combinations, e.g.: IsStreamingSMEOrSVE IsStreamingSME2OrSVE2 IsStreamingSMEOrSVE2p1 IsStreamingSME2OrSVE2p1 etc. To avoid having to add more combinations (and avoid having to add more in the future for new extensions), we use a single 'IsSVEOrStreamingSVE' flag for all builtins that are available in streaming mode for the appropriate SME flags, or in non-streaming mode for the appropriate SVE flags, or both. The code in SemaChecking will then verify for which mode (or both) the builtin would be defined, given the target features of the function/compilation unit. For example: 'svclamp' is enabled under FEAT_SVE2p1 and FEAT_SME2 * When we compile for SVE2p1 and SME (but not SME2), the builtin is undefined behaviour when called from a streaming function. * When we compile for SME2 and SVE2 (but not SVE2p1), the builtin is undefined behaviour when called from a non-streaming function. * When we compile for _both_ SVE2p1 and SME2, the builtin can be used in either mode (non-streaming, streaming or streaming-compatible)
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7
# f81da756	23-May-2024	Sander de Smalen <sander.desmalen@arm.com>	[Clang][AArch64] Use __clang_arm_builtin_alias for overloaded svreinterpret's (#92427) The intrinsics are currently defined as: ``` __aio __attribute__((target("sve"))) svint8_t svreinterpret_s8(svuint8_t op) __arm_streaming_compatible { return __builtin_sve_reinterpret_s8_u8(op); } ``` which doesn't work when calling it from an __arm_streaming function when only +sme is available. By defining it in the same way as we've defined all the other intrinsics, we can leave it to the code in SemaChecking to verify that either +sve or +sme is available. This PR also fixes the target guards for the svreinterpret_c and svreinterpret_b intrinsics, that convert between svcount_t and svbool_t, as these are available both in SME2 and SVE2p1.
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# 451cad3a	02-Apr-2024	aniplcc <aniplccode@gmail.com>	[clang] Prefer logical && over & for boolean operations (#87276)
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4
# 3c90fce4	23-Feb-2024	Sander de Smalen <sander.desmalen@arm.com>	[Clang][AArch64] Add missing prototypes for streaming-compatible routines (#82649)