Pointer Authentication
======================

.. contents::
   :local:

Introduction
------------

Pointer authentication is a technology which offers strong probabilistic
protection against exploiting a broad class of memory bugs to take control of
program execution. When adopted consistently in a language ABI, it provides
a form of relatively fine-grained control flow integrity (CFI) check that
resists both return-oriented programming (ROP) and jump-oriented programming
(JOP) attacks.

While pointer authentication can be implemented purely in software, direct
hardware support (e.g. as provided by Armv8.3 PAuth) can dramatically improve
performance and code size. Similarly, while pointer authentication
can be implemented on any architecture, taking advantage of the (typically)
excess addressing range of a target with 64-bit pointers minimizes the impact
on memory performance and can allow interoperation with existing code (by
disabling pointer authentication dynamically). This document will generally
attempt to present the pointer authentication feature independent of any
hardware implementation or ABI. Considerations that are
implementation-specific are clearly identified throughout.

Note that there are several different terms in use:

- **Pointer authentication** is a target-independent language technology.

- **PAuth** (sometimes referred to as **PAC**, for Pointer Authentication
  Codes) is an AArch64 architecture extension that provides hardware support
  for pointer authentication. Additional extensions either modify some of the
  PAuth instruction behavior (notably FPAC), or provide new instruction
  variants (PAuth_LR).

- **Armv8.3** is an AArch64 architecture revision that makes PAuth mandatory.

- **arm64e** is a specific ABI (not yet fully stable) for implementing pointer
  authentication using PAuth on certain Apple operating systems.

This document serves four purposes:

- It describes the basic ideas of pointer authentication.

- It documents several language extensions that are useful on targets using
  pointer authentication.

- It will eventually present a theory of operation for the security mitigation,
  describing the basic requirements for correctness, various weaknesses in the
  mechanism, and ways in which programmers can strengthen its protections
  (including recommendations for language implementors).

- It will eventually document the language ABIs currently used for C, C++,
  Objective-C, and Swift on arm64e, although these are not yet stable on any
  target.

Basic Concepts
--------------

The simple address of an object or function is a **raw pointer**. A raw
pointer can be **signed** to produce a **signed pointer**. A signed pointer
can then be **authenticated** in order to verify that it was **validly signed**
and extract the original raw pointer. These terms reflect the most likely
implementation technique: computing and storing a cryptographic signature along
with the pointer.

An **abstract signing key** is a name which refers to a secret key which is
used to sign and authenticate pointers. The concrete key value for a
particular name is consistent throughout a process.

A **discriminator** is an arbitrary value used to **diversify** signed pointers
so that one validly-signed pointer cannot simply be copied over another.
A discriminator is simply opaque data of some implementation-defined size that
is included in the signature as a salt (see `Discriminators`_ for details).

Nearly all aspects of pointer authentication use just these two primary
operations:

- ``sign(raw_pointer, key, discriminator)`` produces a signed pointer given
  a raw pointer, an abstract signing key, and a discriminator.

- ``auth(signed_pointer, key, discriminator)`` produces a raw pointer given
  a signed pointer, an abstract signing key, and a discriminator.

``auth(sign(raw_pointer, key, discriminator), key, discriminator)`` must
succeed and produce ``raw_pointer``. ``auth`` applied to a value that was
ultimately produced in any other way is expected to fail, which halts the
program either:

- immediately, on implementations that enforce ``auth`` success (e.g., when
  using compiler-generated ``auth`` failure checks, or Armv8.3 with the FPAC
  extension), or

- when the resulting pointer value is used, on implementations that don't.

However, regardless of the implementation's handling of ``auth`` failures, it
is permitted for ``auth`` to fail to detect that a signed pointer was not
produced in this way, in which case it may return anything; this is what makes
pointer authentication a probabilistic mitigation rather than a perfect one.

There are two secondary operations which are required only to implement certain
intrinsics in ``<ptrauth.h>``:

- ``strip(signed_pointer, key)`` produces a raw pointer given a signed pointer
  and a key without verifying its validity, unlike ``auth``. This is useful
  for certain kinds of tooling, such as crash backtraces; it should generally
  not be used in the basic language ABI except in very careful ways.

- ``sign_generic(value)`` produces a cryptographic signature for arbitrary
  data, not necessarily a pointer. This is useful for efficiently verifying
  that non-pointer data has not been tampered with.

Whenever any of these operations is called for, the key value must be known
statically. This is because the layout of a signed pointer may vary according
to the signing key. (For example, in Armv8.3, the layout of a signed pointer
depends on whether Top Byte Ignore (TBI) is enabled, which can be set
independently for I and D keys.)

.. admonition:: Note for API designers and language implementors

  These are the *primitive* operations of pointer authentication, provided for
  clarity of description. They are not suitable either as high-level
  interfaces or as primitives in a compiler IR because they expose raw
  pointers. Raw pointers require special attention in the language
  implementation to avoid the accidental creation of exploitable code
  sequences.

The following details are all implementation-defined:

- the nature of a signed pointer
- the size of a discriminator
- the number and nature of the signing keys
- the implementation of the ``sign``, ``auth``, ``strip``, and ``sign_generic``
  operations

While the use of the terms "sign" and "signed pointer" suggests the use of
a cryptographic signature, other implementations may be possible. See
`Alternative implementations`_ for an exploration of implementation options.

.. admonition:: Implementation example: Armv8.3

  Readers may find it helpful to know how these terms map to Armv8.3 PAuth:

  - A signed pointer is a pointer with a signature stored in the
    otherwise-unused high bits. The kernel configures the address width based
    on the system's addressing needs, and enables TBI for I or D keys as
    needed. The bits above the address bits and below the TBI bits (if
    enabled) are unused. The signature width then depends on this addressing
    configuration.

  - A discriminator is a 64-bit integer. Constant discriminators are 16-bit
    integers. Blending a constant discriminator into an address consists of
    replacing the top 16 bits of the pointer containing the address with the
    constant. Pointers used for blending purposes should only have address
    bits, since higher bits will be at least partially overwritten with the
    constant discriminator.

  - There are five 128-bit signing-key registers, each of which can only be
    directly read or set by privileged code. Of these, four are used for
    signing pointers, and the fifth is used only for ``sign_generic``. The key
    data is simply a pepper added to the hash, not an encryption key, and so
    can be initialized using random data.

  - ``sign`` computes a cryptographic hash of the pointer, discriminator, and
    signing key, and stores it in the high bits as the signature. ``auth``
    removes the signature, computes the same hash, and compares the result with
    the stored signature. ``strip`` removes the signature without
    authenticating it. While ``aut*`` instructions do not themselves trap on
    failure in Armv8.3 PAuth, they do with the later optional FPAC extension.
    An implementation can also choose to emulate this trapping behavior by
    emitting additional instructions around ``aut*``.

  - ``sign_generic`` corresponds to the ``pacga`` instruction, which takes two
    64-bit values and produces a 64-bit cryptographic hash. Implementations of
    this instruction are not required to produce meaningful data in all bits of
    the result.

Discriminators
~~~~~~~~~~~~~~

A discriminator is arbitrary extra data which alters the signature calculated
for a pointer. When two pointers are signed differently --- either with
different keys or with different discriminators --- an attacker cannot simply
replace one pointer with the other.

To use standard cryptographic terminology, a discriminator acts as a
`salt <https://en.wikipedia.org/wiki/Salt_(cryptography)>`_ in the signing of a
pointer, and the key data acts as a
`pepper <https://en.wikipedia.org/wiki/Pepper_(cryptography)>`_. That is,
both the discriminator and key data are ultimately just added as inputs to the
signing algorithm along with the pointer, but they serve significantly
different roles.
The key data is a common secret added to every signature,
whereas the discriminator is a value that can be derived from
the context in which a specific pointer is signed. However, unlike a password
salt, it's important that discriminators be *independently* derived from the
circumstances of the signing; they should never simply be stored alongside
a pointer. Discriminators are then re-derived in authentication operations.

The intrinsic interface in ``<ptrauth.h>`` allows an arbitrary discriminator
value to be provided, but can only be used when running normal code. The
discriminators used by language ABIs must be restricted to make it feasible for
the loader to sign pointers stored in global memory without needing excessive
amounts of metadata. Under these restrictions, a discriminator may consist of
either or both of the following:

- The address at which the pointer is stored in memory. A pointer signed with
  a discriminator which incorporates its storage address is said to have
  **address diversity**. In general, using address diversity means that
  a pointer cannot be reliably copied by an attacker to or from a different
  memory location. However, an attacker may still be able to attack a larger
  call sequence if they can alter the address through which the pointer is
  accessed. Furthermore, some situations cannot use address diversity because
  of language or other restrictions.

- A constant integer, called a **constant discriminator**. A pointer signed
  with a non-zero constant discriminator is said to have **constant
  diversity**. If the discriminator is specific to a single declaration, it is
  said to have **declaration diversity**; if the discriminator is specific to
  a type of value, it is said to have **type diversity**. For example, C++
  v-tables on arm64e sign their component functions using a hash of their
  method names and signatures, which provides declaration diversity; similarly,
  C++ member function pointers sign their invocation functions using a hash of
  the member pointer type, which provides type diversity.

The implementation may need to restrict constant discriminators to be
significantly smaller than the full size of a discriminator. For example, on
arm64e, constant discriminators are only 16-bit values. This is believed to
not significantly weaken the mitigation, since collisions remain uncommon.

The algorithm for blending a constant discriminator with a storage address is
implementation-defined.

.. _Signing schemas:

Signing Schemas
~~~~~~~~~~~~~~~

Correct use of pointer authentication requires the signing code and the
authenticating code to agree about the **signing schema** for the pointer:

- the abstract signing key with which the pointer should be signed and
- an algorithm for computing the discriminator.

As described in the section above on `Discriminators`_, in most situations, the
discriminator is produced by taking a constant discriminator and optionally
blending it with the storage address of the pointer. In these situations, the
signing schema breaks down even more simply:

- the abstract signing key,
- a constant discriminator, and
- whether to use address diversity.

It is important that the signing schema be independently derived at all signing
and authentication sites. Preferably, the schema should be hard-coded
everywhere it is needed, but at the very least, it must not be derived by
inspecting information stored along with the pointer.
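
As a purely illustrative sketch of how a schema ties ``sign`` and ``auth``
together, the following C code models the two primitives in software.
Everything in it is an assumption made for illustration: the names
``toy_sign``, ``toy_auth``, and ``toy_blend``, the 48-bit address width, the
16-bit signature, the fixed key value, and the mixing function are all
hypothetical, and none of them is the Armv8.3 algorithm or the ``<ptrauth.h>``
interface. Unlike a real implementation, the toy ``auth`` reports failure
through its return value instead of halting the program.

.. code-block:: c

  #include <assert.h>
  #include <stdbool.h>
  #include <stdint.h>

  /* Toy layout (assumed): 48 address bits, 16-bit signature above them. */
  #define TOY_ADDR_BITS 48
  #define TOY_ADDR_MASK ((UINT64_C(1) << TOY_ADDR_BITS) - 1)

  /* Stand-in for the secret key; a real key is not visible to user code. */
  static const uint64_t toy_key = UINT64_C(0x9E3779B97F4A7C15);

  /* Simple non-cryptographic 64-bit mixer, used only for illustration. */
  static uint64_t toy_mix(uint64_t x) {
    x ^= x >> 30; x *= UINT64_C(0xBF58476D1CE4E5B9);
    x ^= x >> 27; x *= UINT64_C(0x94D049BB133111EB);
    x ^= x >> 31;
    return x;
  }

  /* Blend by replacing the top 16 bits of the storage address with the
     constant discriminator. */
  static uint64_t toy_blend(uint64_t storage_addr, uint16_t constant) {
    return (storage_addr & TOY_ADDR_MASK) |
           ((uint64_t)constant << TOY_ADDR_BITS);
  }

  /* 16-bit toy signature over the raw pointer, key, and discriminator. */
  static uint64_t toy_signature(uint64_t raw, uint64_t disc) {
    return toy_mix(toy_mix(raw ^ toy_key) ^ disc) >> TOY_ADDR_BITS;
  }

  /* sign: store the signature in the otherwise-unused high bits. */
  static uint64_t toy_sign(uint64_t raw, uint64_t disc) {
    raw &= TOY_ADDR_MASK;
    return raw | (toy_signature(raw, disc) << TOY_ADDR_BITS);
  }

  /* auth: recompute the signature and compare it with the stored one. */
  static bool toy_auth(uint64_t signed_ptr, uint64_t disc, uint64_t *raw_out) {
    uint64_t raw = signed_ptr & TOY_ADDR_MASK;
    if ((signed_ptr >> TOY_ADDR_BITS) != toy_signature(raw, disc))
      return false;
    *raw_out = raw;
    return true;
  }

  /* Both sites derive the same schema (constant 0xBEEF plus address
     diversity) independently; nothing is stored next to the pointer. */
  static void toy_schema_demo(uint64_t raw, uint64_t storage_addr) {
    uint64_t signed_ptr = toy_sign(raw, toy_blend(storage_addr, 0xBEEF));
    uint64_t out = 0;
    assert(toy_auth(signed_ptr, toy_blend(storage_addr, 0xBEEF), &out));
    assert(out == (raw & TOY_ADDR_MASK));
  }

In this model, authenticating with a different constant discriminator or
a different storage address recomputes a different signature and is expected to
fail, except for the roughly 1-in-65536 chance of a collision that a 16-bit
signature leaves open.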

Language Features
-----------------

There is currently one main pointer authentication language feature:

- The language provides the ``<ptrauth.h>`` intrinsic interface for manually
  signing and authenticating pointers in code. These can be used in
  circumstances where very specific behavior is required.


Language Extensions
~~~~~~~~~~~~~~~~~~~

Feature Testing
^^^^^^^^^^^^^^^

Whether the current target uses pointer authentication can be tested for with
a number of different tests.

- ``__has_feature(ptrauth_intrinsics)`` is true if ``<ptrauth.h>`` provides its
  normal interface. This may be true even on targets where pointer
  authentication is not enabled by default.

``<ptrauth.h>``
~~~~~~~~~~~~~~~

This header defines the following types and operations:

``ptrauth_key``
^^^^^^^^^^^^^^^

This ``enum`` is the type of abstract signing keys. In addition to defining
the set of implementation-specific signing keys (for example, Armv8.3 defines
``ptrauth_key_asia``), it also defines some portable aliases for those keys.
For example, ``ptrauth_key_function_pointer`` is the key generally used for
C function pointers, which will generally be suitable for other
function-signing schemas.

In all the operation descriptions below, key values must be constant values
corresponding to one of the implementation-specific abstract signing keys from
this ``enum``.

``ptrauth_extra_data_t``
^^^^^^^^^^^^^^^^^^^^^^^^

This is a ``typedef`` of a standard integer type of the correct size to hold
a discriminator value.

In the signing and authentication operation descriptions below, discriminator
values must have either pointer type or integer type. If the discriminator is
an integer, it will be coerced to ``ptrauth_extra_data_t``.

``ptrauth_blend_discriminator``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_blend_discriminator(pointer, integer)

Produce a discriminator value which blends information from the given pointer
and the given integer.

Implementations may ignore some bits from each value, which is to say, the
blending algorithm may be chosen for speed and convenience over theoretical
strength as a hash-combining algorithm. For example, arm64e simply overwrites
the high 16 bits of the pointer with the low 16 bits of the integer, which can
be done in a single instruction with an immediate integer.

``pointer`` must have pointer type, and ``integer`` must have integer type. The
result has type ``ptrauth_extra_data_t``.

``ptrauth_string_discriminator``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_string_discriminator(string)

Compute a constant discriminator from the given string.

``string`` must be a string literal of ``char`` character type. The result has
type ``ptrauth_extra_data_t``.

The result value is never zero and always within range for both the
``__ptrauth`` qualifier and ``ptrauth_blend_discriminator``.

This can be used in constant expressions.

``ptrauth_strip``
^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_strip(signedPointer, key)

Given that ``signedPointer`` matches the layout for signed pointers signed with
the given key, extract the raw pointer from it. This operation does not trap
and cannot fail, even if the pointer is not validly signed.

``ptrauth_sign_constant``
^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_sign_constant(pointer, key, discriminator)

Return a signed pointer for a constant address in a manner which guarantees
a non-attackable sequence.

``pointer`` must be a constant expression of pointer type which evaluates to
a non-null pointer.
``key`` must be a constant expression of type ``ptrauth_key``.
``discriminator`` must be a constant expression of pointer or integer type;
if an integer, it will be coerced to ``ptrauth_extra_data_t``.
The result will have the same type as ``pointer``.

This can be used in constant expressions.

``ptrauth_sign_unauthenticated``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_sign_unauthenticated(pointer, key, discriminator)

Produce a signed pointer for the given raw pointer without applying any
authentication or extra treatment. This operation is not required to have the
same behavior on a null pointer that the language implementation would.

This is a treacherous operation that can easily result in signing oracles.
Programs should use it seldom and carefully.

``ptrauth_auth_and_resign``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_auth_and_resign(pointer, oldKey, oldDiscriminator, newKey, newDiscriminator)

Authenticate that ``pointer`` is signed with ``oldKey`` and
``oldDiscriminator`` and then resign the raw-pointer result of that
authentication with ``newKey`` and ``newDiscriminator``.

``pointer`` must have pointer type. The result will have the same type as
``pointer``. This operation is not required to have the same behavior on
a null pointer that the language implementation would.

The code sequence produced for this operation must not be directly attackable.
However, if the discriminator values are not constant integers, their
computations may still be attackable. In the future, Clang should be enhanced
to guarantee non-attackability if these expressions are safely-derived.

``ptrauth_auth_data``
^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_auth_data(pointer, key, discriminator)

Authenticate that ``pointer`` is signed with ``key`` and ``discriminator`` and
remove the signature.

``pointer`` must have object pointer type.
The result will have the same type
as ``pointer``. This operation is not required to have the same behavior on
a null pointer that the language implementation would.

In the future when Clang makes safe derivation guarantees, the result of
this operation should be considered safely-derived.

``ptrauth_sign_generic_data``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c

  ptrauth_sign_generic_data(value1, value2)

Computes a signature for the given pair of values, incorporating a secret
signing key.

This operation can be used to verify that arbitrary data has not been tampered
with by computing a signature for the data, storing that signature, and then
repeating this process and verifying that it yields the same result. This can
be reasonably done in any number of ways; for example, a library could compute
an ordinary checksum of the data and just sign the result in order to get the
tamper-resistance advantages of the secret signing key (since otherwise an
attacker could reliably overwrite both the data and the checksum).

``value1`` and ``value2`` must be either pointers or integers. If the integers
are larger than ``uintptr_t`` then data not representable in ``uintptr_t`` may
be discarded.

The result will have type ``ptrauth_generic_signature_t``, which is an integer
type. Implementations are not required to make all bits of the result equally
significant; in particular, some implementations are known to not leave
meaningful data in the low bits.


Alternative Implementations
---------------------------

Signature Storage
~~~~~~~~~~~~~~~~~

It is not critical for the security of pointer authentication that the
signature be stored "together" with the pointer, as it is in Armv8.3.
An implementation could just as well store the signature in a separate word,
so that the ``sizeof`` a signed pointer would be larger than the ``sizeof``
a raw pointer.

Storing the signature in the high bits, as Armv8.3 does, has several trade-offs:

- Disadvantage: there are substantially fewer bits available for the signature,
  weakening the mitigation by making it much easier for an attacker to simply
  guess the correct signature.

- Disadvantage: future growth of the address space will necessarily further
  weaken the mitigation.

- Advantage: memory layouts don't change, so it's possible for
  pointer-authentication-enabled code (for example, in a system library) to
  efficiently interoperate with existing code, as long as pointer
  authentication can be disabled dynamically.

- Advantage: the size of a signed pointer doesn't grow; a larger signed pointer
  might significantly increase memory requirements, code size, and register
  pressure.

- Advantage: the size of a signed pointer is the same as a raw pointer, so
  generic APIs which work in types like ``void *`` (such as ``dlsym``) can
  still return signed pointers. This means that clients of these APIs will not
  require insecure code in order to correctly receive a function pointer.

Hashing vs. Encrypting Pointers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Armv8.3 implements ``sign`` by computing a cryptographic hash and storing that
in the spare bits of the pointer. This means that there are relatively few
possible values for the valid signed pointer, since the bits corresponding to
the raw pointer are known. Together with an ``auth`` oracle, this can make it
computationally feasible to discover the correct signature with brute force.
(The implementation should of course endeavor not to introduce ``auth``
oracles, but this can be difficult, and attackers can be devious.)
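
To make the size of that search space concrete, here is a minimal sketch in
C of a brute-force search against an ``auth`` oracle. It assumes a hypothetical
layout with 48 address bits and a 16-bit signature; ``toy_signature``,
``toy_auth_oracle``, ``toy_brute_force``, the fixed key value, and the mixing
function are invented for this illustration and do not reflect the Armv8.3
hash or any real interface. The oracle is modeled as something that only
reports whether a candidate would authenticate.

.. code-block:: c

  #include <assert.h>
  #include <stdint.h>

  /* Toy layout (assumed): 48 address bits, 16-bit signature above them. */
  #define TOY_ADDR_BITS 48
  #define TOY_ADDR_MASK ((UINT64_C(1) << TOY_ADDR_BITS) - 1)
  #define TOY_SIG_COUNT (UINT32_C(1) << 16)

  /* Stand-in secret; the search below never reads it directly. */
  static const uint64_t toy_key = UINT64_C(0x0123456789ABCDEF);

  /* Simple non-cryptographic mixer producing a 16-bit toy signature. */
  static uint64_t toy_signature(uint64_t raw) {
    uint64_t x = raw ^ toy_key;
    x ^= x >> 33;
    x *= UINT64_C(0xFF51AFD7ED558CCD);
    x ^= x >> 33;
    return x >> TOY_ADDR_BITS;
  }

  /* Oracle: reports only whether the candidate would authenticate. */
  static int toy_auth_oracle(uint64_t candidate) {
    uint64_t raw = candidate & TOY_ADDR_MASK;
    return (candidate >> TOY_ADDR_BITS) == toy_signature(raw);
  }

  /* Try every signature for a known raw pointer. Returns the number of
     oracle queries used and stores the accepted value in *forged_out. */
  static uint32_t toy_brute_force(uint64_t raw, uint64_t *forged_out) {
    raw &= TOY_ADDR_MASK;
    for (uint32_t guess = 0; guess < TOY_SIG_COUNT; ++guess) {
      uint64_t candidate = raw | ((uint64_t)guess << TOY_ADDR_BITS);
      if (toy_auth_oracle(candidate)) {
        *forged_out = candidate;
        return guess + 1;
      }
    }
    return 0; /* not reached: one of the 2^16 guesses always matches */
  }

  /* The search needs at most 2^16 queries in this model. */
  static void toy_brute_force_demo(void) {
    uint64_t forged = 0;
    uint32_t queries = toy_brute_force(UINT64_C(0x00007FFF12345678), &forged);
    assert(queries >= 1 && queries <= TOY_SIG_COUNT);
    assert(toy_auth_oracle(forged));
  }

Because the raw pointer bits are known, the only unknown in this model is the
16-bit signature, so the loop is bounded by 65536 oracle queries. A wider
signature enlarges that bound exponentially, which is the trade-off discussed
under `Signature Storage`_.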

If the implementation can instead *encrypt* the pointer during ``sign`` and
*decrypt* it during ``auth``, this brute-force attack becomes far less
feasible, even with an ``auth`` oracle. However, there are several problems
with this idea:

- It's unclear whether this kind of encryption is even possible without
  increasing the storage size of a signed pointer. If the storage size can be
  increased, brute-force attacks can be equally well mitigated by simply storing
  a larger signature.

- It would likely be impossible to implement a ``strip`` operation, which might
  make debuggers and other out-of-process tools far more difficult to write, as
  well as generally making primitive debugging more challenging.

- Implementations can benefit from being able to extract the raw pointer
  immediately from a signed pointer. An Armv8.3 processor executing an
  ``auth``-and-load instruction can perform the load and ``auth`` in parallel;
  a processor which instead encrypted the pointer would be forced to perform
  these operations serially.