# Pointer Authentication

## Introduction

Pointer Authentication is a mechanism by which certain pointers are signed.
When a pointer gets signed, a cryptographic hash of its value and other values
(pepper and salt) is stored in unused bits of that pointer.

Before the pointer is used, it needs to be authenticated, i.e., have its
signature checked.  This prevents pointer values of unknown origin from being
used to replace the signed pointer value.

For more details, see the clang documentation page for
[Pointer Authentication](https://clang.llvm.org/docs/PointerAuthentication.html).

At the IR level, it is represented using:

* a [set of intrinsics](#intrinsics) (to sign/authenticate pointers)
* a [signed pointer constant](#constant) (to sign globals)
* a [call operand bundle](#operand-bundle) (to authenticate called pointers)
* a [set of function attributes](#function-attributes) (to describe what
  pointers are signed and how, to control implicit codegen in the backend, as
  well as preserve invariants in the mid-level optimizer)

The current implementation leverages the
[Armv8.3-A PAuth/Pointer Authentication Code](#armv8-3-a-pauth-pointer-authentication-code)
instructions in the [AArch64 backend](#aarch64-support).
This support is used to implement the Darwin arm64e ABI, as well as the
[PAuth ABI Extension to ELF](https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst).


## LLVM IR Representation

### Intrinsics

These intrinsics are provided by LLVM to expose pointer authentication
operations.


#### '`llvm.ptrauth.sign`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.sign(i64 <value>, i32 <key>, i64 <discriminator>)
```

##### Overview:

The '`llvm.ptrauth.sign`' intrinsic signs a raw pointer.


##### Arguments:

The `value` argument is the raw pointer value to be signed.
The `key` argument is the identifier of the key to be used to generate the
signed value.
The `discriminator` argument is the additional diversity data to be used as a
discriminator (an integer, an address, or a blend of the two).

##### Semantics:

The '`llvm.ptrauth.sign`' intrinsic implements the `sign` operation.
It returns a signed value.

If `value` is already a signed value, the behavior is undefined.

If `value` is not a pointer value for which `key` is appropriate, the
behavior is undefined.


#### '`llvm.ptrauth.auth`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.auth(i64 <value>, i32 <key>, i64 <discriminator>)
```

##### Overview:

The '`llvm.ptrauth.auth`' intrinsic authenticates a signed pointer.

##### Arguments:

The `value` argument is the signed pointer value to be authenticated.
The `key` argument is the identifier of the key that was used to generate
the signed value.
The `discriminator` argument is the additional diversity data to be used as a
discriminator.

##### Semantics:

The '`llvm.ptrauth.auth`' intrinsic implements the `auth` operation.
It returns a raw pointer value.
If `value` does not have a correct signature for `key` and `discriminator`,
the intrinsic traps in a target-specific way.


#### '`llvm.ptrauth.strip`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.strip(i64 <value>, i32 <key>)
```

##### Overview:

The '`llvm.ptrauth.strip`' intrinsic strips the embedded signature out of a
possibly-signed pointer.


##### Arguments:

The `value` argument is the possibly-signed pointer value to be stripped.
The `key` argument is the identifier of the key that was used to generate
the signed value.

##### Semantics:

The '`llvm.ptrauth.strip`' intrinsic implements the `strip` operation.
It returns a raw pointer value.  It does **not** check that the
signature is valid.

`key` should identify a key that is appropriate for `value`, as defined
by the target-specific [keys](#keys).

If `value` is a raw pointer value, it is returned as-is (provided the `key`
is appropriate for the pointer).

If `value` is not a pointer value for which `key` is appropriate, the
behavior is target-specific.

If `value` is a signed pointer value, but `key` does not identify the
same key that was used to generate `value`, the behavior is
target-specific.


#### '`llvm.ptrauth.resign`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.resign(i64 <value>,
                                 i32 <old key>, i64 <old discriminator>,
                                 i32 <new key>, i64 <new discriminator>)
```

##### Overview:

The '`llvm.ptrauth.resign`' intrinsic re-signs a signed pointer using
a different key and diversity data.

##### Arguments:

The `value` argument is the signed pointer value to be authenticated.
The `old key` argument is the identifier of the key that was used to generate
the signed value.
The `old discriminator` argument is the additional diversity data to be used
as a discriminator in the auth operation.
The `new key` argument is the identifier of the key to use to generate the
resigned value.
The `new discriminator` argument is the additional diversity data to be used
as a discriminator in the sign operation.

##### Semantics:

The '`llvm.ptrauth.resign`' intrinsic performs a combined `auth` and `sign`
operation, without exposing the intermediate raw pointer.
It returns a signed pointer value.
If `value` does not have a correct signature for `old key` and
`old discriminator`, the intrinsic traps in a target-specific way.
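
As a minimal sketch of the difference between the two forms, the hypothetical
functions below re-sign a pointer first with separate `auth` and `sign` calls,
then with a single `resign`.  The function names, the key numbers (2 and 3,
which are data keys on AArch64 per the [keys](#keys) section), and the
discriminators (1234 and 5678) are placeholders chosen for illustration.

```llvm
declare i64 @llvm.ptrauth.auth(i64, i32, i64)
declare i64 @llvm.ptrauth.sign(i64, i32, i64)
declare i64 @llvm.ptrauth.resign(i64, i32, i64, i32, i64)

; Two-step form: %raw is an ordinary IR value holding the raw pointer, so
; nothing prevents it from being spilled to memory between the two calls.
define i64 @resign_two_steps(i64 %signed) {
  %raw = call i64 @llvm.ptrauth.auth(i64 %signed, i32 2, i64 1234)
  %new = call i64 @llvm.ptrauth.sign(i64 %raw, i32 3, i64 5678)
  ret i64 %new
}

; Combined form: no IR value holds the intermediate raw pointer.
define i64 @resign_combined(i64 %signed) {
  %new = call i64 @llvm.ptrauth.resign(i64 %signed,
                                       i32 2, i64 1234,
                                       i32 3, i64 5678)
  ret i64 %new
}
```

Both functions trap if `%signed` does not carry a valid signature for key 2
and discriminator 1234; only the second avoids materializing the raw pointer
as a separate value.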

#### '`llvm.ptrauth.sign_generic`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.sign_generic(i64 <value>, i64 <discriminator>)
```

##### Overview:

The '`llvm.ptrauth.sign_generic`' intrinsic computes a generic signature of
arbitrary data.

##### Arguments:

The `value` argument is the arbitrary data value to be signed.
The `discriminator` argument is the additional diversity data to be used as a
discriminator.

##### Semantics:

The '`llvm.ptrauth.sign_generic`' intrinsic computes the signature of a given
combination of value and additional diversity data.

It returns a full signature value (as opposed to a signed pointer value, with
an embedded partial signature).

As opposed to [`llvm.ptrauth.sign`](#llvm-ptrauth-sign), it does not interpret
`value` as a pointer value.  Instead, it is an arbitrary data value.


#### '`llvm.ptrauth.blend`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.blend(i64 <address discriminator>, i64 <integer discriminator>)
```

##### Overview:

The '`llvm.ptrauth.blend`' intrinsic blends a pointer address discriminator
with a small integer discriminator to produce a new "blended" discriminator.

##### Arguments:

The `address discriminator` argument is a pointer value.
The `integer discriminator` argument is a small integer, as specified by the
target.

##### Semantics:

The '`llvm.ptrauth.blend`' intrinsic combines a small integer discriminator
with a pointer address discriminator, in a way that is specified by the target
implementation.


### Constant

[Intrinsics](#intrinsics) can be used to produce signed pointers dynamically,
in code, but not for signed pointers referenced by constants, in, e.g., global
initializers.

The latter are represented using a
[``ptrauth`` constant](https://llvm.org/docs/LangRef.html#ptrauth-constant),
which describes an authenticated relocation producing a signed pointer.

```llvm
ptrauth (ptr CST, i32 KEY, i64 DISC, ptr ADDRDISC)
```

is equivalent to:

```llvm
  %disc = call i64 @llvm.ptrauth.blend(i64 ptrtoint(ptr ADDRDISC to i64), i64 DISC)
  %signedval = call i64 @llvm.ptrauth.sign(i64 ptrtoint(ptr CST to i64), i32 KEY, i64 %disc)
```

### Operand Bundle

Function pointers used as indirect call targets can be signed when materialized,
and authenticated before calls.  This can be accomplished with the
[`llvm.ptrauth.auth`](#llvm-ptrauth-auth) intrinsic, feeding its result to
an indirect call.

However, that exposes the intermediate raw pointer (the result of the
authentication), e.g., if it gets spilled to the stack.  An attacker can then
overwrite the pointer in memory, negating the security benefit provided by
pointer authentication.
To prevent that, the `ptrauth` operand bundle may be used: it guarantees that
the intermediate call target is kept in a register and never stored to memory.
This hardening benefit is similar to that provided by
[`llvm.ptrauth.resign`](#llvm-ptrauth-resign).

Concretely:

```llvm
define void @f(ptr %fp) {
  call void %fp() [ "ptrauth"(i32 <key>, i64 <data>) ]
  ret void
}
```

is functionally equivalent to:

```llvm
define void @f(ptr %fp) {
  %fp_i = ptrtoint ptr %fp to i64
  %fp_auth = call i64 @llvm.ptrauth.auth(i64 %fp_i, i32 <key>, i64 <data>)
  %fp_auth_p = inttoptr i64 %fp_auth to ptr
  call void %fp_auth_p()
  ret void
}
```

but with the added guarantee that `%fp_i`, `%fp_auth`, and `%fp_auth_p`
are not stored to (and reloaded from) memory.
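
For a more concrete picture, the sketch below calls through a function pointer
loaded from memory, using a discriminator that blends the storage address with
a small integer.  The function name `@call_callback`, the key number 0, and
the integer discriminator 1234 are assumptions made for illustration; they do
not correspond to any particular ABI.

```llvm
declare i64 @llvm.ptrauth.blend(i64, i64)

; Hypothetical callback slot: %slot is assumed to hold a function pointer that
; was signed with key 0 and a discriminator blending the slot's own address
; with the integer 1234.
define void @call_callback(ptr %slot) {
  %fp = load ptr, ptr %slot
  %slot.int = ptrtoint ptr %slot to i64
  %disc = call i64 @llvm.ptrauth.blend(i64 %slot.int, i64 1234)
  ; The bundle asks for authentication as part of the call itself, so the
  ; authenticated raw pointer never appears as a separate IR value.
  call void %fp() [ "ptrauth"(i32 0, i64 %disc) ]
  ret void
}
```

The key operand of the bundle is a constant, while the discriminator may be
computed at run time, as it is here.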


### Function Attributes

Some function attributes are used to describe other pointer authentication
operations that are not otherwise explicitly expressed in IR.

#### ``ptrauth-indirect-gotos``

``ptrauth-indirect-gotos`` specifies that indirect gotos in this function
should authenticate their target.  At the IR level, no other change is needed.
When lowering [``blockaddress`` constants](https://llvm.org/docs/LangRef.html#blockaddress)
and [``indirectbr`` instructions](https://llvm.org/docs/LangRef.html#i-indirectbr),
this tells the backend to sign and authenticate the pointers, respectively.

The specific scheme isn't ABI-visible.  Currently, the AArch64 backend
signs blockaddresses using the `ASIA` key, with an integer discriminator
derived from the parent function's name, using the SipHash stable discriminator:
```
  ptrauth_string_discriminator("<function_name> blockaddress")
```


## AArch64 Support

AArch64 is currently the only architecture with full support for the pointer
authentication primitives, based on Armv8.3-A instructions.

### Armv8.3-A PAuth Pointer Authentication Code

The Armv8.3-A architecture extension defines the PAuth feature, which provides
support for instructions that manipulate Pointer Authentication Codes (PAC).

#### Keys

The PAuth feature supports 5 keys.

Of those, 4 can be specified as the `key` operand of IR constructs:
* `ASIA`/`ASIB` are instruction keys (encoded as 0 and 1, respectively).
* `ASDA`/`ASDB` are data keys (encoded as 2 and 3, respectively).

`ASGA` is a special key that cannot be explicitly specified, and is only ever
used implicitly, to implement the
[`llvm.ptrauth.sign_generic`](#llvm-ptrauth-sign-generic) intrinsic.
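
To show how these encodings appear in IR, here is a small sketch.  The
function names and the zero discriminators are placeholders for illustration
only, not part of any ABI.

```llvm
declare i64 @llvm.ptrauth.sign(i64, i32, i64)
declare i64 @llvm.ptrauth.sign_generic(i64, i64)

; Key 0 selects ASIA, an instruction key, e.g. for a code pointer.
define i64 @sign_code_pointer(i64 %raw) {
  %signed = call i64 @llvm.ptrauth.sign(i64 %raw, i32 0, i64 0)
  ret i64 %signed
}

; Key 2 selects ASDA, a data key, e.g. for a pointer to data.
define i64 @sign_data_pointer(i64 %raw) {
  %signed = call i64 @llvm.ptrauth.sign(i64 %raw, i32 2, i64 0)
  ret i64 %signed
}

; sign_generic takes no key operand: ASGA is used implicitly.
define i64 @sign_arbitrary_data(i64 %data, i64 %disc) {
  %sig = call i64 @llvm.ptrauth.sign_generic(i64 %data, i64 %disc)
  ret i64 %sig
}
```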

#### Instructions

The IR [Intrinsics](#intrinsics) described above map onto these
instructions as follows:
* [`llvm.ptrauth.sign`](#llvm-ptrauth-sign): `PAC{I,D}{A,B}{Z,SP,}`
* [`llvm.ptrauth.auth`](#llvm-ptrauth-auth): `AUT{I,D}{A,B}{Z,SP,}`
* [`llvm.ptrauth.strip`](#llvm-ptrauth-strip): `XPAC{I,D}`
* [`llvm.ptrauth.blend`](#llvm-ptrauth-blend): The semantics of the blend
  operation are specified by the ABI.  In both the ELF PAuth ABI Extension and
  arm64e, it's a `MOVK` into the high 16 bits.  Consequently, this limits
  the width of the integer discriminator used in blends to 16 bits.
* [`llvm.ptrauth.sign_generic`](#llvm-ptrauth-sign-generic): `PACGA`
* [`llvm.ptrauth.resign`](#llvm-ptrauth-resign): `AUT*+PAC*`.  These are
  represented as a single pseudo-instruction in the backend to guarantee that
  the intermediate raw pointer value is not spilled and attackable.

#### Assembly Representation

At the assembly level, authenticated relocations are represented
using the `@AUTH` modifier:

```asm
    .quad _target@AUTH(<key>,<discriminator>[,addr])
```

where:
* `key` is the Armv8.3-A key identifier (`ia`, `ib`, `da`, `db`)
* `discriminator` is the 16-bit unsigned discriminator value
* `addr` signifies that the authenticated pointer is address-discriminated
  (that is, that the address at which the relocation is applied, i.e., where
  the signed pointer is stored, is to be blended into the `discriminator`
  before it is used in the sign operation).

For example:
```asm
  _authenticated_reference_to_sym:
    .quad _sym@AUTH(db,0)
  _authenticated_reference_to_sym_addr_disc:
    .quad _sym@AUTH(ia,12,addr)
```

#### MachO Object File Representation

At the object file level, authenticated relocations are represented using the
``ARM64_RELOC_AUTHENTICATED_POINTER`` relocation kind (with value ``11``).

The pointer authentication information is encoded into the addend as follows:

```
| 63 | 62 | 61-51 | 50-49 | 48   | 47 - 32       | 31 - 0 |
| -- | -- | ----- | ----- | ---- | ------------- | ------ |
| 1  | 0  | 0     | key   | addr | discriminator | addend |
```

#### ELF Object File Representation

At the object file level, authenticated relocations are represented
using the `R_AARCH64_AUTH_ABS64` relocation kind (with value `0xE100`).

The signing schema is encoded in the place of relocation to be applied
as follows:

```
| 63                | 62       | 61:60 | 59:48    | 47:32         | 31:0                |
| ----------------- | -------- | ----- | -------- | ------------- | ------------------- |
| address diversity | reserved | key   | reserved | discriminator | reserved for addend |
```
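
To tie these relocation encodings back to IR, the sketch below defines two
globals initialized with `ptrauth` constants, mirroring the two `@AUTH`
references from the assembly example above.  The global names are
hypothetical, and the exact assembly and relocations emitted depend on the
target and object format, so treat this as an illustration of intent rather
than of guaranteed output.

```llvm
@sym = external global i32

; Key 3 (ASDB), integer discriminator 0, no address diversity: the address
; discriminator operand is null.
@ref_to_sym = global ptr ptrauth (ptr @sym, i32 3, i64 0, ptr null)

; Key 0 (ASIA), integer discriminator 12, address-diversified: the address
; discriminator operand is the address of the global holding the pointer.
@ref_to_sym_addr_disc = global ptr ptrauth (ptr @sym, i32 0, i64 12,
                                            ptr @ref_to_sym_addr_disc)
```

In both object formats described above, the key, the discriminator, and the
address-diversity flag of each constant are the values that the relocation
encoding carries.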