# Pointer Authentication

## Introduction

Pointer Authentication is a mechanism by which certain pointers are signed.
When a pointer gets signed, a cryptographic hash of its value and other values
(pepper and salt) is stored in unused bits of that pointer.

Before the pointer is used, it needs to be authenticated, i.e., have its
signature checked.  This prevents pointer values of unknown origin from being
used to replace the signed pointer value.

For more details, see the clang documentation page for
[Pointer Authentication](https://clang.llvm.org/docs/PointerAuthentication.html).

At the IR level, it is represented using:

* a [set of intrinsics](#intrinsics) (to sign/authenticate pointers)
* a [signed pointer constant](#constant) (to sign globals)
* a [call operand bundle](#operand-bundle) (to authenticate called pointers)
* a [set of function attributes](#function-attributes) (to describe what
  pointers are signed and how, to control implicit codegen in the backend, as
  well as preserve invariants in the mid-level optimizer)

The current implementation leverages the
[Armv8.3-A PAuth/Pointer Authentication Code](#armv8-3-a-pauth-pointer-authentication-code)
instructions in the [AArch64 backend](#aarch64-support).
This support is used to implement the Darwin arm64e ABI, as well as the
[PAuth ABI Extension to ELF](https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst).


## LLVM IR Representation

### Intrinsics

These intrinsics are provided by LLVM to expose pointer authentication
operations.


#### '`llvm.ptrauth.sign`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.sign(i64 <value>, i32 <key>, i64 <discriminator>)
```

##### Overview:

The '`llvm.ptrauth.sign`' intrinsic signs a raw pointer.


##### Arguments:

The `value` argument is the raw pointer value to be signed.
The `key` argument is the identifier of the key to be used to generate the
signed value.
The `discriminator` argument is the additional diversity data to be used as a
discriminator (an integer, an address, or a blend of the two).

##### Semantics:

The '`llvm.ptrauth.sign`' intrinsic implements the `sign` operation.
It returns a signed value.

If `value` is already a signed value, the behavior is undefined.

If `value` is not a pointer value for which `key` is appropriate, the
behavior is undefined.


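As a minimal sketch of how this intrinsic might be used, the function below
signs a raw data pointer.  The function name, the choice of key `2` (`ASDA` in
the AArch64 encoding listed under [Keys](#keys)), and the integer
discriminator `1234` are illustrative assumptions, not values mandated by any
ABI.

```llvm
declare i64 @llvm.ptrauth.sign(i64, i32, i64)

; Hypothetical helper: sign the raw pointer %p with key 2 and the
; arbitrary integer discriminator 1234.
define i64 @sign_data_pointer(ptr %p) {
  ; The intrinsic operates on i64, so convert the pointer first.
  %raw = ptrtoint ptr %p to i64
  %signed = call i64 @llvm.ptrauth.sign(i64 %raw, i32 2, i64 1234)
  ret i64 %signed
}
```
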
#### '`llvm.ptrauth.auth`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.auth(i64 <value>, i32 <key>, i64 <discriminator>)
```

##### Overview:

The '`llvm.ptrauth.auth`' intrinsic authenticates a signed pointer.

##### Arguments:

The `value` argument is the signed pointer value to be authenticated.
The `key` argument is the identifier of the key that was used to generate
the signed value.
The `discriminator` argument is the additional diversity data to be used as a
discriminator.

##### Semantics:

The '`llvm.ptrauth.auth`' intrinsic implements the `auth` operation.
It returns a raw pointer value.
If `value` does not have a correct signature for `key` and `discriminator`,
the intrinsic traps in a target-specific way.


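The following sketch shows one way the intrinsic might be used before a load.
It assumes `%signed` was produced with key `2` and integer discriminator
`1234`; the function name and both values are hypothetical.

```llvm
declare i64 @llvm.ptrauth.auth(i64, i32, i64)

; Hypothetical helper: authenticate %signed, then load through the
; resulting raw pointer.  The key and discriminator must match the ones
; used when the pointer was signed; otherwise the intrinsic traps.
define i32 @load_through_signed(i64 %signed) {
  %raw = call i64 @llvm.ptrauth.auth(i64 %signed, i32 2, i64 1234)
  %p = inttoptr i64 %raw to ptr
  %v = load i32, ptr %p
  ret i32 %v
}
```
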
#### '`llvm.ptrauth.strip`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.strip(i64 <value>, i32 <key>)
```

##### Overview:

The '`llvm.ptrauth.strip`' intrinsic strips the embedded signature out of a
possibly-signed pointer.


##### Arguments:

The `value` argument is the signed pointer value to be stripped.
The `key` argument is the identifier of the key that was used to generate
the signed value.

##### Semantics:

The '`llvm.ptrauth.strip`' intrinsic implements the `strip` operation.
It returns a raw pointer value.  It does **not** check that the
signature is valid.

`key` should identify a key that is appropriate for `value`, as defined
by the target-specific [keys](#keys).

If `value` is a raw pointer value, it is returned as-is (provided the `key`
is appropriate for the pointer).

If `value` is not a pointer value for which `key` is appropriate, the
behavior is target-specific.

If `value` is a signed pointer value, but `key` does not identify the
same key that was used to generate `value`, the behavior is
target-specific.


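One plausible use of stripping is comparing the addresses of two
possibly-signed pointers without authenticating either.  The sketch below
assumes both values were signed with key `2`; the function name is
hypothetical.

```llvm
declare i64 @llvm.ptrauth.strip(i64, i32)

; Hypothetical helper: compare the raw addresses behind two
; possibly-signed pointers.  No signature is checked here, so the
; results must not be treated as authenticated.
define i1 @same_raw_address(i64 %a, i64 %b) {
  %a_raw = call i64 @llvm.ptrauth.strip(i64 %a, i32 2)
  %b_raw = call i64 @llvm.ptrauth.strip(i64 %b, i32 2)
  %eq = icmp eq i64 %a_raw, %b_raw
  ret i1 %eq
}
```
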
#### '`llvm.ptrauth.resign`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.resign(i64 <value>,
                                 i32 <old key>, i64 <old discriminator>,
                                 i32 <new key>, i64 <new discriminator>)
```

##### Overview:

The '`llvm.ptrauth.resign`' intrinsic re-signs a signed pointer using
a different key and diversity data.

##### Arguments:

The `value` argument is the signed pointer value to be authenticated.
The `old key` argument is the identifier of the key that was used to generate
the signed value.
The `old discriminator` argument is the additional diversity data to be used
as a discriminator in the auth operation.
The `new key` argument is the identifier of the key to use to generate the
resigned value.
The `new discriminator` argument is the additional diversity data to be used
as a discriminator in the sign operation.

##### Semantics:

The '`llvm.ptrauth.resign`' intrinsic performs a combined `auth` and `sign`
operation, without exposing the intermediate raw pointer.
It returns a signed pointer value.
If `value` does not have a correct signature for `old key` and
`old discriminator`, the intrinsic traps in a target-specific way.

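As an illustration, the sketch below converts a pointer from one signing
schema to another in a single step.  The function name, the keys (`2` and
`3`, i.e. `ASDA` and `ASDB` in the AArch64 encoding), and both discriminators
are assumptions chosen for the example.

```llvm
declare i64 @llvm.ptrauth.resign(i64, i32, i64, i32, i64)

; Hypothetical helper: authenticate %signed against (key 2, disc 1234)
; and re-sign it with (key 3, disc 5678).  The raw pointer in between
; is never visible as an IR value.
define i64 @change_schema(i64 %signed) {
  %resigned = call i64 @llvm.ptrauth.resign(i64 %signed,
                                            i32 2, i64 1234,
                                            i32 3, i64 5678)
  ret i64 %resigned
}
```
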
#### '`llvm.ptrauth.sign_generic`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.sign_generic(i64 <value>, i64 <discriminator>)
```

##### Overview:

The '`llvm.ptrauth.sign_generic`' intrinsic computes a generic signature of
arbitrary data.

##### Arguments:

The `value` argument is the arbitrary data value to be signed.
The `discriminator` argument is the additional diversity data to be used as a
discriminator.

##### Semantics:

The '`llvm.ptrauth.sign_generic`' intrinsic computes the signature of a given
combination of value and additional diversity data.

It returns a full signature value (as opposed to a signed pointer value, with
an embedded partial signature).

As opposed to [`llvm.ptrauth.sign`](#llvm-ptrauth-sign), it does not interpret
`value` as a pointer value.  Instead, it is an arbitrary data value.


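Because the result is a standalone signature rather than a signed pointer, a
natural pattern is to store it next to the data and recompute it later.  The
sketch below assumes that usage; the function name and the idea of passing a
previously stored signature in `%expected` are hypothetical.

```llvm
declare i64 @llvm.ptrauth.sign_generic(i64, i64)

; Hypothetical helper: recompute the generic signature of %data and
; compare it with a signature computed earlier over the same inputs.
define i1 @signature_matches(i64 %data, i64 %disc, i64 %expected) {
  %sig = call i64 @llvm.ptrauth.sign_generic(i64 %data, i64 %disc)
  %ok = icmp eq i64 %sig, %expected
  ret i1 %ok
}
```
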
#### '`llvm.ptrauth.blend`'

##### Syntax:

```llvm
declare i64 @llvm.ptrauth.blend(i64 <address discriminator>, i64 <integer discriminator>)
```

##### Overview:

The '`llvm.ptrauth.blend`' intrinsic blends a pointer address discriminator
with a small integer discriminator to produce a new "blended" discriminator.

##### Arguments:

The `address discriminator` argument is a pointer value.
The `integer discriminator` argument is a small integer, as specified by the
target.

##### Semantics:

The '`llvm.ptrauth.blend`' intrinsic combines a small integer discriminator
with a pointer address discriminator, in a way that is specified by the target
implementation.


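A blended discriminator is typically fed straight into a sign or auth
operation.  The sketch below assumes the address of the storage slot is used
as the address discriminator; the function name, key `2`, and integer
discriminator `1234` are illustrative.

```llvm
declare i64 @llvm.ptrauth.blend(i64, i64)
declare i64 @llvm.ptrauth.sign(i64, i32, i64)

; Hypothetical helper: sign %p so that the signature is tied to both
; the slot it is stored in and the integer discriminator 1234.
define void @store_signed(ptr %slot, ptr %p) {
  %slot_i = ptrtoint ptr %slot to i64
  %disc = call i64 @llvm.ptrauth.blend(i64 %slot_i, i64 1234)
  %raw = ptrtoint ptr %p to i64
  %signed = call i64 @llvm.ptrauth.sign(i64 %raw, i32 2, i64 %disc)
  %signed_p = inttoptr i64 %signed to ptr
  store ptr %signed_p, ptr %slot
  ret void
}
```
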
### Constant

[Intrinsics](#intrinsics) can produce signed pointers dynamically, in code,
but they cannot express signed pointers referenced by constants, e.g., in
global initializers.

The latter are represented using a
[``ptrauth`` constant](https://llvm.org/docs/LangRef.html#ptrauth-constant),
which describes an authenticated relocation producing a signed pointer.

```llvm
ptrauth (ptr CST, i32 KEY, i64 DISC, ptr ADDRDISC)
```

is equivalent to:

```llvm
  %disc = call i64 @llvm.ptrauth.blend(i64 ptrtoint(ptr ADDRDISC to i64), i64 DISC)
  %signedval = call i64 @llvm.ptrauth.sign(i64 ptrtoint(ptr CST to i64), i32 KEY, i64 %disc)
```

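For instance, a global initializer might reference another global through a
signed pointer as sketched below.  The global names, key `2`, and
discriminator `1234` are hypothetical; the second form assumes the storage
address of the referencing global is used as the address discriminator.

```llvm
@target = global i32 0

; Signed with key 2 and integer discriminator 1234, no address diversity.
@signed_ref = global ptr ptrauth (ptr @target, i32 2, i64 1234)

; Same schema, additionally diversified by the address of the global
; that holds the signed pointer.
@signed_ref_addr = global ptr ptrauth (ptr @target, i32 2, i64 1234,
                                       ptr @signed_ref_addr)
```
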
### Operand Bundle

Function pointers used as indirect call targets can be signed when materialized,
and authenticated before calls.  This can be accomplished with the
[`llvm.ptrauth.auth`](#llvm-ptrauth-auth) intrinsic, feeding its result to
an indirect call.

However, that exposes the intermediate raw pointer, e.g., if it gets spilled
to the stack.  An attacker can then overwrite the pointer in memory, negating
the security benefit provided by pointer authentication.
To prevent that, the `ptrauth` operand bundle may be used: it guarantees that
the intermediate call target is kept in a register and never stored to memory.
This hardening benefit is similar to that provided by
[`llvm.ptrauth.resign`](#llvm-ptrauth-resign).

Concretely:

```llvm
define void @f(ptr %fp) {
  call void %fp() [ "ptrauth"(i32 <key>, i64 <data>) ]
  ret void
}
```

is functionally equivalent to:

```llvm
define void @f(ptr %fp) {
  %fp_i = ptrtoint ptr %fp to i64
  %fp_auth = call i64 @llvm.ptrauth.auth(i64 %fp_i, i32 <key>, i64 <data>)
  %fp_auth_p = inttoptr i64 %fp_auth to ptr
  call void %fp_auth_p()
  ret void
}
```

but with the added guarantee that `%fp_i`, `%fp_auth`, and `%fp_auth_p`
are not stored to (and reloaded from) memory.


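The discriminator in the bundle does not have to be a constant.  The sketch
below assumes a function pointer that was signed with key `0` (`ASIA` in the
AArch64 encoding) and a discriminator blended from its storage address and
the integer `42`; all of these choices, and the function name, are
hypothetical.

```llvm
declare i64 @llvm.ptrauth.blend(i64, i64)

; Hypothetical helper: load a signed function pointer from %slot and
; call it, authenticating against an address-diversified discriminator.
define void @call_through_slot(ptr %slot) {
  %fp = load ptr, ptr %slot
  %slot_i = ptrtoint ptr %slot to i64
  %disc = call i64 @llvm.ptrauth.blend(i64 %slot_i, i64 42)
  call void %fp() [ "ptrauth"(i32 0, i64 %disc) ]
  ret void
}
```
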
### Function Attributes

Some function attributes are used to describe other pointer authentication
operations that are not otherwise explicitly expressed in IR.

#### ``ptrauth-indirect-gotos``

``ptrauth-indirect-gotos`` specifies that indirect gotos in this function
should authenticate their target.  At the IR level, no other change is needed.
When lowering [``blockaddress`` constants](https://llvm.org/docs/LangRef.html#blockaddress),
and [``indirectbr`` instructions](https://llvm.org/docs/LangRef.html#i-indirectbr),
this tells the backend to respectively sign and authenticate the pointers.

The specific scheme isn't ABI-visible.  Currently, the AArch64 backend
signs blockaddresses using the `ASIA` key, with an integer discriminator
derived from the parent function's name, using the SipHash stable discriminator:
```
  ptrauth_string_discriminator("<function_name> blockaddress")
```


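As a sketch of what such a function might look like, the example below uses
an indirect goto between two labels.  The function and label names are
hypothetical; the only pointer-authentication-specific part is the attribute
on the definition, since the signing and authentication themselves happen in
the backend.

```llvm
; Hypothetical function: the attribute asks the backend to sign the
; blockaddress constants and authenticate the indirectbr target.
define i32 @dispatch(i32 %sel) "ptrauth-indirect-gotos" {
entry:
  %is_zero = icmp eq i32 %sel, 0
  %target = select i1 %is_zero,
                   ptr blockaddress(@dispatch, %a),
                   ptr blockaddress(@dispatch, %b)
  indirectbr ptr %target, [label %a, label %b]
a:
  ret i32 1
b:
  ret i32 2
}
```
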
## AArch64 Support

AArch64 is currently the only architecture with full support of the pointer
authentication primitives, based on Armv8.3-A instructions.

### Armv8.3-A PAuth Pointer Authentication Code

The Armv8.3-A architecture extension defines the PAuth feature, which provides
support for instructions that manipulate Pointer Authentication Codes (PAC).

#### Keys

The PAuth feature supports five keys.

Of those, four can be named explicitly as the key in IR constructs:
* `ASIA`/`ASIB` are instruction keys (encoded as respectively 0 and 1).
* `ASDA`/`ASDB` are data keys (encoded as respectively 2 and 3).

`ASGA` is a special key that cannot be explicitly specified, and is only ever
used implicitly, to implement the
[`llvm.ptrauth.sign_generic`](#llvm-ptrauth-sign-generic) intrinsic.

#### Instructions

The IR [Intrinsics](#intrinsics) described above map onto these
instructions as follows:
* [`llvm.ptrauth.sign`](#llvm-ptrauth-sign): `PAC{I,D}{A,B}{Z,SP,}`
* [`llvm.ptrauth.auth`](#llvm-ptrauth-auth): `AUT{I,D}{A,B}{Z,SP,}`
* [`llvm.ptrauth.strip`](#llvm-ptrauth-strip): `XPAC{I,D}`
* [`llvm.ptrauth.blend`](#llvm-ptrauth-blend): The semantics of the blend
  operation are specified by the ABI.  In both the ELF PAuth ABI Extension and
  arm64e, it's a `MOVK` into the high 16 bits.  Consequently, this limits
  the width of the integer discriminator used in blends to 16 bits.
* [`llvm.ptrauth.sign_generic`](#llvm-ptrauth-sign-generic): `PACGA`
* [`llvm.ptrauth.resign`](#llvm-ptrauth-resign): `AUT*+PAC*`.  These are
  represented as a single pseudo-instruction in the backend to guarantee that
  the intermediate raw pointer value is not spilled and attackable.

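The `MOVK`-based blend described in the list above can be modeled with plain
integer operations.  The function below is only a sketch of that bit
manipulation, under the assumption that the 16-bit integer discriminator
replaces bits 63:48 of the address discriminator; real code should use
`llvm.ptrauth.blend` and let the backend select the instruction.

```llvm
; Hypothetical model of the AArch64 blend: keep the low 48 bits of the
; address discriminator and place the 16-bit integer discriminator in
; the high 16 bits.
define i64 @blend_model(i64 %addr_disc, i64 %int_disc) {
  ; 281474976710655 is 2^48 - 1, i.e. a mask for bits 47:0.
  %low = and i64 %addr_disc, 281474976710655
  ; Only 16 bits of the integer discriminator are representable.
  %disc16 = and i64 %int_disc, 65535
  %high = shl i64 %disc16, 48
  %blended = or i64 %low, %high
  ret i64 %blended
}
```
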
#### Assembly Representation

At the assembly level, authenticated relocations are represented
using the `@AUTH` modifier:

```asm
    .quad _target@AUTH(<key>,<discriminator>[,addr])
```

where:
* `key` is the Armv8.3-A key identifier (`ia`, `ib`, `da`, `db`)
* `discriminator` is the 16-bit unsigned discriminator value
* `addr` signifies that the authenticated pointer is address-discriminated
  (that is, that the relocation's target address is to be blended into the
  `discriminator` before it is used in the sign operation).

For example:
```asm
  _authenticated_reference_to_sym:
    .quad _sym@AUTH(db,0)
  _authenticated_reference_to_sym_addr_disc:
    .quad _sym@AUTH(ia,12,addr)
```

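At the IR level, directives like the ones in the example above would
plausibly originate from `ptrauth` constants in global initializers.  The
sketch below is an assumption about that correspondence rather than verified
compiler output: it uses the key encodings from [Keys](#keys) (`db` is 3,
`ia` is 0), assumes `@sym` is an external global, and relies on Darwin
prefixing symbol names with an underscore.

```llvm
@sym = external global i8

; Intended to correspond to: .quad _sym@AUTH(db,0)
@authenticated_reference_to_sym =
    global ptr ptrauth (ptr @sym, i32 3, i64 0)

; Intended to correspond to: .quad _sym@AUTH(ia,12,addr)
@authenticated_reference_to_sym_addr_disc =
    global ptr ptrauth (ptr @sym, i32 0, i64 12,
                        ptr @authenticated_reference_to_sym_addr_disc)
```
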
#### MachO Object File Representation

At the object file level, authenticated relocations are represented using the
``ARM64_RELOC_AUTHENTICATED_POINTER`` relocation kind (with value ``11``).

The pointer authentication information is encoded into the addend as follows:

```
| 63 | 62 | 61-51 | 50-49 |   48   | 47     -     32 | 31  -  0 |
| -- | -- | ----- | ----- | ------ | --------------- | -------- |
|  1 |  0 |   0   |  key  |  addr  |  discriminator  |  addend  |
```

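To make the layout concrete, the function below packs the fields into a
64-bit value following the table above.  It is a hypothetical model written
in IR for illustration, not code taken from the assembler or linker.

```llvm
; Hypothetical model of the MachO authenticated-pointer addend layout.
; Example: key 0 (ia), addr 1, discriminator 12, addend 0 packs to
; 0x8001000C00000000.
define i64 @pack_macho_auth_addend(i64 %key, i64 %addr, i64 %disc, i64 %addend) {
  %key_m = and i64 %key, 3               ; 2-bit key, bits 50:49
  %key_s = shl i64 %key_m, 49
  %addr_m = and i64 %addr, 1             ; address-diversity flag, bit 48
  %addr_s = shl i64 %addr_m, 48
  %disc_m = and i64 %disc, 65535         ; 16-bit discriminator, bits 47:32
  %disc_s = shl i64 %disc_m, 32
  %add_m = and i64 %addend, 4294967295   ; 32-bit addend, bits 31:0
  %t0 = or i64 %key_s, %addr_s
  %t1 = or i64 %t0, %disc_s
  %t2 = or i64 %t1, %add_m
  ; Bit 63 is always set; bits 62 and 61:51 stay zero.
  %packed = or i64 %t2, -9223372036854775808
  ret i64 %packed
}
```
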
#### ELF Object File Representation

At the object file level, authenticated relocations are represented
using the `R_AARCH64_AUTH_ABS64` relocation kind (with value `0xE100`).

The signing schema is encoded in the place of relocation to be applied
as follows:

```
| 63                | 62       | 61:60    | 59:48    |  47:32        | 31:0                |
| ----------------- | -------- | -------- | -------- | ------------- | ------------------- |
| address diversity | reserved | key      | reserved | discriminator | reserved for addend |
```

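The same kind of model can illustrate the ELF encoding.  The function below
is a hypothetical sketch that packs the non-reserved schema fields according
to the table above and leaves every reserved bit, including the addend field,
at zero.

```llvm
; Hypothetical model of the ELF signing-schema layout.
; Example: key 3 (db), discriminator 0x1234, address diversity 1
; packs to 0xB000123400000000.
define i64 @pack_elf_auth_schema(i64 %key, i64 %disc, i64 %addr_div) {
  %addr_m = and i64 %addr_div, 1     ; address diversity flag, bit 63
  %addr_s = shl i64 %addr_m, 63
  %key_m = and i64 %key, 3           ; 2-bit key, bits 61:60
  %key_s = shl i64 %key_m, 60
  %disc_m = and i64 %disc, 65535     ; 16-bit discriminator, bits 47:32
  %disc_s = shl i64 %disc_m, 32
  %t0 = or i64 %addr_s, %key_s
  %packed = or i64 %t0, %disc_s
  ret i64 %packed
}
```
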
403