.. _hardening:

===============
Hardening Modes
===============

.. contents::
   :local:

.. _using-hardening-modes:

Using hardening modes
=====================

libc++ provides several hardening modes, where each mode enables a set of
assertions that prevent undefined behavior caused by violating preconditions of
the standard library. Different hardening modes make different trade-offs
between the amount of checking and runtime performance. The available hardening
modes are:

- **Unchecked mode/none**, which disables all hardening checks.
- **Fast mode**, which contains a set of security-critical checks that can be
  done with relatively little overhead in constant time and are intended to be
  used in production. We recommend most projects adopt this.
- **Extensive mode**, which contains all the checks from fast mode and some
  additional checks for undefined behavior that incur relatively little overhead
  but aren't security-critical. Production builds requiring a broader set of
  checks than fast mode should consider enabling extensive mode. The additional
  rigour impacts performance more than fast mode: we recommend benchmarking to
  determine if that is acceptable for your program.
- **Debug mode**, which enables all the available checks in the library,
  including heuristic checks that might have significant performance overhead as
  well as internal library assertions. This mode should be used in
  non-production environments (such as test suites, CI, or local development).
  We don’t commit to a particular level of performance in this mode and it’s
  *not* intended to be used in production.

.. note::

   Enabling hardening has no impact on the ABI.

Notes for users
---------------

As a libc++ user, consult with your vendor to determine the level of hardening
enabled by default.

Users who want a hardening level other than their vendor's default can select
it by passing **one** of the following options to the compiler:

- ``-D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_NONE``
- ``-D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_FAST``
- ``-D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_EXTENSIVE``
- ``-D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_DEBUG``

.. warning::

   The exact numeric values of these macros are unspecified and users should not
   rely on them (e.g. expect the values to be sorted in any way).

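As a minimal sketch of inspecting the selected mode without depending on the
numeric values, the function below compares ``_LIBCPP_HARDENING_MODE`` against
each named macro for equality only. The name ``hardening_mode_name`` is
hypothetical, and the sketch assumes the macros are visible to user code once
any libc++ header has been included; the ``"unknown"`` fallback covers other
standard libraries and releases that do not define them.

.. code-block:: cpp

   #include <string_view>  // any libc++ header makes the configuration macros visible

   // Hypothetical helper: names the hardening mode of this translation unit.
   // Only equality comparisons are used; no ordering of the values is assumed.
   constexpr std::string_view hardening_mode_name() {
   #if !defined(_LIBCPP_HARDENING_MODE)
     return "unknown";  // assumption: not libc++, or no hardening modes available
   #elif _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_NONE
     return "none";
   #elif _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_FAST
     return "fast";
   #elif _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_EXTENSIVE
     return "extensive";
   #elif _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG
     return "debug";
   #else
     return "unknown";
   #endif
   }
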
.. warning::

   If you would prefer to override the hardening level on a per-translation-unit
   basis, you must do so **before** including any headers to avoid `ODR issues`_.

.. _`ODR issues`: https://en.cppreference.com/w/cpp/language/definition#:~:text=is%20ill%2Dformed.-,One%20Definition%20Rule,-Only%20one%20definition

.. note::

   Since the static and shared library components of libc++ are built by the
   vendor, setting this macro will have no impact on the hardening mode for the
   pre-built components. Most libc++ code is header-based, so a user-provided
   value for ``_LIBCPP_HARDENING_MODE`` will be mostly respected.

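A per-translation-unit override might look like the following sketch. The
function ``sum_first`` is hypothetical, and the ``#ifndef`` guard is an
assumption added so that a mode already selected on the command line is not
redefined; the essential point is only that the definition precedes the first
standard library include.

.. code-block:: cpp

   // Select the mode before the first standard library include of this
   // translation unit. The guard (an assumption of this sketch) leaves a mode
   // chosen via -D_LIBCPP_HARDENING_MODE=... untouched.
   #ifndef _LIBCPP_HARDENING_MODE
   #  define _LIBCPP_HARDENING_MODE _LIBCPP_HARDENING_MODE_EXTENSIVE
   #endif

   #include <cstddef>
   #include <vector>

   // Hypothetical function. v[i] has the precondition i < v.size(); with
   // hardening enabled, libc++ asserts it inside operator[].
   int sum_first(const std::vector<int>& v, std::size_t n) {
     int total = 0;
     for (std::size_t i = 0; i < n; ++i)
       total += v[i];
     return total;
   }

With another standard library the macro is simply unused, so the sketch still
compiles there.
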
Notes for vendors
-----------------

Vendors can set the default hardening mode by providing
``LIBCXX_HARDENING_MODE`` as a configuration option, with the possible values of
``none``, ``fast``, ``extensive`` and ``debug``. The default value is ``none``,
which doesn't enable any hardening checks (this mode is sometimes called the
``unchecked`` mode).

This option controls both the hardening mode that the precompiled library is
built with and the default hardening mode that users will build with. If set to
``none``, the precompiled library will not contain any assertions, and user code
will default to building without assertions.

Vendors can also override the way the program is terminated when an assertion
fails by :ref:`providing a custom header <override-assertion-handler>`.

Assertion categories
====================

Inside the library, individual assertions are grouped into different
*categories*. Each hardening mode enables a different set of assertion
categories; categories provide an additional layer of abstraction that makes it
easier to reason about the high-level semantics of a hardening mode.

.. note::

  Users are not intended to interact with these categories directly -- the
  categories are considered internal to the library and subject to change.

- ``valid-element-access`` -- checks that any attempts to access a container
  element, whether through the container object or through an iterator, are
  valid and do not attempt to go out of bounds or otherwise access
  a non-existent element. This also includes operations that set up an imminent
  invalid access (e.g. incrementing an end iterator). For iterator checks to
  work, bounded iterators must be enabled in the ABI. Types like
  ``std::optional`` and ``std::function`` are considered containers (with at
  most one element) for the purposes of this check.

- ``valid-input-range`` -- checks that ranges (whether expressed as an iterator
  pair, an iterator and a sentinel, an iterator and a count, or
  a range modeling ``std::ranges::range``) given as input to library functions
  are valid:

  - the sentinel is reachable from the begin iterator;
  - TODO(hardening): both iterators refer to the same container.

  ("input" here refers to "an input given to an algorithm", not to an iterator
  category)

  Violating assertions in this category leads to an out-of-bounds access.

- ``non-null`` -- checks that the pointer being dereferenced is not null. On
  most modern platforms, the zero address does not refer to an actual location
  in memory, so a null pointer dereference would not compromise the memory
  security of a program (however, it is still undefined behavior that can result
  in strange errors due to compiler optimizations).

- ``non-overlapping-ranges`` -- for functions that take several ranges as
  arguments, checks that those ranges do not overlap.

- ``valid-deallocation`` -- checks that an attempt to deallocate memory is valid
  (e.g. the given object was allocated by the given allocator). Violating this
  category typically results in a memory leak.

- ``valid-external-api-call`` -- checks that a call to an external API doesn't
  fail in an unexpected manner. This includes triggering documented cases of
  undefined behavior in an external library (like attempting to unlock an
  unlocked mutex in pthreads). Any API external to the library falls under this
  category (from system calls to compiler intrinsics). We generally don't expect
  these failures to compromise memory safety or otherwise create an immediate
  security issue.

- ``compatible-allocator`` -- checks any operations that exchange nodes between
  containers to make sure the containers have compatible allocators.

- ``argument-within-domain`` -- checks that the given argument is within the
  domain of valid arguments for the function. Violating this typically produces
  an incorrect result (e.g. ``std::clamp`` returns the original value without
  clamping it due to incorrect functors) or puts an object into an invalid state
  (e.g. a string view where only a subset of elements is accessible). This
  category is for assertions whose violation doesn't cause any immediate issues
  in the library -- whatever the consequences are, they will happen in the user
  code.

- ``pedantic`` -- checks preconditions that are imposed by the Standard, but
  whose violation happens to be benign in libc++.

- ``semantic-requirement`` -- checks that the given argument satisfies the
  semantic requirements imposed by the Standard. Typically, there is no simple
  way to completely prove that a semantic requirement is satisfied; thus, this
  would often be a heuristic check and it might be quite expensive.

- ``internal`` -- checks that internal invariants of the library hold. These
  assertions don't depend on user input.

- ``uncategorized`` -- for assertions that haven't been properly classified yet.
  This category is an escape hatch used for some existing assertions in the
  library; all new code should have its assertions properly classified.

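To make a few of these preconditions concrete, the sketch below states them as
explicit tests in user code. All function names are hypothetical. The category
named in each comment follows the descriptions above; which category libc++
actually assigns to a given assertion is an internal detail and may differ.

.. code-block:: cpp

   #include <algorithm>
   #include <cstddef>
   #include <optional>
   #include <string_view>
   #include <utility>

   // valid-element-access: std::string_view::operator[] requires pos < size().
   std::optional<char> char_at(std::string_view sv, std::size_t pos) {
     if (pos >= sv.size())
       return std::nullopt;  // sv[pos] here would violate the precondition
     return sv[pos];
   }

   // valid-element-access: operator* on an empty std::optional would access a
   // non-existent element.
   int value_or_zero(const std::optional<int>& o) {
     return o.has_value() ? *o : 0;
   }

   // argument-within-domain: std::clamp requires that hi is not less than lo.
   int clamp_any_order(int v, int a, int b) {
     if (b < a)
       std::swap(a, b);
     return std::clamp(v, a, b);
   }

Without the explicit tests, each of these calls would rely on the caller to
satisfy the precondition, which is the situation the hardening assertions are
meant to catch.
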
Mapping between the hardening modes and the assertion categories
================================================================

.. list-table::
    :header-rows: 1
    :widths: auto

    * - Category name
      - ``fast``
      - ``extensive``
      - ``debug``
    * - ``valid-element-access``
      - ✅
      - ✅
      - ✅
    * - ``valid-input-range``
      - ✅
      - ✅
      - ✅
    * - ``non-null``
      - ❌
      - ✅
      - ✅
    * - ``non-overlapping-ranges``
      - ❌
      - ✅
      - ✅
    * - ``valid-deallocation``
      - ❌
      - ✅
      - ✅
    * - ``valid-external-api-call``
      - ❌
      - ✅
      - ✅
    * - ``compatible-allocator``
      - ❌
      - ✅
      - ✅
    * - ``argument-within-domain``
      - ❌
      - ✅
      - ✅
    * - ``pedantic``
      - ❌
      - ✅
      - ✅
    * - ``semantic-requirement``
      - ❌
      - ❌
      - ✅
    * - ``internal``
      - ❌
      - ❌
      - ✅
    * - ``uncategorized``
      - ❌
      - ✅
      - ✅

.. note::

  At the moment, each subsequent hardening mode is a strict superset of the
  previous one (in other words, each subsequent mode only enables additional
  assertion categories without disabling any), but this won't necessarily be
  true for any hardening modes that might be added in the future.

.. note::

  The categories enabled by each mode are subject to change and users should not
  rely on the precise assertions enabled by a mode at a given point in time.
  However, the library does guarantee to keep the hardening modes stable and
  to fulfill the semantics documented here.

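The table can also be read as a small predicate. The sketch below encodes it
as a ``constexpr`` function; the enums and ``is_enabled`` are hypothetical names
used only for illustration (libc++ exposes no such interface), and the mapping
reflects the table as documented here, which is subject to change.

.. code-block:: cpp

   // Hypothetical names; not part of libc++.
   enum class Mode { none, fast, extensive, debug };

   enum class Category {
     valid_element_access, valid_input_range, non_null, non_overlapping_ranges,
     valid_deallocation, valid_external_api_call, compatible_allocator,
     argument_within_domain, pedantic, semantic_requirement, internal,
     uncategorized
   };

   // Mirrors the table above: fast enables the two security-critical
   // categories, extensive enables everything except the two debug-only
   // categories, and debug enables all of them.
   constexpr bool is_enabled(Mode mode, Category category) {
     switch (mode) {
     case Mode::none:
       return false;
     case Mode::fast:
       return category == Category::valid_element_access ||
              category == Category::valid_input_range;
     case Mode::extensive:
       return category != Category::semantic_requirement &&
              category != Category::internal;
     case Mode::debug:
       return true;
     }
     return false;
   }
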
Hardening assertion failure
===========================

In production modes (``fast`` and ``extensive``), a hardening assertion failure
immediately `traps <https://llvm.org/docs/LangRef.html#llvm-trap-intrinsic>`_
the program. This is the safest approach that also minimizes the code size
penalty as the failure handler maps to a single instruction. The downside is
that the failure provides no additional details other than the stack trace
(which might also be affected by optimizations).

TODO(hardening): describe ``__builtin_verbose_trap`` once we can use it.

In the ``debug`` mode, an assertion failure terminates the program in an
unspecified manner and also outputs the associated error message to the error
output. This is less secure and increases the size of the binary (among other
things, it has to store the error message strings) but makes the failure easier
to debug. It also allows testing the error messages in our test suite.

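The difference between the two failure styles can be sketched with a pair of
hypothetical macros. These are not libc++'s internal assertion macros; they
only illustrate the trade-off described above, and they assume a compiler that
provides ``__builtin_trap`` (such as Clang or GCC).

.. code-block:: cpp

   #include <cstdio>
   #include <cstdlib>

   // Production-style failure: a bare trap. The message is not referenced, so
   // it does not need to be stored in the binary.
   #define SKETCH_ASSERT_TRAP(cond, msg) ((cond) ? (void)0 : __builtin_trap())

   // Debug-style failure: print the message (a string literal) to the error
   // output, then terminate.
   #define SKETCH_ASSERT_VERBOSE(cond, msg) \
     ((cond) ? (void)0 : (std::fputs(msg "\n", stderr), std::abort()))

   int at_trapping(const int* data, int size, int i) {
     SKETCH_ASSERT_TRAP(i >= 0 && i < size, "index out of bounds");
     return data[i];
   }

   int at_verbose(const int* data, int size, int i) {
     SKETCH_ASSERT_VERBOSE(i >= 0 && i < size, "index out of bounds");
     return data[i];
   }
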
.. _override-assertion-handler:

Overriding the assertion failure handler
----------------------------------------

Vendors can override the default assertion handler mechanism by following these
steps:

- create a header file that provides a definition of a macro called
  ``_LIBCPP_ASSERTION_HANDLER``. The macro will be invoked when a hardening
  assertion fails, with a single parameter containing a null-terminated string
  with the error message.
- when configuring the library, provide the path to the custom header (relative
  to the root of the repository) via the CMake variable
  ``LIBCXX_ASSERTION_HANDLER_FILE``.

Note that almost all libc++ headers include the assertion handler header, which
means it should not include anything non-trivial from the standard library to
avoid creating circular dependencies.

There is no existing mechanism for users to override the assertion handler
because the ability to do the override other than at configure-time carries an
unavoidable code size penalty that would otherwise be imposed on all users,
whether they require such customization or not. Instead, we let vendors decide
what's right on their platform for their users -- a vendor who wishes to provide
this capability is free to do so, e.g. by declaring the assertion handler as an
overridable function.

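A custom handler header could look roughly like the sketch below. The file
name and the function ``vendor_report_hardening_failure`` are hypothetical;
only the macro name ``_LIBCPP_ASSERTION_HANDLER`` and its single message
parameter come from the steps above. The ``__attribute__`` spelling assumes a
GCC-compatible compiler.

.. code-block:: cpp

   // custom_assertion_handler.h (hypothetical file name), passed to CMake via
   // LIBCXX_ASSERTION_HANDLER_FILE.
   //
   // Deliberately includes no standard library headers, to avoid circular
   // dependencies with the libc++ headers that include this file.

   // Hypothetical vendor function, defined elsewhere by the vendor. It is
   // expected to report the failure and must not return.
   __attribute__((__noreturn__)) void
   vendor_report_hardening_failure(const char* message);

   // Invoked by libc++ with a null-terminated error message.
   #define _LIBCPP_ASSERTION_HANDLER(message) \
     ::vendor_report_hardening_failure(message)

Routing the macro through a function is one way a vendor might make the
handler overridable, at the code size cost discussed above.
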
ABI
===

Setting a hardening mode does **not** affect the ABI. Each mode uses the subset
of checks available in the current ABI configuration, which is determined by the
platform.

It is important to stress that whether a particular check is enabled depends on
the combination of the selected hardening mode and the hardening-related ABI
options. Some checks require changing the ABI from the "default" to store
additional information in the library classes -- e.g. checking whether an
iterator is valid upon dereference generally requires storing data about bounds
inside the iterator object. Using ``std::span`` as an example, setting the
hardening mode to ``fast`` will always enable the ``valid-element-access``
checks when accessing elements via a ``std::span`` object, but whether
dereferencing a ``std::span`` iterator does the equivalent check depends on the
ABI configuration.

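The two access paths can be sketched as follows. The sketch uses
``std::string_view`` rather than ``std::span`` so that it compiles as C++17;
this document lists both types as supported by ``_LIBCPP_ABI_BOUNDED_ITERATORS``,
so the same distinction is assumed to apply. The function names are
hypothetical.

.. code-block:: cpp

   #include <cstddef>
   #include <string_view>

   // Member-function access: with a hardening mode enabled, operator[] checks
   // the index regardless of the ABI configuration.
   char via_member(std::string_view sv, std::size_t i) {
     return sv[i];
   }

   // Iterator access: an equivalent check on dereference is only performed if
   // the library was configured with bounded iterators for this type.
   char via_iterator(std::string_view sv, std::size_t i) {
     return *(sv.begin() + static_cast<std::ptrdiff_t>(i));
   }

For in-bounds indices both functions return the same character; they differ
only in what happens when ``i`` is out of bounds.
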
ABI options
-----------

Vendors can use some ABI options at CMake configuration time (when building libc++
itself) to enable additional hardening checks. This is done by passing these
macros as ``-DLIBCXX_ABI_DEFINES="_LIBCPP_ABI_FOO;_LIBCPP_ABI_BAR;etc"`` at
CMake configuration time. The available options are:

- ``_LIBCPP_ABI_BOUNDED_ITERATORS`` -- changes the iterator type of select
  containers (see below) to a bounded iterator that keeps track of whether it's
  within the bounds of the original container and asserts valid bounds on every
  dereference.

  ABI impact: changes the iterator type of the relevant containers.

  Supported containers:

  - ``span``;
  - ``string_view``.

- ``_LIBCPP_ABI_BOUNDED_ITERATORS_IN_STRING`` -- changes the iterator type of
  ``basic_string`` to a bounded iterator that keeps track of whether it's within
  the bounds of the original container and asserts it on every dereference and
  when performing iterator arithmetic.

  ABI impact: changes the iterator type of ``basic_string`` and its
  specializations, such as ``string`` and ``wstring``.

- ``_LIBCPP_ABI_BOUNDED_ITERATORS_IN_VECTOR`` -- changes the iterator type of
  ``vector`` to a bounded iterator that keeps track of whether it's within the
  bounds of the original container and asserts it on every dereference and when
  performing iterator arithmetic. Note: this doesn't yet affect
  ``vector<bool>``.

  ABI impact: changes the iterator type of ``vector`` (except ``vector<bool>``).

- ``_LIBCPP_ABI_BOUNDED_UNIQUE_PTR`` -- tracks the bounds of the array stored inside
  a ``std::unique_ptr<T[]>``, allowing it to trap when accessed out-of-bounds. This
  requires the ``std::unique_ptr`` to be created using an API like ``std::make_unique``
  or ``std::make_unique_for_overwrite``, otherwise the bounds information is not available
  to the library.

  ABI impact: changes the layout of ``std::unique_ptr<T[]>``, and the representation
  of a few library types that use ``std::unique_ptr`` internally, such as
  the unordered containers.

- ``_LIBCPP_ABI_BOUNDED_ITERATORS_IN_STD_ARRAY`` -- changes the iterator type of ``std::array`` to a
  bounded iterator that keeps track of whether it's within the bounds of the container and asserts it
  on every dereference and when performing iterator arithmetic.

  ABI impact: changes the iterator type of ``std::array``, its size and its layout.

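User code that wants to know which of these options the installed library was
built with could probe the macros, as in the sketch below. This assumes the
options passed through ``LIBCXX_ABI_DEFINES`` are visible to user code after
including a libc++ header, which should be confirmed with the vendor; the
helper name is hypothetical.

.. code-block:: cpp

   #include <cstddef>  // any standard header pulls in the library configuration

   // Hypothetical helper: counts how many of the bounded-access ABI options
   // listed above are visible in this build (0 with other standard libraries).
   constexpr int enabled_bounded_abi_options() {
     int count = 0;
   #ifdef _LIBCPP_ABI_BOUNDED_ITERATORS
     ++count;
   #endif
   #ifdef _LIBCPP_ABI_BOUNDED_ITERATORS_IN_STRING
     ++count;
   #endif
   #ifdef _LIBCPP_ABI_BOUNDED_ITERATORS_IN_VECTOR
     ++count;
   #endif
   #ifdef _LIBCPP_ABI_BOUNDED_UNIQUE_PTR
     ++count;
   #endif
   #ifdef _LIBCPP_ABI_BOUNDED_ITERATORS_IN_STD_ARRAY
     ++count;
   #endif
     return count;
   }
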
ABI tags
--------

We use ABI tags to allow translation units built with different hardening modes
to interact with each other without causing ODR violations. Knowing how
hardening modes are encoded into the ABI tags might be useful to examine
a binary and determine whether it was built with hardening enabled.

.. warning::
  We don't commit to the encoding scheme used by the ABI tags being stable
  between different releases of libc++. The tags themselves are never stable, by
  design -- new releases increase the version number. The following describes
  the state of the latest release and is for informational purposes only.

The first character of an ABI tag encodes the hardening mode:

- ``f`` -- [f]ast mode;
- ``s`` -- extensive ("[s]afe") mode;
- ``d`` -- [d]ebug mode;
- ``n`` -- [n]one mode.

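When examining a binary, the first-character rule can be applied with a small
helper such as the sketch below. The function name is hypothetical, it expects
an ABI tag that has already been extracted from a symbol, and it follows the
encoding listed above, which is not guaranteed to be stable across releases.

.. code-block:: cpp

   #include <string_view>

   // Hypothetical helper: maps the first character of an ABI tag to the name
   // of the hardening mode it encodes.
   constexpr std::string_view mode_from_abi_tag(std::string_view tag) {
     if (tag.empty())
       return "unknown";
     switch (tag.front()) {
     case 'f':
       return "fast";
     case 's':
       return "extensive";
     case 'd':
       return "debug";
     case 'n':
       return "none";
     default:
       return "unknown";
     }
   }
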
Hardened containers status
==========================

.. list-table::
    :header-rows: 1
    :widths: auto

    * - Name
      - Member functions
      - Iterators (ABI-dependent)
    * - ``span``
      - ✅
      - ✅
    * - ``string_view``
      - ✅
      - ✅
    * - ``array``
      - ✅
      - ❌
    * - ``vector``
      - ✅
      - ✅ (see note)
    * - ``string``
      - ✅
      - ✅ (see note)
    * - ``list``
      - ✅
      - ❌
    * - ``forward_list``
      - ✅
      - ❌
    * - ``deque``
      - ✅
      - ❌
    * - ``map``
      - ❌
      - ❌
    * - ``set``
      - ❌
      - ❌
    * - ``multimap``
      - ❌
      - ❌
    * - ``multiset``
      - ❌
      - ❌
    * - ``unordered_map``
      - Partial
      - Partial
    * - ``unordered_set``
      - Partial
      - Partial
    * - ``unordered_multimap``
      - Partial
      - Partial
    * - ``unordered_multiset``
      - Partial
      - Partial
    * - ``mdspan``
      - ✅
      - ❌
    * - ``optional``
      - ✅
      - N/A
    * - ``function``
      - ❌
      - N/A
    * - ``variant``
      - N/A
      - N/A
    * - ``any``
      - N/A
      - N/A
    * - ``expected``
      - ✅
      - N/A
    * - ``valarray``
      - Partial
      - N/A
    * - ``bitset``
      - ✅
      - N/A

Note: for ``vector`` and ``string``, the iterator does not check for
invalidation (accesses made via an invalidated iterator still lead to undefined
behavior).

Note: the ``vector<bool>`` iterator is not currently hardened.

Testing
=======

Please see :ref:`Testing documentation <testing-hardening-assertions>`.

Further reading
===============

- `Hardening RFC <https://discourse.llvm.org/t/rfc-hardening-in-libc/73925>`_:
  contains some of the design rationale.