====================
Standard C++ Modules
====================
4
5.. contents::
6   :local:
7
8Introduction
9============
10
11The term ``module`` is ambiguous, as it is used to mean multiple things in
12Clang. For Clang users, a module may refer to an ``Objective-C Module``,
13`Clang Module <Modules.html>`_ (also called a ``Clang Header Module``) or a
14``C++20 Module`` (or a ``Standard C++ Module``). The implementation of all
15these kinds of modules in Clang shares a lot of code, but from the perspective
16of users their semantics and command line interfaces are very different. This
17document is an introduction to the use of C++20 modules in Clang. In the
18remainder of this document, the term ``module`` will refer to Standard C++20
19modules and the term ``Clang module`` will refer to the Clang Modules
20extension.
21
In terms of the C++ Standard, modules consist of two components: "Named
Modules" and "Header Units". This document covers both.
24
25Standard C++ Named modules
26==========================
27
This section introduces some terms and definitions for readers who are not
familiar with the C++ feature, because they are needed to understand the
compiler's behavior. This document is not a tutorial on C++; it only introduces
the concepts necessary to understand the use of modules in a project.
32
33Background and terminology
34--------------------------
35
36Module and module unit
37~~~~~~~~~~~~~~~~~~~~~~
38
39A module consists of one or more module units. A module unit is a special kind
40of translation unit. A module unit should almost always start with a module
41declaration. The syntax of the module declaration is:
42
43.. code-block:: c++
44
45  [export] module module_name[:partition_name];
46
47Terms enclosed in ``[]`` are optional. ``module_name`` and ``partition_name``
48follow the rules for a C++ identifier, except that they may contain one or more
49period (``.``) characters. Note that a ``.`` in the name has no semantic
50meaning and does not imply any hierarchy.
51
52In this document, module units are classified as:
53
54* Primary module interface unit
55* Module implementation unit
56* Module partition interface unit
57* Internal module partition unit
58
59A primary module interface unit is a module unit whose module declaration is
60``export module module_name;`` where ``module_name`` denotes the name of the
61module. A module should have one and only one primary module interface unit.
62
63A module implementation unit is a module unit whose module declaration is
64``module module_name;``. Multiple module implementation units can be declared
65in the same module.
66
67A module partition interface unit is a module unit whose module declaration is
68``export module module_name:partition_name;``. The ``partition_name`` should be
69unique within any given module.
70
71An internal module partition unit is a module unit whose module
72declaration is ``module module_name:partition_name;``. The ``partition_name``
73should be unique within any given module.
74
75In this document, we use the following terms:
76
77* A ``module interface unit`` refers to either a ``primary module interface unit``
78  or a ``module partition interface unit``.
79
80* An ``importable module unit`` refers to either a ``module interface unit`` or
81  an ``internal module partition unit``.
82
83* A ``module partition unit`` refers to either a ``module partition interface unit``
84  or an ``internal module partition unit``.
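
The classification above depends on only two properties of the module
declaration: whether it has the ``export`` keyword and whether it names a
partition. A minimal sketch of that mapping follows; the ``ModuleUnitKind``
enumeration and the helper functions are hypothetical names chosen for
illustration and are not part of Clang.

.. code-block:: c++

  // Hypothetical names for illustration only; not a Clang API.
  enum class ModuleUnitKind {
    PrimaryInterface,    // export module M;
    Implementation,      // module M;
    PartitionInterface,  // export module M:P;
    InternalPartition    // module M:P;
  };

  // Maps the two properties of a module declaration to the kind of unit.
  constexpr ModuleUnitKind classify(bool has_export, bool has_partition) {
    if (has_export)
      return has_partition ? ModuleUnitKind::PartitionInterface
                           : ModuleUnitKind::PrimaryInterface;
    return has_partition ? ModuleUnitKind::InternalPartition
                         : ModuleUnitKind::Implementation;
  }

  // A module interface unit is a primary or partition interface unit.
  constexpr bool is_interface_unit(ModuleUnitKind k) {
    return k == ModuleUnitKind::PrimaryInterface ||
           k == ModuleUnitKind::PartitionInterface;
  }

  // Every kind except a module implementation unit is importable.
  constexpr bool is_importable(ModuleUnitKind k) {
    return k != ModuleUnitKind::Implementation;
  }

  // A module partition unit is either kind of partition.
  constexpr bool is_partition_unit(ModuleUnitKind k) {
    return k == ModuleUnitKind::PartitionInterface ||
           k == ModuleUnitKind::InternalPartition;
  }

  static_assert(classify(true, false) == ModuleUnitKind::PrimaryInterface);
  static_assert(is_importable(classify(false, true)));
  static_assert(!is_importable(classify(false, false)));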
85
86Built Module Interface
87~~~~~~~~~~~~~~~~~~~~~~
88
89A ``Built Module Interface`` (or ``BMI``) is the precompiled result of an
90importable module unit.
91
92Global module fragment
93~~~~~~~~~~~~~~~~~~~~~~
94
95The ``global module fragment`` (or ``GMF``) is the code between the ``module;``
96and the module declaration within a module unit.
97
98
99How to build projects using modules
100-----------------------------------
101
102Quick Start
103~~~~~~~~~~~
104
105Let's see a "hello world" example that uses modules.
106
107.. code-block:: c++
108
109  // Hello.cppm
110  module;
111  #include <iostream>
112  export module Hello;
113  export void hello() {
114    std::cout << "Hello World!\n";
115  }
116
117  // use.cpp
118  import Hello;
119  int main() {
120    hello();
121    return 0;
122  }
123
124Then, on the command line, invoke Clang like:
125
126.. code-block:: console
127
128  $ clang++ -std=c++20 Hello.cppm --precompile -o Hello.pcm
129  $ clang++ -std=c++20 use.cpp -fmodule-file=Hello=Hello.pcm Hello.pcm -o Hello.out
130  $ ./Hello.out
131  Hello World!
132
133In this example, we make and use a simple module ``Hello`` which contains only a
134primary module interface unit named ``Hello.cppm``.
135
A more complex "hello world" example which uses all four kinds of module units is:
137
138.. code-block:: c++
139
140  // M.cppm
141  export module M;
142  export import :interface_part;
143  import :impl_part;
144  export void Hello();
145
146  // interface_part.cppm
147  export module M:interface_part;
148  export void World();
149
150  // impl_part.cppm
151  module;
152  #include <iostream>
153  #include <string>
154  module M:impl_part;
155  import :interface_part;
156
157  std::string W = "World.";
158  void World() {
159    std::cout << W << std::endl;
160  }
161
162  // Impl.cpp
163  module;
164  #include <iostream>
165  module M;
166  void Hello() {
167    std::cout << "Hello ";
168  }
169
170  // User.cpp
171  import M;
172  int main() {
173    Hello();
174    World();
175    return 0;
176  }
177
178Then, back on the command line, invoke Clang with:
179
180.. code-block:: console
181
182  # Precompiling the module
183  $ clang++ -std=c++20 interface_part.cppm --precompile -o M-interface_part.pcm
184  $ clang++ -std=c++20 impl_part.cppm --precompile -fprebuilt-module-path=. -o M-impl_part.pcm
185  $ clang++ -std=c++20 M.cppm --precompile -fprebuilt-module-path=. -o M.pcm
186  $ clang++ -std=c++20 Impl.cpp -fprebuilt-module-path=. -c -o Impl.o
187
188  # Compiling the user
189  $ clang++ -std=c++20 User.cpp -fprebuilt-module-path=. -c -o User.o
190
191  # Compiling the module and linking it together
192  $ clang++ -std=c++20 M-interface_part.pcm -fprebuilt-module-path=. -c -o M-interface_part.o
193  $ clang++ -std=c++20 M-impl_part.pcm -fprebuilt-module-path=. -c -o M-impl_part.o
194  $ clang++ -std=c++20 M.pcm -fprebuilt-module-path=. -c -o M.o
195  $ clang++ User.o M-interface_part.o  M-impl_part.o M.o Impl.o -o a.out
196
197We explain the options in the following sections.
198
199How to enable standard C++ modules
200~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
201
202Standard C++ modules are enabled automatically when the language standard mode
203is ``-std=c++20`` or newer.
204
205How to produce a BMI
206~~~~~~~~~~~~~~~~~~~~
207
208To generate a BMI for an importable module unit, use either the ``--precompile``
209or ``-fmodule-output`` command line options.
210
211The ``--precompile`` option generates the BMI as the output of the compilation
212with the output path specified using the ``-o`` option.
213
214The ``-fmodule-output`` option generates the BMI as a by-product of the
215compilation. If ``-fmodule-output=`` is specified, the BMI will be emitted to
216the specified location. If ``-fmodule-output`` and ``-c`` are specified, the
217BMI will be emitted in the directory of the output file with the name of the
218input file with the extension ``.pcm``. Otherwise, the BMI will be emitted in
219the working directory with the name of the input file with the extension
220``.pcm``.
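
A build system that needs to predict where the BMI ends up could model these
rules roughly as follows. This is only a sketch of the rules as described
above: ``bmi_output_path`` and its parameters are hypothetical, it assumes that
``-fmodule-output`` is in effect in one of its two forms, and it assumes that
the output file of a ``-c`` compilation is the one given with ``-o``.

.. code-block:: c++

  #include <filesystem>
  #include <optional>

  namespace fs = std::filesystem;

  // Hypothetical helper modelling the rules above; not a Clang API.
  //   explicit_bmi:  the path given to -fmodule-output=, if any.
  //   object_output: the path given to -o when compiling with -c, if any.
  fs::path bmi_output_path(const fs::path &input,
                           const std::optional<fs::path> &explicit_bmi,
                           const std::optional<fs::path> &object_output) {
    if (explicit_bmi)
      return *explicit_bmi;  // -fmodule-output=<path> wins.

    // Otherwise the BMI is named after the input file, with extension .pcm.
    fs::path name = input.filename();
    name.replace_extension(".pcm");

    if (object_output)
      return object_output->parent_path() / name;  // Next to the object file.
    return name;  // Relative to the working directory.
  }

Under these assumptions, compiling ``src/Hello.cppm`` with ``-c -o
build/Hello.o`` and a bare ``-fmodule-output`` would yield
``build/Hello.pcm``.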
221
222Generating BMIs with ``--precompile`` is referred to as two-phase compilation
223because it takes two steps to compile a source file to an object file.
Generating BMIs with ``-fmodule-output`` is called one-phase compilation. The
one-phase compilation model is simpler for build systems to implement, while the
two-phase compilation model has the potential to compile faster due to higher
parallelism. As an example, if there are two module units ``A`` and ``B``, and
``B`` depends on ``A``, the one-phase compilation model needs to compile them
serially, whereas in the two-phase compilation model ``B`` can be compiled as
soon as ``A.pcm`` is available, and thus simultaneously with the ``A.pcm`` to
``A.o`` compilation step.
232
233File name requirements
234~~~~~~~~~~~~~~~~~~~~~~
235
236By convention, ``importable module unit`` files should use ``.cppm`` (or
237``.ccm``, ``.cxxm``, or ``.c++m``) as a file extension.
238``Module implementation unit`` files should use ``.cpp`` (or ``.cc``, ``.cxx``,
239or ``.c++``) as a file extension.
240
241A BMI should use ``.pcm`` as a file extension. The file name of the BMI for a
242``primary module interface unit`` should be ``module_name.pcm``. The file name
243of a BMI for a ``module partition unit`` should be
244``module_name-partition_name.pcm``.
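
The naming convention can be written down as a small function. This is a
sketch for illustration; ``bmi_file_name`` is a hypothetical helper, not
something Clang provides.

.. code-block:: c++

  #include <string>

  // Hypothetical helper: returns the conventional BMI file name for a module,
  // or for one of its partitions when partition_name is not empty.
  std::string bmi_file_name(const std::string &module_name,
                            const std::string &partition_name = "") {
    if (partition_name.empty())
      return module_name + ".pcm";                       // M    -> M.pcm
    return module_name + "-" + partition_name + ".pcm";  // M:P  -> M-P.pcm
  }

For the module ``M`` used earlier, this gives ``M.pcm`` for the primary module
interface unit and ``M-interface_part.pcm`` for the partition
``M:interface_part``, matching the file names in the Quick Start commands.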
245
246Clang may fail to build the module if different extensions are used. For
247example, if the filename of an ``importable module unit`` ends with ``.cpp``
248instead of ``.cppm``, then Clang cannot generate a BMI for the
249``importable module unit`` with the ``--precompile`` option because the
250``--precompile`` option would only run the preprocessor (``-E``). If using a
251different extension than the conventional one for an ``importable module unit``
252you can specify ``-x c++-module`` before the file. For example,
253
254.. code-block:: c++
255
256  // Hello.cpp
257  module;
258  #include <iostream>
259  export module Hello;
260  export void hello() {
261    std::cout << "Hello World!\n";
262  }
263
264  // use.cpp
265  import Hello;
266  int main() {
267    hello();
268    return 0;
269  }
270
271In this example, the extension used by the ``module interface`` is ``.cpp``
272instead of ``.cppm``, so it cannot be compiled like the previous example, but
273it can be compiled with:
274
275.. code-block:: console
276
277  $ clang++ -std=c++20 -x c++-module Hello.cpp --precompile -o Hello.pcm
278  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
279  $ ./Hello.out
280  Hello World!
281
282Module name requirements
283~~~~~~~~~~~~~~~~~~~~~~~~
284
285..
286
287  [module.unit]p1:
288
289  All module-names either beginning with an identifier consisting of std followed by zero
290  or more digits or containing a reserved identifier ([lex.name]) are reserved and shall not
291  be specified in a module-declaration; no diagnostic is required. If any identifier in a reserved
292  module-name is a reserved identifier, the module name is reserved for use by C++ implementations;
293  otherwise it is reserved for future standardization.
294
295Therefore, none of the following names are valid by default:
296
297.. code-block:: text
298
299    std
300    std1
301    std.foo
302    __test
303    // and so on ...
304
305Using a reserved module name is strongly discouraged, but
306``-Wno-reserved-module-identifier`` can be used to suppress the warning.
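
The quoted rule can be approximated in code. The sketch below is an
illustration only and is not how Clang implements the check: the function
names are hypothetical, and "reserved identifier" is simplified to an
identifier that contains a double underscore or begins with an underscore
followed by an uppercase letter.

.. code-block:: c++

  #include <cctype>
  #include <cstddef>
  #include <string_view>

  // Simplified reserved-identifier test (an assumption, see the text above).
  bool is_reserved_identifier(std::string_view id) {
    if (id.find("__") != std::string_view::npos)
      return true;
    return id.size() >= 2 && id[0] == '_' &&
           std::isupper(static_cast<unsigned char>(id[1])) != 0;
  }

  // True if id is "std" followed by zero or more digits.
  bool is_std_followed_by_digits(std::string_view id) {
    if (id.substr(0, 3) != "std")
      return false;
    for (char c : id.substr(3))
      if (std::isdigit(static_cast<unsigned char>(c)) == 0)
        return false;
    return true;
  }

  // Walks the dot-separated identifiers of a module name.
  bool is_reserved_module_name(std::string_view name) {
    bool first = true;
    while (true) {
      std::size_t dot = name.find('.');
      std::string_view id = name.substr(0, dot);
      if ((first && is_std_followed_by_digits(id)) ||
          is_reserved_identifier(id))
        return true;
      if (dot == std::string_view::npos)
        return false;
      name.remove_prefix(dot + 1);
      first = false;
    }
  }

With this sketch, ``std``, ``std1``, ``std.foo``, and ``__test`` are all
classified as reserved, while names such as ``Hello`` or ``stdx`` are not.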
307
308Specifying dependent BMIs
309~~~~~~~~~~~~~~~~~~~~~~~~~
310
There are 3 ways to specify a dependent BMI:

1. ``-fprebuilt-module-path=<path/to/directory>``.
2. ``-fmodule-file=<path/to/BMI>`` (Deprecated).
3. ``-fmodule-file=<module-name>=<path/to/BMI>``.
316
317The ``-fprebuilt-module-path`` option specifies the path to search for
318dependent BMIs. Multiple paths may be specified, similar to using ``-I`` to
319specify a search path for header files. When importing a module ``M``, the
320compiler looks for ``M.pcm`` in the directories specified by
321``-fprebuilt-module-path``. Similarly, when importing a partition module unit
322``M:P``, the compiler looks for ``M-P.pcm`` in the directories specified by
323``-fprebuilt-module-path``.
324
325The ``-fmodule-file=<path/to/BMI>`` option causes the compiler to load the
326specified BMI directly. The ``-fmodule-file=<module-name>=<path/to/BMI>``
327option causes the compiler to load the specified BMI for the module specified
328by ``<module-name>`` when necessary. The main difference is that
329``-fmodule-file=<path/to/BMI>`` will load the BMI eagerly, whereas
330``-fmodule-file=<module-name>=<path/to/BMI>`` will only load the BMI lazily,
331as will ``-fprebuilt-module-path``. The ``-fmodule-file=<path/to/BMI>`` option
332for named modules is deprecated and will be removed in a future version of
333Clang.
334
335When these options are specified in the same invocation of the compiler, the
336``-fmodule-file=<path/to/BMI>`` option takes precedence over
337``-fmodule-file=<module-name>=<path/to/BMI>``, which takes precedence over
338``-fprebuilt-module-path=<path/to/directory>``.
339
Note: all dependent BMIs must be specified explicitly, whether they are direct
or indirect dependencies. See
https://github.com/llvm/llvm-project/issues/62707 for details.
343
344When compiling a ``module implementation unit``, the BMI of the corresponding
345``primary module interface unit`` must be specified because a module
346implementation unit implicitly imports the primary module interface unit.
347
348  [module.unit]p8
349
350  A module-declaration that contains neither an export-keyword nor a module-partition implicitly
351  imports the primary module interface unit of the module as if by a module-import-declaration.
352
353The ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>``,
354and ``-fmodule-file=<module-name>=<path/to/BMI>`` options may be specified
355multiple times. For example, the command line to compile ``M.cppm`` in
356the previous example could be rewritten as:
357
358.. code-block:: console
359
360  $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M:interface_part=M-interface_part.pcm -fmodule-file=M:impl_part=M-impl_part.pcm -o M.pcm
361
362When there are multiple ``-fmodule-file=<module-name>=`` options for the same
363``<module-name>``, the last ``-fmodule-file=<module-name>=`` overrides the
364previous ``-fmodule-file=<module-name>=`` option.
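
The precedence and override rules in this section can be summarized with a
small model. This is a sketch under stated assumptions, not Clang's
implementation: ``BmiOptions`` and ``resolve_bmi`` are hypothetical, each
directly loaded BMI is paired with the name of the module it contains, the
prebuilt directories are assumed to be searched in command line order, and the
file system is replaced by a set of existing paths.

.. code-block:: c++

  #include <optional>
  #include <set>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical model of the three ways to specify a dependent BMI.
  struct BmiOptions {
    // -fmodule-file=<path/to/BMI>, as (module name in the BMI, path).
    std::vector<std::pair<std::string, std::string>> direct;
    // -fmodule-file=<module-name>=<path/to/BMI>, in command line order.
    std::vector<std::pair<std::string, std::string>> named;
    // -fprebuilt-module-path=<path/to/directory>, in command line order.
    std::vector<std::string> prebuilt_dirs;
  };

  // "M" -> "M.pcm", "M:P" -> "M-P.pcm".
  std::string prebuilt_file_name(std::string module) {
    for (char &c : module)
      if (c == ':')
        c = '-';
    return module + ".pcm";
  }

  std::optional<std::string>
  resolve_bmi(const BmiOptions &opts, const std::string &module,
              const std::set<std::string> &existing_files) {
    // 1. A directly loaded BMI takes precedence.
    for (const auto &[name, path] : opts.direct)
      if (name == module)
        return path;
    // 2. Then -fmodule-file=<module-name>=; the last option for a name wins.
    std::optional<std::string> found;
    for (const auto &[name, path] : opts.named)
      if (name == module)
        found = path;
    if (found)
      return found;
    // 3. Finally, search the prebuilt module directories.
    for (const std::string &dir : opts.prebuilt_dirs) {
      std::string candidate = dir + "/" + prebuilt_file_name(module);
      if (existing_files.count(candidate) != 0)
        return candidate;
    }
    return std::nullopt;
  }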
365
366Remember that module units still have an object counterpart to the BMI
367~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
368
369While module interfaces resemble traditional header files, they still require
370compilation. Module units are translation units, and need to be compiled to
371object files, which then need to be linked together as the following examples
372show.
373
For example, the traditional compilation process for headers looks like:
375
376.. code-block:: text
377
378  src1.cpp -+> clang++ src1.cpp --> src1.o ---,
379  hdr1.h  --'                                 +-> clang++ src1.o src2.o ->  executable
380  hdr2.h  --,                                 |
381  src2.cpp -+> clang++ src2.cpp --> src2.o ---'
382
And the compilation process for module units looks like:
384
385.. code-block:: text
386
387                src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -,
388  (header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o ->  executable
389                mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
390                src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'
391
As the diagrams show, we need to compile the BMI from module units to object
files and then link the object files. (However, this cannot be done for the BMI
from header units. See the section on :ref:`header units <header-units>` for
more details.)

BMIs cannot be shipped in an archive to create a module library. Instead, the
BMIs (``*.pcm``) are compiled into object files (``*.o``) and those object files
are added to the archive.
400
401clang-cl
402~~~~~~~~
403
``clang-cl`` supports the same options as ``clang++`` for modules as detailed above;
there is no need to prefix these options with ``/clang:``. Note that ``cl.exe``
`options to emit/consume IFC files <https://devblogs.microsoft.com/cppblog/using-cpp-modules-in-msvc-from-the-command-line-part-1/>`_ are *not* supported.
The resultant precompiled modules are also not compatible for use with ``cl.exe``.

We recommend that build system authors use the above-mentioned ``clang++`` options with ``clang-cl`` to build modules.
410
411Consistency Requirements
412~~~~~~~~~~~~~~~~~~~~~~~~
413
414Modules can be viewed as a kind of cache to speed up compilation. Thus, like
415other caching techniques, it is important to maintain cache consistency which
416is why Clang does very strict checking for consistency.
417
418Options consistency
419^^^^^^^^^^^^^^^^^^^
420
421Compiler options related to the language dialect for a module unit and its
422non-module-unit uses need to be consistent. Consider the following example:
423
424.. code-block:: c++
425
426  // M.cppm
427  export module M;
428
429  // Use.cpp
430  import M;
431
432.. code-block:: console
433
434  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
435  $ clang++ -std=c++23 Use.cpp -fprebuilt-module-path=.
436
437Clang rejects the example due to the inconsistent language standard modes. Not
438all compiler options are language dialect options, though. For example:
439
440.. code-block:: console
441
442  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
443  # Inconsistent optimization level.
444  $ clang++ -std=c++20 -O3 Use.cpp -fprebuilt-module-path=.
445  # Inconsistent debugging level.
446  $ clang++ -std=c++20 -g Use.cpp -fprebuilt-module-path=.
447
448Although the optimization and debugging levels are inconsistent, these
449compilations are accepted because the compiler options do not impact the
450language dialect.
451
452Note that the compiler **currently** doesn't reject inconsistent macro
453definitions (this may change in the future). For example:
454
455.. code-block:: console
456
457  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  # Inconsistent optimization level and macro definition.
459  $ clang++ -std=c++20 -O3 -DNDEBUG Use.cpp -fprebuilt-module-path=.
460
461Currently, Clang accepts the above example, though it may produce surprising
462results if the debugging code depends on consistent use of ``NDEBUG`` in other
463translation units.
464
465Source Files Consistency
466^^^^^^^^^^^^^^^^^^^^^^^^
467
Clang may open the input files\ :sup:`1` of a BMI during the compilation. This implies that
when Clang consumes a BMI, all the input files need to be present at their original paths
and with their original contents.

To overcome these requirements and simplify cases like distributed builds and sandboxed
builds, users can use the ``-fmodules-embed-all-files`` flag to embed all input files
into the BMI so that Clang does not need to open the corresponding files on disk.

When the ``-fmodules-embed-all-files`` flag is enabled, Clang explicitly emits the source
code into the BMI file, so the BMI file contains a sufficiently verbose
representation to reproduce the original source files.

:sup:`1` Input files: The source files which took part in the compilation of the BMI.
For example:
482
483.. code-block:: c++
484
485  // M.cppm
486  module;
487  #include "foo.h"
488  export module M;
489
490  // foo.h
491  #pragma once
492  #include "bar.h"
493
``M.cppm``, ``foo.h`` and ``bar.h`` are input files for the BMI of ``M.cppm``.
495
496Object definition consistency
497^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
498
499The C++ language requires that declarations of the same entity in different
500translation units have the same definition, which is known as the One
501Definition Rule (ODR). Without modules, the compiler cannot perform strong ODR
502violation checking because it only sees one translation unit at a time. With
503the use of modules, the compiler can perform checks for ODR violations across
504translation units.
505
506However, the current ODR checking mechanisms are not perfect. There are a
507significant number of false positive ODR violation diagnostics, where the
508compiler incorrectly diagnoses two identical declarations as having different
509definitions. Further, true positive ODR violations are not always reported.
510
511To give a better user experience, improve compilation performance, and for
512consistency with MSVC, ODR checking of declarations in the global module
513fragment is disabled by default. These checks can be enabled by specifying
514``-Xclang -fno-skip-odr-check-in-gmf`` when compiling. If the check is enabled
515and you encounter incorrect or missing diagnostics, please report them via the
516`community issue tracker <https://github.com/llvm/llvm-project/issues/>`_.
517
518Privacy Issue
519-------------
520
521BMIs are not and should not be treated as an information hiding mechanism.
522They should always be assumed to contain all the information that was used to
523create them, in a recoverable form.
524
525ABI Impacts
526-----------
527
528This section describes the new ABI changes brought by modules. Only changes to
529the Itanium C++ ABI are covered.
530
531Name Mangling
532~~~~~~~~~~~~~
533
534The declarations in a module unit which are not in the global module fragment
535have new linkage names.
536
537For example,
538
539.. code-block:: c++
540
541  export module M;
542  namespace NS {
543    export int foo();
544  }
545
546The linkage name of ``NS::foo()`` is ``_ZN2NSW1M3fooEv``. This couldn't be
547demangled by previous versions of the debugger or demangler. As of LLVM 15.x,
548``llvm-cxxfilt`` can be used to demangle this:
549
550.. code-block:: console
551
552  $ llvm-cxxfilt _ZN2NSW1M3fooEv
553    NS::foo@M()
554
555The result should be read as ``NS::foo()`` in module ``M``.
556
557The ABI implies that something cannot be declared in a module unit and defined
558in a non-module unit (or vice-versa), as this would result in linking errors.
559
Despite this, it is possible to implement declarations with a compatible ABI in
a module unit by using a language linkage specifier because the declarations in
the language linkage specifier are attached to the global module. For
example:
564
565.. code-block:: c++
566
567  export module M;
568  namespace NS {
569    export extern "C++" int foo();
570  }
571
572Now the linkage name of ``NS::foo()`` will be ``_ZN2NS3fooEv``.
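
The two linkage names shown in this section can be reproduced with a
deliberately tiny sketch of the encoding. It is not a general mangler:
``source_name`` and ``mangle_nullary`` are hypothetical helpers that only
handle a function taking no parameters, declared directly inside one
namespace, and optionally attached to a named module whose name has no dots
and no partition. The only module-specific part is the ``W`` marker followed
by the length-prefixed module name.

.. code-block:: c++

  #include <string>

  // Length-prefixed identifier, e.g. "foo" -> "3foo".
  std::string source_name(const std::string &id) {
    return std::to_string(id.size()) + id;
  }

  // Hypothetical helper covering only the simple case described above.
  std::string mangle_nullary(const std::string &ns, const std::string &fn,
                             const std::string &module = "") {
    std::string out = "_ZN" + source_name(ns);  // Start of the nested name.
    if (!module.empty())
      out += "W" + source_name(module);         // Module attachment.
    out += source_name(fn);
    out += "E";                                 // End of the nested name.
    out += "v";                                 // No parameters.
    return out;
  }

Calling ``mangle_nullary("NS", "foo", "M")`` gives ``_ZN2NSW1M3fooEv``, and
omitting the module argument gives ``_ZN2NS3fooEv``, the name produced for the
``extern "C++"`` declaration.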
573
574Module Initializers
575~~~~~~~~~~~~~~~~~~~
576
All importable module units are required to emit an initializer function to
handle the dynamic initialization of non-inline variables in the module unit.
The importable module unit has to emit the initializer even if there is no
dynamic initialization; otherwise, the importer may call a nonexistent
function. The initializer function first calls the initializers of imported
modules, followed by all of the dynamic initializers in the current module
unit.
584
585Translation units that explicitly or implicitly import a named module must call
586the initializer functions of the imported named module within the sequence of
587the dynamic initializers in the translation unit. Initializations of entities
588at namespace scope are appearance-ordered. This (recursively) extends to
589imported modules at the point of appearance of the import declaration.
590
591If the imported module is known to be empty, the call to its initializer may be
592omitted. Additionally, if the imported module is known to have already been
593imported, the call to its initializer may be omitted.
594
595Reduced BMI
596-----------
597
598To support the two-phase compilation model, Clang puts everything needed to
599produce an object into the BMI. However, other consumers of the BMI generally
600don't need that information. This makes the BMI larger and may introduce
601unnecessary dependencies for the BMI. To mitigate the problem, Clang has a
602compiler option to reduce the information contained in the BMI. These two
603formats are known as Full BMI and Reduced BMI, respectively.
604
605Users can use the ``-fmodules-reduced-bmi`` option to produce a
606Reduced BMI.
607
608For the one-phase compilation model (CMake implements this model), with
609``-fmodules-reduced-bmi``, the generated BMI will be a Reduced
610BMI automatically. (The output path of the BMI is specified by
611``-fmodule-output=`` as usual with the one-phase compilation model).
612
613It is also possible to produce a Reduced BMI with the two-phase compilation
614model. When ``-fmodules-reduced-bmi``, ``--precompile``, and
615``-fmodule-output=`` are specified, the generated BMI specified by ``-o`` will
616be a full BMI and the BMI specified by ``-fmodule-output=`` will be a Reduced
617BMI. The dependency graph in this case would look like:
618
619.. code-block:: none
620
621  module-unit.cppm --> module-unit.full.pcm -> module-unit.o
622                    |
623                    -> module-unit.reduced.pcm -> consumer1.cpp
624                                               -> consumer2.cpp
625                                               -> ...
626                                               -> consumer_n.cpp
627
Clang does not emit diagnostics when ``-fmodules-reduced-bmi`` is
used with a non-module unit. This design permits users of the one-phase
compilation model to try using reduced BMIs without needing to modify the build
system. The two-phase compilation model requires build system support.
632
633In a Reduced BMI, Clang does not emit unreachable entities from the global
634module fragment, or definitions of non-inline functions and non-inline
635variables. This may not be a transparent change.
636
637Consider the following example:
638
639.. code-block:: c++
640
641  // foo.h
642  namespace N {
643    struct X {};
644    int d();
645    int e();
646    inline int f(X, int = d()) { return e(); }
647    int g(X);
648    int h(X);
649  }
650
651  // M.cppm
652  module;
653  #include "foo.h"
654  export module M;
655  template<typename T> int use_f() {
656    N::X x;                       // N::X, N, and :: are decl-reachable from use_f
657    return f(x, 123);             // N::f is decl-reachable from use_f,
658                                  // N::e is indirectly decl-reachable from use_f
659                                  //   because it is decl-reachable from N::f, and
660                                  // N::d is decl-reachable from use_f
661                                  //   because it is decl-reachable from N::f
662                                  //   even though it is not used in this call
663  }
664  template<typename T> int use_g() {
665    N::X x;                       // N::X, N, and :: are decl-reachable from use_g
666    return g((T(), x));           // N::g is not decl-reachable from use_g
667  }
668  template<typename T> int use_h() {
669    N::X x;                       // N::X, N, and :: are decl-reachable from use_h
670    return h((T(), x));           // N::h is not decl-reachable from use_h, but
671                                  // N::h is decl-reachable from use_h<int>
672  }
673  int k = use_h<int>();
674    // use_h<int> is decl-reachable from k, so
675    // N::h is decl-reachable from k
676
677  // M-impl.cpp
678  module M;
679  int a = use_f<int>();           // OK
680  int b = use_g<int>();           // error: no viable function for call to g;
681                                  // g is not decl-reachable from purview of
682                                  // module M's interface, so is discarded
683  int c = use_h<int>();           // OK
684
In the above example, the declaration of ``N::g`` is elided from the
Reduced BMI of ``M.cppm``. As a result, the use of ``use_g<int>`` in
``M-impl.cpp`` fails to instantiate. For such issues, users can add references
to ``N::g`` in the `module purview <https://eel.is/c++draft/module.unit#5>`_ of
``M.cppm`` to ensure it is reachable, e.g. ``using N::g;``.
690
Support for Reduced BMIs is still experimental, but it may become the default
in the future. The expected roadmap for Reduced BMIs as of Clang 19.x is:

1. ``-fexperimental-modules-reduced-bmi`` was introduced in v19.x.
2. For v20.x, ``-fmodules-reduced-bmi`` is introduced as an equivalent non-experimental
   option. It is expected to stay opt-in for one or two releases, though the period depends
   on user feedback and may be extended.
3. Finally, ``-fmodules-reduced-bmi`` will be the default. When that time
   comes, the term BMI will refer to the Reduced BMI and the Full BMI will only
   be meaningful to build systems which elect to support two-phase compilation.
701
702Experimental Non-Cascading Changes
703----------------------------------
704
This section is primarily for build system vendors. End users who do not want
to read it all only need to know that this feature helps reduce recompilations.
We encourage build system vendors and end users to try it out and provide feedback.
708
Before Clang 19, a change in the BMI of any (transitive) dependency would cause the
output BMI to change. Starting with Clang 19, changes to non-direct
dependencies should not directly affect the output BMI, unless they affect the
results of the compilation. We expect that there are many more opportunities
for this optimization than we have currently realized and would appreciate
feedback about missed optimization opportunities. For example,
715
716.. code-block:: c++
717
718  // m-partA.cppm
719  export module m:partA;
720
721  // m-partB.cppm
722  export module m:partB;
723  export int getB() { return 44; }
724
725  // m.cppm
726  export module m;
727  export import :partA;
728  export import :partB;
729
730  // useBOnly.cppm
731  export module useBOnly;
732  import m;
733  export int B() {
734    return getB();
735  }
736
737  // Use.cc
738  import useBOnly;
739  int get() {
740    return B();
741  }
742
To compile the project (some commands are omitted for brevity):
744
745.. code-block:: console
746
747  $ clang++ -std=c++20 m-partA.cppm --precompile -o m-partA.pcm
748  $ clang++ -std=c++20 m-partB.cppm --precompile -o m-partB.pcm
749  $ clang++ -std=c++20 m.cppm --precompile -o m.pcm -fprebuilt-module-path=.
750  $ clang++ -std=c++20 useBOnly.cppm --precompile -o useBOnly.pcm -fprebuilt-module-path=.
751  $ md5sum useBOnly.pcm
752  07656bf4a6908626795729295f9608da  useBOnly.pcm
753
754If the interface of ``m-partA.cppm`` is changed to:
755
756.. code-block:: c++
757
758  // m-partA.v1.cppm
759  export module m:partA;
760  export int getA() { return 43; }
761
762and the BMI for ``useBOnly`` is recompiled as in:
763
764.. code-block:: console
765
766  $ clang++ -std=c++20 m-partA.cppm --precompile -o m-partA.pcm
767  $ clang++ -std=c++20 m-partB.cppm --precompile -o m-partB.pcm
768  $ clang++ -std=c++20 m.cppm --precompile -o m.pcm -fprebuilt-module-path=.
769  $ clang++ -std=c++20 useBOnly.cppm --precompile -o useBOnly.pcm -fprebuilt-module-path=.
770  $ md5sum useBOnly.pcm
771  07656bf4a6908626795729295f9608da  useBOnly.pcm
772
773then the contents of ``useBOnly.pcm`` remain unchanged.
774Consequently, if the build system only bases recompilation decisions on directly imported modules,
775it becomes possible to skip the recompilation of ``Use.cc``.
776It should be fine because the altered interfaces do not affect ``Use.cc`` in any way;
777the changes do not cascade.
778
779When ``Clang`` generates a BMI, it records the hash values of all potentially contributory BMIs
780for the BMI being produced. This ensures that build systems are not required to consider
781transitively imported modules when deciding whether to recompile.
782
What is considered to be a potentially contributory BMI is currently unspecified.
784However, it is a severe bug for a BMI to remain unchanged following an observable change
785that affects its consumers.
786
Build systems may take advantage of this optimization by performing an
update-if-changed operation: the BMI that consumers read is replaced by the BMI
that the compiler outputs only when their contents differ.
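
The update-if-changed operation described above can be sketched as follows.
This is only an illustration under stated assumptions, not how any particular
build system implements it: the helper names ``read_file`` and
``update_if_changed`` are hypothetical, and a real build system would more
likely compare hashes or rely on its own file-tracking machinery than compare
files byte by byte.

.. code-block:: c++

  #include <cassert>
  #include <filesystem>
  #include <fstream>
  #include <iterator>
  #include <string>

  namespace fs = std::filesystem;

  // Read a whole file into memory (binary mode, since BMIs are binary files).
  inline std::string read_file(const fs::path &p) {
    std::ifstream in(p, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
  }

  // Hypothetical helper: replace `consumed` with `fresh` only if the contents
  // differ. Returns true if `consumed` was written, i.e., if the units that
  // import it need to be considered for recompilation.
  inline bool update_if_changed(const fs::path &fresh,
                                const fs::path &consumed) {
    assert(fs::exists(fresh) && "the compiler output must exist");
    if (fs::exists(consumed) && read_file(fresh) == read_file(consumed))
      return false; // Same contents: leave the file and its timestamp alone.
    fs::copy_file(fresh, consumed, fs::copy_options::overwrite_existing);
    return true;
  }

In this sketch the compiler would write to a scratch location (``fresh``) and
downstream rules would depend only on ``consumed``, so an unchanged BMI does not
trigger further rebuilds.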
789
790We encourage build systems to add an experimental mode that
791reuses the cached BMI when **direct** dependencies did not change,
792even if **transitive** dependencies did change.
793
794Given there are potential compiler bugs, we recommend that build systems
795support this feature as a configurable option so that users
796can go back to the transitive change mode safely at any time.
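
As a rough sketch of such a configurable option, the decision could be modelled
as below. The enumeration ``ChangeMode`` and the function ``needs_recompile``
are hypothetical names chosen for this example; they do not correspond to any
existing build system API.

.. code-block:: c++

  // Hypothetical build-system setting; the names are illustrative only.
  enum class ChangeMode {
    Transitive, // Conservative: any changed dependency triggers a rebuild.
    DirectOnly  // Experimental: only changed direct dependencies do.
  };

  // Decide whether a translation unit must be recompiled.
  // `direct_changed`: the contents of a directly imported BMI changed.
  // `transitive_changed`: the contents of an indirectly imported BMI changed.
  constexpr bool needs_recompile(ChangeMode mode, bool direct_changed,
                                 bool transitive_changed) {
    if (direct_changed)
      return true;
    return mode == ChangeMode::Transitive && transitive_changed;
  }

With ``ChangeMode::DirectOnly``, a unit whose direct imports are unchanged is
not rebuilt even when a transitive import changed; switching back to
``ChangeMode::Transitive`` restores the conservative behavior.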
797
798Interactions with Reduced BMI
799~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
800
801With reduced BMI, non-cascading changes can be more powerful. For example,
802
803.. code-block:: c++
804
805  // A.cppm
806  export module A;
807  export int a() { return 44; }
808
809  // B.cppm
810  export module B;
811  import A;
812  export int b() { return a(); }
813
814.. code-block:: console
815
816  $ clang++ -std=c++20 A.cppm -c -fmodule-output=A.pcm  -fmodules-reduced-bmi -o A.o
817  $ clang++ -std=c++20 B.cppm -c -fmodule-output=B.pcm  -fmodules-reduced-bmi -o B.o -fmodule-file=A=A.pcm
818  $ md5sum B.pcm
819  6c2bd452ca32ab418bf35cd141b060b9  B.pcm
820
Now change the implementation of ``A.cppm`` to:
822
823.. code-block:: c++
824
825  export module A;
826  int a_impl() { return 99; }
827  export int a() { return a_impl(); }
828
829and recompile the example:
830
831.. code-block:: console
832
833  $ clang++ -std=c++20 A.cppm -c -fmodule-output=A.pcm  -fmodules-reduced-bmi -o A.o
834  $ clang++ -std=c++20 B.cppm -c -fmodule-output=B.pcm  -fmodules-reduced-bmi -o B.o -fmodule-file=A=A.pcm
835  $ md5sum B.pcm
836  6c2bd452ca32ab418bf35cd141b060b9  B.pcm
837
The contents of ``B.pcm`` remain the same. In this case, the build system is
allowed to skip recompilation of TUs that depend solely and directly on module ``B``.
840
This only happens with a reduced BMI. With reduced BMIs, Clang does not record the
function body of ``int b()`` in the BMI for ``B``, so module ``A`` does not contribute
to the BMI of ``B`` and there are fewer dependencies.
844
845Performance Tips
846----------------
847
848Reduce duplications
849~~~~~~~~~~~~~~~~~~~
850
While it is valid to have duplicated declarations in the global module fragments
of different module units, it is not free for Clang to deal with the duplicated
declarations. A translation unit will compile more slowly if there are many
duplicated declarations between the translation unit and the modules it imports.
For example:
856
857.. code-block:: c++
858
859  // M-partA.cppm
860  module;
861  #include "big.header.h"
862  export module M:partA;
863  ...
864
865  // M-partB.cppm
866  module;
867  #include "big.header.h"
868  export module M:partB;
869  ...
870
871  // other partitions
872  ...
873
874  // M-partZ.cppm
875  module;
876  #include "big.header.h"
877  export module M:partZ;
878  ...
879
880  // M.cppm
881  export module M;
882  export import :partA;
883  export import :partB;
884  ...
885  export import :partZ;
886
887  // use.cpp
888  import M;
889  ... // use declarations from module M.
890
891When ``big.header.h`` is big enough and there are a lot of partitions, the
892compilation of ``use.cpp`` may be significantly slower than the following
893approach:
894
895.. code-block:: c++
896
897  module;
898  #include "big.header.h"
  export module M:big.header.wrapper;
900  export ... // export the needed declarations
901
902  // M-partA.cppm
903  export module M:partA;
904  import :big.header.wrapper;
905  ...
906
907  // M-partB.cppm
908  export module M:partB;
909  import :big.header.wrapper;
910  ...
911
912  // other partitions
913  ...
914
915  // M-partZ.cppm
916  export module M:partZ;
917  import :big.header.wrapper;
918  ...
919
920  // M.cppm
921  export module M;
922  export import :partA;
923  export import :partB;
924  ...
925  export import :partZ;
926
927  // use.cpp
928  import M;
929  ... // use declarations from module M.
930
931Reducing the duplication from textual includes is what improves compile-time
932performance.
933
To help users identify such issues, Clang provides the warning ``-Wdecls-in-multiple-modules``.
This warning is disabled by default; it must be enabled explicitly or via ``-Weverything``.
936
937Transitioning to modules
938------------------------
939
940It is best for new code and libraries to use modules from the start if
941possible. However, it may be a breaking change for existing code or libraries
942to switch to modules. As a result, many existing libraries need to provide
943both headers and module interfaces for a while to not break existing users.
944
This section offers some suggestions on how to ease the transition process
946for existing libraries. **Note that this information is only intended as
947guidance, rather than as requirements to use modules in Clang.** It presumes
948the project is starting with no module-based dependencies.
949
950ABI non-breaking styles
951~~~~~~~~~~~~~~~~~~~~~~~
952
953export-using style
954^^^^^^^^^^^^^^^^^^
955
956.. code-block:: c++
957
958  module;
959  #include "header_1.h"
960  #include "header_2.h"
961  ...
962  #include "header_n.h"
963  export module your_library;
964  export namespace your_namespace {
965    using decl_1;
966    using decl_2;
967    ...
968    using decl_n;
969  }
970
This example shows how to include all the headers containing declarations which
need to be exported, and uses ``using`` declarations in an ``export`` block to
produce the module interface.
974
975export extern-C++ style
976^^^^^^^^^^^^^^^^^^^^^^^
977
978.. code-block:: c++
979
980  module;
981  #include "third_party/A/headers.h"
982  #include "third_party/B/headers.h"
983  ...
984  #include "third_party/Z/headers.h"
985  export module your_library;
986  #define IN_MODULE_INTERFACE
987  extern "C++" {
988    #include "header_1.h"
989    #include "header_2.h"
990    ...
991    #include "header_n.h"
992  }
993
994Headers (from ``header_1.h`` to ``header_n.h``) need to define the macro:
995
996.. code-block:: c++
997
998  #ifdef IN_MODULE_INTERFACE
999  #define EXPORT export
1000  #else
1001  #define EXPORT
1002  #endif
1003
1004and put ``EXPORT`` on the declarations you want to export.
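
For illustration, a header that follows this convention might look like the
sketch below. The contents are placeholders rather than part of any real
library: ``decl_1`` and ``internal_helper`` are hypothetical declarations. When
the header is included textually (``IN_MODULE_INTERFACE`` is not defined),
``EXPORT`` expands to nothing and the header behaves as it did before.

.. code-block:: c++

  // header_1.h (hypothetical contents)
  #ifdef IN_MODULE_INTERFACE
  #define EXPORT export
  #else
  #define EXPORT
  #endif

  namespace your_namespace {
  // Exported when included from the module interface unit; an ordinary
  // declaration when the header is included textually.
  EXPORT constexpr int decl_1() { return 1; }

  // Not marked with EXPORT, so it is not exported from the module.
  constexpr int internal_helper() { return 41; }
  } // namespace your_namespace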
1005
1006Also, it is recommended to refactor headers to include third-party headers
1007conditionally:
1008
1009.. code-block:: c++
1010
1011  #ifndef IN_MODULE_INTERFACE
1012  #include "third_party/A/headers.h"
1013  #endif
1014
1015  #include "header_x.h"
1016
1017  ...
1018
1019This can be helpful because it gives better diagnostic messages if the module
1020interface unit is not properly updated when modifying code.
1021
1022This approach works because the declarations with language linkage are attached
1023to the global module. Thus, the ABI of the modular form of the library does not
1024change.
1025
1026While this style is more involved than the export-using style, it makes it
1027easier to further refactor the library to other styles.
1028
1029ABI breaking style
1030~~~~~~~~~~~~~~~~~~
1031
The term ``ABI breaking`` may sound like a bad approach. However, this style
forces consumers of the library to use it in a consistent way, e.g., either
always include headers for the library or always import modules. The style
prevents mixing includes and imports for the library.
1036
1037The pattern for ABI breaking style is similar to the export extern-C++ style.
1038
1039.. code-block:: c++
1040
1041  module;
1042  #include "third_party/A/headers.h"
1043  #include "third_party/B/headers.h"
1044  ...
1045  #include "third_party/Z/headers.h"
1046  export module your_library;
1047  #define IN_MODULE_INTERFACE
1048  #include "header_1.h"
1049  #include "header_2.h"
1050  ...
1051  #include "header_n.h"
1052
1053  #if the number of .cpp files in your project are small
1054  module :private;
1055  #include "source_1.cpp"
1056  #include "source_2.cpp"
1057  ...
1058  #include "source_n.cpp"
1059  #else // the number of .cpp files in your project are a lot
1060  // Using all the declarations from third-party libraries which are
1061  // used in the .cpp files.
1062  namespace third_party_namespace {
1063    using third_party_decl_used_in_cpp_1;
1064    using third_party_decl_used_in_cpp_2;
1065    ...
1066    using third_party_decl_used_in_cpp_n;
1067  }
1068  #endif
1069
(And add ``EXPORT`` and conditional includes to the headers as suggested in the
export extern-C++ style section.)
1072
1073The ABI with modules is different and thus we need to compile the source files
1074into the new ABI. This is done by an additional part of the interface unit:
1075
1076.. code-block:: c++
1077
1078  #if the number of .cpp files in your project are small
1079  module :private;
1080  #include "source_1.cpp"
1081  #include "source_2.cpp"
1082  ...
1083  #include "source_n.cpp"
1084  #else // the number of .cpp files in your project are a lot
1085  // Using all the declarations from third-party libraries which are
1086  // used in the .cpp files.
1087  namespace third_party_namespace {
1088    using third_party_decl_used_in_cpp_1;
1089    using third_party_decl_used_in_cpp_2;
1090    ...
1091    using third_party_decl_used_in_cpp_n;
1092  }
1093  #endif
1094
1095If the number of source files is small, everything can be put in the private
1096module fragment directly (it is recommended to add conditional includes to the
1097source files as well). However, compile time performance will be bad if there
1098are a lot of source files to compile.
1099
1100**Note that the private module fragment can only be in the primary module
1101interface unit and the primary module interface unit containing the private
1102module fragment should be the only module unit of the corresponding module.**
1103
1104In this case, source files (.cpp files) must be converted to module
1105implementation units:
1106
1107.. code-block:: c++
1108
1109  #ifndef IN_MODULE_INTERFACE
1110  // List all the includes here.
1111  #include "third_party/A/headers.h"
1112  ...
1113  #include "header.h"
1114  #endif
1115
1116  module your_library;
1117
  // The rest of the file should be unchanged.
1119  ...
1120
The module implementation unit will import the primary module implicitly. Do
not include any headers in the module implementation units; this avoids
duplicated declarations between translation units. This is why non-exported
using declarations for third-party libraries should be added to the primary
module interface unit.
1126
1127If the library is provided as ``libyour_library.so``, a modular library (e.g.,
1128``libyour_library_modules.so``) may also need to be provided for ABI
1129compatibility.
1130
1131What if there are headers only included by the source files
1132^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1133
The above practice may be problematic if there are headers only included by the
source files. When using a private module fragment, this issue may be solved by
including those headers in the private module fragment. When using module
implementation units, it is OK to solve it by including the implementation
headers in the module purview of the primary module interface unit, but this may
be suboptimal because the primary module interface unit then contains entities
that do not belong to the interface.

This can potentially be improved by introducing a module partition
implementation unit (also called an internal module partition unit), which is
an importable module unit that is internal to the module itself.
1144
1145Providing a header to skip parsing redundant headers
1146~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1147
Many redeclarations shared between translation units cause Clang to have
slower compile-time performance. Further, there are known issues with
1150`include after import <https://github.com/llvm/llvm-project/issues/61465>`_.
1151Even when that issue is resolved, users may still get slower compilation speed
1152and larger BMIs. For these reasons, it is recommended to not include headers
1153after importing the corresponding module. However, it is not always easy if the
1154library is included by other dependencies, as in:
1155
1156.. code-block:: c++
1157
1158  #include "third_party/A.h" // #include "your_library/a_header.h"
1159  import your_library;
1160
1161or
1162
1163.. code-block:: c++
1164
1165  import your_library;
1166  #include "third_party/A.h" // #include "your_library/a_header.h"
1167
1168For such cases, it is best if the library providing both module and header
1169interfaces also provides a header which skips parsing so that the library can
1170be imported with the following approach that skips redundant redeclarations:
1171
1172.. code-block:: c++
1173
1174  import your_library;
1175  #include "your_library_imported.h"
1176  #include "third_party/A.h" // #include "your_library/a_header.h" but got skipped
1177
The implementation of ``your_library_imported.h`` can be a set of controlling
macros or an overall controlling macro if using ``#pragma once``. Then headers
can be refactored to:
1181
1182.. code-block:: c++
1183
1184  #pragma once
1185  #ifndef YOUR_LIBRARY_IMPORTED
1186  ...
1187  #endif
1188
1189If the modules imported by the library provide such headers, remember to add
1190them to ``your_library_imported.h`` too.
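
As a sketch of what ``your_library_imported.h`` might contain, the listing
below shows both variants together, followed by a stand-in for one of the
library's headers. The guard macro ``YOUR_LIBRARY_A_HEADER_H`` and the file
contents are assumptions made for this example; only ``YOUR_LIBRARY_IMPORTED``
appears in the text above.

.. code-block:: c++

  // your_library_imported.h (hypothetical contents)

  // Variant 1: the library's headers use classic include guards. Pre-defining
  // each guard makes a later #include of that header expand to nothing.
  #define YOUR_LIBRARY_A_HEADER_H

  // Variant 2: the library's headers use #pragma once. Define one overall
  // macro that every header tests before declaring anything.
  #define YOUR_LIBRARY_IMPORTED

  // your_library/a_header.h (hypothetical), as a later #include would see it.
  #ifndef YOUR_LIBRARY_A_HEADER_H
  #define YOUR_LIBRARY_A_HEADER_H
  #error "not reached in this sketch: the include guard was pre-defined"
  #endif

Because the guard is already defined, the body of the header (represented here
by the ``#error`` line) is skipped by the preprocessor.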
1191
1192Importing modules
1193~~~~~~~~~~~~~~~~~
1194
1195When there are dependent libraries providing modules, they should be imported
1196in your module as well. Many existing libraries will fall into this category
1197once the ``std`` module is more widely available.
1198
1199All dependent libraries providing modules
1200^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1201
1202Of course, most of the complexity disappears if all the dependent libraries
1203provide modules.
1204
1205Headers need to be converted to include third-party headers conditionally. Then,
1206for the export-using style:
1207
1208.. code-block:: c++
1209
1210  module;
1211  import modules_from_third_party;
1212  #define IN_MODULE_INTERFACE
1213  #include "header_1.h"
1214  #include "header_2.h"
1215  ...
1216  #include "header_n.h"
1217  export module your_library;
1218  export namespace your_namespace {
1219    using decl_1;
1220    using decl_2;
1221    ...
1222    using decl_n;
1223  }
1224
1225or, for the export extern-C++ style:
1226
1227.. code-block:: c++
1228
1229  export module your_library;
1230  import modules_from_third_party;
1231  #define IN_MODULE_INTERFACE
1232  extern "C++" {
1233    #include "header_1.h"
1234    #include "header_2.h"
1235    ...
1236    #include "header_n.h"
1237  }
1238
1239or, for the ABI-breaking style,
1240
1241.. code-block:: c++
1242
1243  export module your_library;
1244  import modules_from_third_party;
1245  #define IN_MODULE_INTERFACE
1246  #include "header_1.h"
1247  #include "header_2.h"
1248  ...
1249  #include "header_n.h"
1250
1251  #if the number of .cpp files in your project are small
1252  module :private;
1253  #include "source_1.cpp"
1254  #include "source_2.cpp"
1255  ...
1256  #include "source_n.cpp"
1257  #endif
1258
1259Non-exported ``using`` declarations are unnecessary if using implementation
1260module units. Instead, third-party modules can be imported directly in
1261implementation module units.
1262
1263Partial dependent libraries providing modules
1264^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1265
1266If the library has to mix the use of ``include`` and ``import`` in its module,
1267the primary goal is still the removal of duplicated declarations in translation
1268units as much as possible. If the imported modules provide headers to skip
1269parsing their headers, those should be included after the import. If the
1270imported modules don't provide such a header, one can be made manually for
1271improved compile time performance.
1272
1273Reachability of internal partition units
1274----------------------------------------
1275
1276The internal partition units are sometimes called implementation partition units in other documentation.
1277However, the name may be confusing since implementation partition units are not implementation
1278units.
1279
1280According to `[module.reach]p1 <https://eel.is/c++draft/module.reach#1>`_ and
1281`[module.reach]p2 <https://eel.is/c++draft/module.reach#2>`_ (from N4986):
1282
1283  A translation unit U is necessarily reachable from a point P if U is a module
1284  interface unit on which the translation unit containing P has an interface
1285  dependency, or the translation unit containing P imports U, in either case
1286  prior to P.
1287
1288  All translation units that are necessarily reachable are reachable. Additional
1289  translation units on which the point within the program has an interface
1290  dependency may be considered reachable, but it is unspecified which are and
1291  under what circumstances.
1292
1293For example,
1294
1295.. code-block:: c++
1296
1297  // a.cpp
1298  import B;
1299  int main()
1300  {
1301      g<void>();
1302  }
1303
1304  // b.cppm
1305  export module B;
1306  import :C;
1307  export template <typename T> inline void g() noexcept
1308  {
1309      return f<T>();
1310  }
1311
1312  // c.cppm
1313  module B:C;
1314  template<typename> inline void f() noexcept {}
1315
1316The internal partition unit ``c.cppm`` is not necessarily reachable by
1317``a.cpp`` because ``c.cppm`` is not a module interface unit and ``a.cpp``
1318doesn't import ``c.cppm``. This leaves it up to the compiler to decide if
1319``c.cppm`` is reachable by ``a.cpp`` or not. Clang's behavior is that
1320indirectly imported internal partition units are not reachable.
1321
The suggested approach for using an internal partition unit in Clang is
to import it only in implementation units.
1324
1325Known Issues
1326------------
1327
1328The following describes issues in the current implementation of modules. Please
1329see
1330`the issues list for modules <https://github.com/llvm/llvm-project/labels/clang%3Amodules>`_
1331for a list of issues or to file a new issue if you don't find an existing one.
1332When creating a new issue for standard C++ modules, please start the title with
1333``[C++20] [Modules]`` (or ``[C++23] [Modules]``, etc) and add the label
1334``clang:modules`` if possible.
1335
1336A high-level overview of support for standards features, including modules, can
1337be found on the `C++ Feature Status <https://clang.llvm.org/cxx_status.html>`_
1338page.
1339
1340Including headers after import is not well-supported
1341~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1342
1343The following example is accepted:
1344
1345.. code-block:: c++
1346
1347  #include <iostream>
  import foo; // assume module 'foo' contains the declarations from `<iostream>`
1349
1350  int main(int argc, char *argv[])
1351  {
1352      std::cout << "Test\n";
1353      return 0;
1354  }
1355
1356but if the order of ``#include <iostream>`` and ``import foo;`` is reversed,
1357then the code is currently rejected:
1358
1359.. code-block:: c++
1360
  import foo; // assume module 'foo' contains the declarations from `<iostream>`
1362  #include <iostream>
1363
1364  int main(int argc, char *argv[])
1365  {
1366      std::cout << "Test\n";
1367      return 0;
1368  }
1369
1370Both of the above examples should be accepted.
1371
This is a limitation of the implementation. In the first example, the compiler
sees and parses ``<iostream>`` first and then sees the ``import``. In this
case, ODR checking and declaration merging happen in the deserializer. In the
second example, the compiler sees the ``import`` first and the ``#include``
second, which results in ODR checking and declaration merging happening in the
semantic analyzer. This is due to a divergence in the implementation path. This
is tracked by
`#61465 <https://github.com/llvm/llvm-project/issues/61465>`_.
1380
1381Ignored ``preferred_name`` Attribute
1382~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1383
1384When Clang writes BMIs, it will ignore the ``preferred_name`` attribute on
1385declarations which use it. Thus, the preferred name will not be displayed in
1386the debugger as expected. This is tracked by
1387`#56490 <https://github.com/llvm/llvm-project/issues/56490>`_.
1388
1389Don't emit macros about module declaration
1390~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1391
1392This is covered by `P1857R3 <https://wg21.link/P1857R3>`_. It is mentioned here
1393because we want users to be aware that we don't yet implement it.
1394
1395A direct approach to write code that can be compiled by both modules and
1396non-module builds may look like:
1397
1398.. code-block:: c++
1399
1400  MODULE
1401  IMPORT header_name
1402  EXPORT_MODULE MODULE_NAME;
1403  IMPORT header_name
1404  EXPORT ...
1405
The intent is that this file can be compiled as a module unit or as a
non-module unit depending on the definition of some macros. However, this usage
is forbidden by P1857R3, which is not yet implemented in Clang. This means that
it is possible to write invalid modules which will no longer be accepted once
P1857R3 is implemented. This is tracked by
`#54047 <https://github.com/llvm/llvm-project/issues/54047>`_.
1412
1413Until then, it is recommended not to mix macros with module declarations.
1414
1415
Inconsistent filename suffix requirement for importable module units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1418
1419Currently, Clang requires the file name of an ``importable module unit`` to
1420have ``.cppm`` (or ``.ccm``, ``.cxxm``, ``.c++m``) as the file extension.
1421However, the behavior is inconsistent with other compilers. This is tracked by
1422`#57416 <https://github.com/llvm/llvm-project/issues/57416>`_.
1423
1424Incorrect ODR violation diagnostics
1425~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1426
1427ODR violations are a common issue when using modules. Clang sometimes produces
1428false-positive diagnostics or fails to produce true-positive diagnostics of the
1429One Definition Rule. One often-reported example is:
1430
1431.. code-block:: c++
1432
1433  // part.cc
1434  module;
1435  typedef long T;
1436  namespace ns {
1437  inline void fun() {
1438      (void)(T)0;
1439  }
1440  }
1441  export module repro:part;
1442
1443  // repro.cc
1444  module;
1445  typedef long T;
1446  namespace ns {
1447      using ::T;
1448  }
1449  namespace ns {
1450  inline void fun() {
1451      (void)(T)0;
1452  }
1453  }
1454  export module repro;
1455  export import :part;
1456
1457Currently the compiler incorrectly diagnoses the inconsistent definition of
1458``fun()`` in two module units. Because both definitions of ``fun()`` have the
1459same spelling and ``T`` refers to the same type entity, there is no ODR
1460violation. This is tracked by
1461`#78850 <https://github.com/llvm/llvm-project/issues/78850>`_.
1462
1463Using TU-local entity in other units
1464~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1465
1466Module units are translation units, so the entities which should be local to
1467the module unit itself should never be used by other units.
1468
1469The C++ standard defines the concept of ``TU-local`` and ``exposure`` in
1470`basic.link/p14 <https://eel.is/c++draft/basic.link#14>`_,
1471`basic.link/p15 <https://eel.is/c++draft/basic.link#15>`_,
1472`basic.link/p16 <https://eel.is/c++draft/basic.link#16>`_,
1473`basic.link/p17 <https://eel.is/c++draft/basic.link#17>`_, and
1474`basic.link/p18 <https://eel.is/c++draft/basic.link#18>`_.
1475
1476However, Clang doesn't formally support these two concepts. This results in
1477unclear or confusing diagnostic messages. Further, Clang may import
1478``TU-local`` entities to other units without any diagnostics. This is tracked
1479by `#78173 <https://github.com/llvm/llvm-project/issues/78173>`_.
1480
1481.. _header-units:
1482
1483Header Units
1484============
1485
1486How to build projects using header units
1487----------------------------------------
1488
1489.. warning::
1490
   The support for header units, including related command line options, is
   experimental. There are still many unanswered questions about how tools
   should interact with header units. The details described here may change in
   the future.
1495
1496Quick Start
1497~~~~~~~~~~~
1498
1499The following example:
1500
1501.. code-block:: c++
1502
1503  import <iostream>;
1504  int main() {
1505    std::cout << "Hello World.\n";
1506  }
1507
1508could be compiled with:
1509
1510.. code-block:: console
1511
1512  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
1513  $ clang++ -std=c++20 -fmodule-file=iostream.pcm main.cpp
1514
1515How to produce BMIs
1516~~~~~~~~~~~~~~~~~~~
1517
1518Similar to named modules, ``--precompile`` can be used to produce a BMI.
1519However, that requires specifying that the input file is a header by using
1520``-xc++-system-header`` or ``-xc++-user-header``.
1521
The ``-fmodule-header={user,system}`` option can also be used to produce a BMI
for header units which have a file extension like ``.h`` or ``.hh``. The argument
to ``-fmodule-header`` specifies either the user search path or the system search
path. The default value for ``-fmodule-header`` is ``user``. For example:
1526
1527.. code-block:: c++
1528
1529  // foo.h
1530  #include <iostream>
1531  void Hello() {
1532    std::cout << "Hello World.\n";
1533  }
1534
1535  // use.cpp
1536  import "foo.h";
1537  int main() {
1538    Hello();
1539  }
1540
1541could be compiled with:
1542
1543.. code-block:: console
1544
1545  $ clang++ -std=c++20 -fmodule-header foo.h -o foo.pcm
1546  $ clang++ -std=c++20 -fmodule-file=foo.pcm use.cpp
1547
1548For headers which do not have a file extension, ``-xc++-header`` (or
1549``-xc++-system-header``, ``-xc++-user-header``) must be used to specify the
1550file as a header. For example:
1551
1552.. code-block:: c++
1553
1554  // use.cpp
  import <iostream>;
  int main() {
    std::cout << "Hello World.\n";
1558  }
1559
1560.. code-block:: console
1561
1562  $ clang++ -std=c++20 -fmodule-header=system -xc++-header iostream -o iostream.pcm
1563  $ clang++ -std=c++20 -fmodule-file=iostream.pcm use.cpp
1564
1565How to specify dependent BMIs
1566~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1567
1568``-fmodule-file`` can be used to specify a dependent BMI (or multiple times for
1569more than one dependent BMI).
1570
1571With the existing implementation, ``-fprebuilt-module-path`` cannot be used for
1572header units (because they are nominally anonymous). For header units, use
1573``-fmodule-file`` to include the relevant PCM file for each header unit.
1574
This is expected to be solved in a future version of Clang either by the compiler
finding and specifying ``-fmodule-file`` automatically, or by the use of a
module-mapper that understands how to map header names to their PCMs.
1578
1579Compiling a header unit to an object file
1580~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1581
1582A header unit cannot be compiled to an object file due to the semantics of
1583header units. For example:
1584
1585.. code-block:: console
1586
1587  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
1588  # This is not allowed!
1589  $ clang++ iostream.pcm -c -o iostream.o
1590
1591Include translation
1592~~~~~~~~~~~~~~~~~~~
1593
1594The C++ standard allows vendors to convert ``#include header-name`` to
1595``import header-name;`` when possible. Currently, Clang does this translation
1596for the ``#include`` in the global module fragment. For example, the following
1597example:
1598
1599.. code-block:: c++
1600
1601  module;
1602  import <iostream>;
1603  export module M;
1604  export void Hello() {
1605    std::cout << "Hello.\n";
1606  }
1607
1608is the same as this example:
1609
1610.. code-block:: c++
1611
1612  module;
1613  #include <iostream>
1614  export module M;
1615  export void Hello() {
1616      std::cout << "Hello.\n";
1617  }
1618
1619.. code-block:: console
1620
1621  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm --precompile M.cppm -o M.pcm
1623
1624In the latter example, Clang can find the BMI for ``<iostream>`` and so it
1625tries to replace the ``#include <iostream>`` with ``import <iostream>;``
1626automatically.
1627
1628
1629Differences between Clang modules and header units
1630--------------------------------------------------
1631
1632Header units have similar semantics to Clang modules. The semantics of both are
1633like headers. Therefore, header units can be mimicked by Clang modules as in
1634the following example:
1635
1636.. code-block:: c++
1637
1638  module "iostream" {
1639    export *
1640    header "/path/to/libstdcxx/iostream"
1641  }
1642
1643.. code-block:: console
1644
1645  $ clang++ -std=c++20 -fimplicit-modules -fmodule-map-file=.modulemap main.cpp
1646
1647This example is simplified when using libc++:
1648
1649.. code-block:: console
1650
1651  $ clang++ -std=c++20 main.cpp -fimplicit-modules -fimplicit-module-maps
1652
1653because libc++ already supplies a
1654`module map <https://github.com/llvm/llvm-project/blob/main/libcxx/include/module.modulemap.in>`_.
1655
1656This raises the question: why are header units not implemented through Clang
1657modules?
1658
1659This is primarily because Clang modules have more hierarchical semantics when
1660wrapping multiple headers together as one module, which is not supported by
1661Standard C++ Header units. We want to avoid the impression that these
1662additional semantics get interpreted as Standard C++ behavior.
1663
1664Another reason is that there are proposals to introduce module mappers to the
1665C++ standard (for example, https://wg21.link/p1184r2). Reusing Clang's
1666``modulemap`` may be more difficult if we need to introduce another module
1667mapper.
1668
1669Discovering Dependencies
1670========================
1671
1672Without use of modules, all the translation units in a project can be compiled
1673in parallel. However, the presence of module units requires compiling the
1674translation units in a topological order.
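
To illustrate the ordering constraint, the following sketch computes one valid
build order from a map of "unit to the units it requires" using Kahn's
algorithm. It is not how Clang or any build system is implemented; the names
``DepGraph`` and ``build_order`` and the data layout are assumptions made for
this example. Units that are required but not listed as keys (for example,
third-party modules) are treated as having no requirements of their own.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <queue>
  #include <string>
  #include <vector>

  // Maps each unit (a label chosen by the caller) to the units it requires.
  using DepGraph = std::map<std::string, std::vector<std::string>>;

  // Hypothetical helper: returns the units in an order in which every unit
  // appears after all of the units it requires. Returns an empty vector if
  // the requirements are cyclic.
  inline std::vector<std::string> build_order(const DepGraph &requires_of) {
    std::map<std::string, int> pending;                    // unmet requirements
    std::map<std::string, std::vector<std::string>> users; // reverse edges
    for (const auto &[unit, reqs] : requires_of) {
      pending[unit] += 0; // make sure the unit is known
      for (const auto &req : reqs) {
        pending[req] += 0;
        ++pending[unit];
        users[req].push_back(unit);
      }
    }

    std::queue<std::string> ready;
    for (const auto &[unit, count] : pending)
      if (count == 0)
        ready.push(unit);

    std::vector<std::string> order;
    while (!ready.empty()) {
      std::string unit = ready.front();
      ready.pop();
      order.push_back(unit);
      for (const auto &user : users[unit]) {
        assert(pending[user] > 0 && "requirement released twice");
        if (--pending[user] == 0)
          ready.push(user);
      }
    }
    if (order.size() != pending.size())
      order.clear(); // some units were never released: cycle detected
    return order;
  }

Units that become ready at the same time do not depend on each other, so a
build system is free to compile them in parallel.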
1675
1676The ``clang-scan-deps`` tool can extract dependency information and produce a
1677JSON file conforming to the specification described in
1678`P1689 <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html>`_.
1679Only named modules are supported currently.
1680
1681A compilation database is needed when using ``clang-scan-deps``. See
1682`JSON Compilation Database Format Specification <JSONCompilationDatabase.html>`_
1683for more information about compilation databases. Note that the ``output``
1684JSON attribute is necessary for ``clang-scan-deps`` to scan using the P1689
1685format. For example:
1686
1687.. code-block:: c++
1688
1689  //--- M.cppm
1690  export module M;
1691  export import :interface_part;
1692  import :impl_part;
  export void Hello();
1694
1695  //--- interface_part.cppm
1696  export module M:interface_part;
1697  export void World();
1698
1699  //--- Impl.cpp
1700  module;
1701  #include <iostream>
1702  module M;
1703  void Hello() {
1704      std::cout << "Hello ";
1705  }
1706
1707  //--- impl_part.cppm
1708  module;
1709  #include <string>
1710  #include <iostream>
1711  module M:impl_part;
1712  import :interface_part;
1713
1714  std::string W = "World.";
1715  void World() {
1716      std::cout << W << std::endl;
1717  }
1718
1719  //--- User.cpp
1720  import M;
1721  import third_party_module;
1722  int main() {
1723    Hello();
1724    World();
1725    return 0;
1726  }
1727
And here is the compilation database, which the command below expects to be
saved as ``P1689.json``:

.. code-block:: text

  [
  {
      "directory": ".",
      "command": "<path-to-compiler-executable>/clang++ -std=c++20 M.cppm -c -o M.o",
      "file": "M.cppm",
      "output": "M.o"
  },
  {
      "directory": ".",
      "command": "<path-to-compiler-executable>/clang++ -std=c++20 Impl.cpp -c -o Impl.o",
      "file": "Impl.cpp",
      "output": "Impl.o"
  },
  {
      "directory": ".",
      "command": "<path-to-compiler-executable>/clang++ -std=c++20 impl_part.cppm -c -o impl_part.o",
      "file": "impl_part.cppm",
      "output": "impl_part.o"
  },
  {
      "directory": ".",
      "command": "<path-to-compiler-executable>/clang++ -std=c++20 interface_part.cppm -c -o interface_part.o",
      "file": "interface_part.cppm",
      "output": "interface_part.o"
  },
  {
      "directory": ".",
      "command": "<path-to-compiler-executable>/clang++ -std=c++20 User.cpp -c -o User.o",
      "file": "User.cpp",
      "output": "User.o"
  }
  ]

To get the dependency information in P1689 format, use:

.. code-block:: console

  $ clang-scan-deps -format=p1689 -compilation-database P1689.json

to get:

.. code-block:: text

  {
    "revision": 0,
    "rules": [
      {
        "primary-output": "Impl.o",
        "requires": [
          {
            "logical-name": "M",
            "source-path": "M.cppm"
          }
        ]
      },
      {
        "primary-output": "M.o",
        "provides": [
          {
            "is-interface": true,
            "logical-name": "M",
            "source-path": "M.cppm"
          }
        ],
        "requires": [
          {
            "logical-name": "M:interface_part",
            "source-path": "interface_part.cppm"
          },
          {
            "logical-name": "M:impl_part",
            "source-path": "impl_part.cppm"
          }
        ]
      },
      {
        "primary-output": "User.o",
        "requires": [
          {
            "logical-name": "M",
            "source-path": "M.cppm"
          },
          {
            "logical-name": "third_party_module"
          }
        ]
      },
      {
        "primary-output": "impl_part.o",
        "provides": [
          {
            "is-interface": false,
            "logical-name": "M:impl_part",
            "source-path": "impl_part.cppm"
          }
        ],
        "requires": [
          {
            "logical-name": "M:interface_part",
            "source-path": "interface_part.cppm"
          }
        ]
      },
      {
        "primary-output": "interface_part.o",
        "provides": [
          {
            "is-interface": true,
            "logical-name": "M:interface_part",
            "source-path": "interface_part.cppm"
          }
        ]
      }
    ],
    "version": 1
  }

See the P1689 paper for the meaning of the fields.
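
A build system can combine the ``provides`` and ``requires`` entries to derive
a valid compilation order. The following is a minimal sketch of that idea using
Kahn's algorithm; it is not part of ``clang-scan-deps``. The names ``Rule``,
``buildOrder``, and ``exampleRules`` are hypothetical, the rules are written
out by hand instead of being parsed from JSON, each rule is treated as a single
build step, and a required module that no rule provides (such as
``third_party_module``) is assumed to be built elsewhere.

.. code-block:: c++

  #include <cstddef>
  #include <map>
  #include <queue>
  #include <stdexcept>
  #include <string>
  #include <vector>

  // One entry of the "rules" array, reduced to the fields used here.
  struct Rule {
    std::string primary_output;         // "primary-output"
    std::vector<std::string> provides;  // "logical-name" of provided modules
    std::vector<std::string> required;  // "logical-name" of required modules
  };

  // Returns the primary outputs ordered so that the rule providing a module
  // comes before every rule requiring it. Throws on a dependency cycle.
  std::vector<std::string> buildOrder(const std::vector<Rule> &rules) {
    // Map each logical module name to the index of the rule providing it.
    std::map<std::string, std::size_t> provider;
    for (std::size_t i = 0; i < rules.size(); ++i)
      for (const std::string &name : rules[i].provides)
        provider[name] = i;

    // users[p] lists the rules that must wait for rule p;
    // pending[i] counts the providers rule i is still waiting for.
    std::vector<std::vector<std::size_t>> users(rules.size());
    std::vector<std::size_t> pending(rules.size(), 0);
    for (std::size_t i = 0; i < rules.size(); ++i) {
      for (const std::string &name : rules[i].required) {
        auto it = provider.find(name);
        if (it == provider.end())
          continue;  // Assumption: provided outside of this rule set.
        users[it->second].push_back(i);
        ++pending[i];
      }
    }

    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < rules.size(); ++i)
      if (pending[i] == 0)
        ready.push(i);

    std::vector<std::string> order;
    while (!ready.empty()) {
      std::size_t i = ready.front();
      ready.pop();
      order.push_back(rules[i].primary_output);
      for (std::size_t user : users[i])
        if (--pending[user] == 0)
          ready.push(user);
    }

    if (order.size() != rules.size())
      throw std::runtime_error("cyclic module dependency");
    return order;
  }

  // The rules from the P1689 output above, transcribed by hand.
  std::vector<Rule> exampleRules() {
    return {
        {"Impl.o", {}, {"M"}},
        {"M.o", {"M"}, {"M:interface_part", "M:impl_part"}},
        {"User.o", {}, {"M", "third_party_module"}},
        {"impl_part.o", {"M:impl_part"}, {"M:interface_part"}},
        {"interface_part.o", {"M:interface_part"}, {}},
    };
  }

For the rules above, ``buildOrder(exampleRules())`` returns
``interface_part.o``, ``impl_part.o``, ``M.o``, ``Impl.o``, ``User.o``. Other
orders are equally valid; for example, ``Impl.o`` and ``User.o`` depend only on
``M`` and could be built in parallel.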

Getting dependency information per file with finer-grained control (such as
scanning generated source files) is possible. For example:

.. code-block:: console

  $ clang-scan-deps -format=p1689 -- <path-to-compiler-executable>/clang++ -std=c++20 impl_part.cppm -c -o impl_part.o

will produce:

.. code-block:: text

  {
    "revision": 0,
    "rules": [
      {
        "primary-output": "impl_part.o",
        "provides": [
          {
            "is-interface": false,
            "logical-name": "M:impl_part",
            "source-path": "impl_part.cppm"
          }
        ],
        "requires": [
          {
            "logical-name": "M:interface_part"
          }
        ]
      }
    ],
    "version": 1
  }

Individual command line options can be specified after ``--``.
``clang-scan-deps`` will extract the necessary information from the specified
options. Note that the path to the compiler executable needs to be specified
explicitly instead of using ``clang++`` directly.

Users may want the scanner to get the transitive dependency information for
headers. Otherwise, the project has to be scanned twice, once for headers and
once for modules. To address this, ``clang-scan-deps`` will recognize the
specified preprocessor options in the given command line and generate the
corresponding dependency information. For example:

.. code-block:: console

  $ clang-scan-deps -format=p1689 -- ../bin/clang++ -std=c++20 impl_part.cppm -c -o impl_part.o -MD -MT impl_part.ddi -MF impl_part.dep
  $ cat impl_part.dep

will produce:

.. code-block:: text

  impl_part.ddi: \
    /usr/include/bits/wchar.h /usr/include/bits/types/wint_t.h \
    /usr/include/bits/types/mbstate_t.h \
    /usr/include/bits/types/__mbstate_t.h /usr/include/bits/types/__FILE.h \
    /usr/include/bits/types/FILE.h /usr/include/bits/types/locale_t.h \
    /usr/include/bits/types/__locale_t.h \
    ...

When ``clang-scan-deps`` detects the ``-MF`` option, it will try to write the
dependency information for headers to the file specified by ``-MF``.

Possible Issues: Failed to find system headers
----------------------------------------------

If encountering an error like ``fatal error: 'stddef.h' file not found``,
the specified ``<path-to-compiler-executable>/clang++`` probably refers to a
symlink instead of a real binary. There are four potential solutions to the
problem:

1. Point the specified compiler executable to the real binary instead of the
   symlink.
2. Invoke ``<path-to-compiler-executable>/clang++ -print-resource-dir`` to get
   the corresponding resource directory for your compiler and add that
   directory to the include search paths manually in the build scripts.
3. For build systems that use a compilation database as the input for
   ``clang-scan-deps``, the build system can add the
   ``--resource-dir-recipe invoke-compiler`` option when executing
   ``clang-scan-deps`` to calculate the resource directory dynamically.
   The calculation happens only once for each unique
   ``<path-to-compiler-executable>/clang++``.
4. For build systems that invoke ``clang-scan-deps`` per file, repeatedly
   calculating the resource directory may be inefficient. In such cases, the
   build system can cache the resource directory and specify
   ``-resource-dir <resource-dir>`` explicitly, as in:

   .. code-block:: console

     $ clang-scan-deps -format=p1689 -- <path-to-compiler-executable>/clang++ -std=c++20 -resource-dir <resource-dir> mod.cppm -c -o mod.o


Import modules with clang-repl
==============================

``clang-repl`` supports importing C++20 named modules. For example:

.. code-block:: c++

  // M.cppm
  export module M;
  export const char* Hello() {
      return "Hello Interpreter for Modules!";
  }

The named module still needs to be compiled ahead of time.

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ clang++ M.pcm -c -o M.o
  $ clang++ -shared M.o -o libM.so

Note that the module unit needs to be compiled as a dynamic library so that
``clang-repl`` can load the object files of the module units. Then it is
possible to import module ``M`` in clang-repl.

.. code-block:: console

  $ clang-repl -Xcc=-std=c++20 -Xcc=-fprebuilt-module-path=.
  # We need to load the dynamic library first before importing the modules.
  clang-repl> %lib libM.so
  clang-repl> import M;
  clang-repl> extern "C" int printf(const char *, ...);
  clang-repl> printf("%s\n", Hello());
  Hello Interpreter for Modules!
  clang-repl> %quit

Possible Questions
==================

How modules speed up compilation
--------------------------------

The classic argument for why modules speed up compilation is as follows: if
there are ``n`` headers and ``m`` source files and each header is included by
each source file, then the complexity of the compilation is ``O(n*m)``.
However, if there are ``n`` module interfaces and ``m`` source files, the
complexity of the compilation is ``O(n+m)``. Therefore, using modules would be
a significant improvement at scale. More simply, modules make many of the
redundant compilations unnecessary.
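
To make the difference concrete, consider illustrative numbers that are
assumptions rather than measurements: ``n = 100`` headers (or module
interfaces) and ``m = 1000`` source files, counting each time a file is
processed as one unit of work.

.. math::

  \text{headers: } n \cdot m = 100 \cdot 1000 = 100{,}000
  \qquad
  \text{modules: } n + m = 100 + 1000 = 1{,}100

Under this idealized model, the modules build performs roughly 90 times fewer
processing steps. Real projects rarely include every header in every source
file, so the practical gain is likely to be smaller.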

While this is accurate at a high level, the actual benefit depends greatly on
the optimization level, as illustrated below.

First, consider ``-O0``. The compilation process is described in the following
diagram.

.. code-block:: none

  ├-------------frontend----------┼-------------middle end----------------┼----backend----┤
  │                               │                                       │               │
  └---parsing----sema----codegen--┴----- transformations ---- codegen ----┴---- codegen --┘

  ├---------------------------------------------------------------------------------------┐
  │                                                                                       │
  │                                     source file                                       │
  │                                                                                       │
  └---------------------------------------------------------------------------------------┘

              ├--------┐
              │        │
              │imported│
              │        │
              │  code  │
              │        │
              └--------┘

In this case, the source file (which could be a non-module unit or a module
unit) is processed by the entire pipeline. However, the imported code is only
involved in semantic analysis, which, for the most part, is name lookup,
overload resolution, and template instantiation. All of these processes are
fast relative to the whole compilation process. More importantly, the imported
code goes through frontend code generation, the middle end, and the backend
only once, rather than once for every source file that imports it. So we can
get a big win in compilation time at ``-O0``.

But with optimizations, things are different (the ``code generation`` part for
each end is omitted due to limited space):

.. code-block:: none

  ├-------- frontend ---------┼--------------- middle end --------------------┼------ backend ----┤
  │                           │                                               │                   │
  └--- parsing ---- sema -----┴--- optimizations --- IPO ---- optimizations---┴--- optimizations -┘

  ├-----------------------------------------------------------------------------------------------┐
  │                                                                                               │
  │                                         source file                                           │
  │                                                                                               │
  └-----------------------------------------------------------------------------------------------┘
                ├---------------------------------------┐
                │                                       │
                │                                       │
                │            imported code              │
                │                                       │
                │                                       │
                └---------------------------------------┘

It would be very unfortunate to end up with worse performance when using
modules. The main concern is that when a source file is compiled, the compiler
needs to see the body of imported module units so that it can perform IPO
(interprocedural optimization, primarily inlining in practice) to optimize
functions in the current source file with the help of the information provided
by the imported module units. In other words, the imported code is processed
again and again in importing units by optimizations (including IPO itself).
The optimizations before IPO and IPO itself are the most time-consuming parts
of the whole compilation process. So from this perspective, it might not be
possible to get the compile-time improvements described above, but there could
still be time savings for optimizations after IPO and for the whole backend.

Overall, at ``-O0`` the implementations of functions defined in a module will
not impact module users, but at higher optimization levels the definitions of
such functions are provided to user compilations for the purposes of
optimization (although definitions of these functions are still not included in
the user's object file). This means the build speedup at higher optimization
levels may be lower than expected given ``-O0`` experience, but it does provide
more optimization opportunities.

Interoperability with Clang Modules
-----------------------------------

We **wish** to support Clang modules and standard C++ modules at the same time,
but mixing them together is not yet well used or tested. Please file new
GitHub issues as you find interoperability problems.
