====================
Standard C++ Modules
====================

.. contents::
   :local:

Introduction
============

The term ``modules`` has many meanings. For users of Clang, modules may
refer to ``Objective-C Modules``, ``Clang C++ Modules`` (or ``Clang Header Modules``,
etc.) or ``Standard C++ Modules``. The implementations of all these kinds of modules in Clang
share a lot of code, but from the user's perspective, their semantics and
command line interfaces are very different. This document focuses on
how to use standard C++ modules in Clang.

There is already a detailed document about `Clang modules <Modules.html>`_;
it is helpful reading if you want to know
more about the general idea of modules. Since standard C++ modules have different semantics
(and workflows) from ``Clang modules``, this page describes the background and use of
Clang with standard C++ modules.

Modules exist in two forms in the C++ Language Specification. They can refer to
either "Named Modules" or to "Header Units". This document covers both forms.

Standard C++ Named modules
==========================

This document is intended to be a manual first and foremost; however, we consider it helpful to
introduce some language background here for readers who are not familiar with
the new language feature. This document is not intended to be a language
tutorial; it only introduces the concepts necessary to understand the
structure and building of a project.

Background and terminology
--------------------------

Modules
~~~~~~~

In this document, the term ``Modules``/``modules`` refers to the standard C++ modules
feature unless it is qualified by ``Clang``.

Clang Modules
~~~~~~~~~~~~~

In this document, the term ``Clang Modules``/``Clang modules`` refers to the Clang
C++ modules extension. These are also known as ``Clang header modules``,
``Clang module map modules`` or ``Clang c++ modules``.

Module and module unit
~~~~~~~~~~~~~~~~~~~~~~

A module consists of one or more module units. A module unit is a special
translation unit. Every module unit must have a module declaration. The syntax
of the module declaration is:

.. code-block:: c++

  [export] module module_name[:partition_name];

Terms enclosed in ``[]`` are optional. The syntax of ``module_name`` and ``partition_name``
in regex form corresponds to ``[a-zA-Z_][a-zA-Z_0-9\.]*``. In particular, a literal dot ``.``
in the name has no semantic meaning (e.g. implying a hierarchy).

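The name syntax above can be checked mechanically. The following is a minimal sketch,
not part of Clang: the helper name ``isWellFormedName`` is hypothetical, and it only
applies the regular expression quoted above to a single ``module_name`` or ``partition_name``.

.. code-block:: c++

  #include <regex>
  #include <string>

  // Hypothetical helper (not a Clang API): returns true if `name` matches the
  // regular expression quoted in this section for module and partition names.
  bool isWellFormedName(const std::string &name) {
    static const std::regex pattern(R"([a-zA-Z_][a-zA-Z_0-9\.]*)");
    // std::regex_match requires the whole string to match, not just a prefix.
    return std::regex_match(name, pattern);
  }

With this sketch, ``isWellFormedName("a.b")`` is true, while ``isWellFormedName("1abc")`` is false
because a name may not start with a digit.
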
In this document, module units are classified into:

* Primary module interface unit.

* Module implementation unit.

* Module interface partition unit.

* Internal module partition unit.

A primary module interface unit is a module unit whose module declaration is
``export module module_name;``. The ``module_name`` here denotes the name of the
module. A module must have exactly one primary module interface unit.

A module implementation unit is a module unit whose module declaration is
``module module_name;``. A module can have multiple module implementation
units with the same declaration.

A module interface partition unit is a module unit whose module declaration is
``export module module_name:partition_name;``. The ``partition_name`` must be
unique within any given module.

An internal module partition unit is a module unit whose module declaration
is ``module module_name:partition_name;``. The ``partition_name`` must be
unique within any given module.

In this document, we use the following umbrella terms:

* A ``module interface unit`` refers to either a ``primary module interface unit``
  or a ``module interface partition unit``.

* An ``importable module unit`` refers to either a ``module interface unit``
  or an ``internal module partition unit``.

* A ``module partition unit`` refers to either a ``module interface partition unit``
  or an ``internal module partition unit``.

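The classification above depends only on two properties of the module declaration: whether it
starts with ``export`` and whether it names a partition. The sketch below restates that as code.
All names in it (``ModuleUnitKind``, ``classify`` and the predicates) are hypothetical and exist
only for illustration; they are not Clang APIs.

.. code-block:: c++

  // The four kinds of module units used in this document.
  enum class ModuleUnitKind {
    PrimaryInterface,   // export module M;
    Implementation,     // module M;
    InterfacePartition, // export module M:P;
    InternalPartition,  // module M:P;
  };

  // Classify a module declaration by its two distinguishing properties.
  constexpr ModuleUnitKind classify(bool hasExport, bool hasPartition) {
    if (hasExport)
      return hasPartition ? ModuleUnitKind::InterfacePartition
                          : ModuleUnitKind::PrimaryInterface;
    return hasPartition ? ModuleUnitKind::InternalPartition
                        : ModuleUnitKind::Implementation;
  }

  // Umbrella term: module interface unit.
  constexpr bool isModuleInterfaceUnit(ModuleUnitKind kind) {
    return kind == ModuleUnitKind::PrimaryInterface ||
           kind == ModuleUnitKind::InterfacePartition;
  }

  // Umbrella term: importable module unit (everything except an
  // implementation unit).
  constexpr bool isImportableModuleUnit(ModuleUnitKind kind) {
    return kind != ModuleUnitKind::Implementation;
  }

  // Umbrella term: module partition unit.
  constexpr bool isModulePartitionUnit(ModuleUnitKind kind) {
    return kind == ModuleUnitKind::InterfacePartition ||
           kind == ModuleUnitKind::InternalPartition;
  }

For example, ``module M:impl_part;`` corresponds to ``classify(false, true)``, which is an
internal module partition unit: importable, but not a module interface unit.
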
Built Module Interface file
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A ``Built Module Interface file`` is the precompiled result of an importable module unit.
It is generally abbreviated as ``BMI``.

Global module fragment
~~~~~~~~~~~~~~~~~~~~~~

In a module unit, the section from ``module;`` to the module declaration is called the global module fragment.


How to build projects using modules
-----------------------------------

Quick Start
~~~~~~~~~~~

Let's see a "hello world" example that uses modules.

.. code-block:: c++

  // Hello.cppm
  module;
  #include <iostream>
  export module Hello;
  export void hello() {
    std::cout << "Hello World!\n";
  }

  // use.cpp
  import Hello;
  int main() {
    hello();
    return 0;
  }

Then we type:

.. code-block:: console

  $ clang++ -std=c++20 Hello.cppm --precompile -o Hello.pcm
  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
  $ ./Hello.out
  Hello World!

In this example, we make and use a simple module ``Hello`` which contains only a
primary module interface unit ``Hello.cppm``.

Next, let's look at a slightly more complex "hello world" example that uses all four kinds of module units.

.. code-block:: c++

  // M.cppm
  export module M;
  export import :interface_part;
  import :impl_part;
  export void Hello();

  // interface_part.cppm
  export module M:interface_part;
  export void World();

  // impl_part.cppm
  module;
  #include <iostream>
  #include <string>
  module M:impl_part;
  import :interface_part;

  std::string W = "World.";
  void World() {
    std::cout << W << std::endl;
  }

  // Impl.cpp
  module;
  #include <iostream>
  module M;
  void Hello() {
    std::cout << "Hello ";
  }

  // User.cpp
  import M;
  int main() {
    Hello();
    World();
    return 0;
  }

We can then compile the example with the following commands:

.. code-block:: console

  # Precompiling the module
  $ clang++ -std=c++20 interface_part.cppm --precompile -o M-interface_part.pcm
  $ clang++ -std=c++20 impl_part.cppm --precompile -fprebuilt-module-path=. -o M-impl_part.pcm
  $ clang++ -std=c++20 M.cppm --precompile -fprebuilt-module-path=. -o M.pcm
  $ clang++ -std=c++20 Impl.cpp -fmodule-file=M.pcm -c -o Impl.o

  # Compiling the user
  $ clang++ -std=c++20 User.cpp -fprebuilt-module-path=. -c -o User.o

  # Compiling the module and linking it together
  $ clang++ -std=c++20 M-interface_part.pcm -c -o M-interface_part.o
  $ clang++ -std=c++20 M-impl_part.pcm -c -o M-impl_part.o
  $ clang++ -std=c++20 M.pcm -c -o M.o
  $ clang++ User.o M-interface_part.o M-impl_part.o M.o Impl.o -o a.out

The options are explained in the following sections.

How to enable standard C++ modules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, standard C++ modules are enabled automatically
if the language standard is ``-std=c++20`` or newer.
The ``-fmodules-ts`` option is deprecated and is planned to be removed.

How to produce a BMI
~~~~~~~~~~~~~~~~~~~~

We can generate a BMI for an importable module unit with either the ``--precompile``
or the ``-fmodule-output`` option.

The ``--precompile`` option generates the BMI as the output of the compilation; the output path
can be specified using the ``-o`` option.

The ``-fmodule-output`` option generates the BMI as a by-product of the compilation.
If ``-fmodule-output=`` is specified, the BMI is emitted to the specified location. Otherwise, if
``-fmodule-output`` and ``-c`` are specified, the BMI is emitted in the directory of the
output file, named after the input file with the extension replaced by ``.pcm``. Otherwise, the BMI
is emitted in the working directory, named after the input file with the extension replaced by
``.pcm``.

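The three cases for ``-fmodule-output`` can be read as a small decision procedure. The sketch below
models only the rule as stated in the paragraph above; it is not Clang's implementation, and the
function name ``bmiOutputPath`` and its parameters are hypothetical. It assumes C++17 for
``std::filesystem`` and ``std::optional``.

.. code-block:: c++

  #include <filesystem>
  #include <optional>

  namespace fs = std::filesystem;

  // Hypothetical model of where the BMI is emitted with -fmodule-output.
  //   input        : the module unit being compiled, e.g. "src/Hello.cppm"
  //   explicitBmi  : the value of -fmodule-output=<path>, if it was given
  //   objectOutput : the path of the object file when -c is used, if any
  fs::path bmiOutputPath(const fs::path &input,
                         const std::optional<fs::path> &explicitBmi,
                         const std::optional<fs::path> &objectOutput) {
    // Case 1: an explicit location always wins.
    if (explicitBmi)
      return *explicitBmi;

    // Both remaining cases use the input file name with a .pcm extension.
    fs::path name = input.filename();
    name.replace_extension(".pcm");

    // Case 2: next to the object file.
    if (objectOutput)
      return objectOutput->parent_path() / name;

    // Case 3: in the working directory (returned as a relative path).
    return name;
  }

For instance, under these assumptions compiling ``src/Hello.cppm`` with ``-c -o build/Hello.o`` and a
bare ``-fmodule-output`` gives ``build/Hello.pcm``.
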
Generating BMIs with ``--precompile`` is called two-phase compilation, since it takes
two steps to compile a source file to an object file. Generating BMIs with ``-fmodule-output``
is correspondingly called one-phase compilation. The one-phase compilation model is simpler
for build systems to implement, whereas the two-phase compilation model has the potential to compile faster due
to higher parallelism. As an example, if there are two module units A and B, and B depends on A, the
one-phase compilation model needs to compile them serially, whereas the two-phase compilation
model may be able to compile B while ``A.pcm`` is still being compiled to ``A.o``, which helps if that
step takes a long time.

File name requirement
~~~~~~~~~~~~~~~~~~~~~

The file name of an ``importable module unit`` should end with ``.cppm``
(or ``.ccm``, ``.cxxm``, ``.c++m``). The file name of a ``module implementation unit``
should end with ``.cpp`` (or ``.cc``, ``.cxx``, ``.c++``).

The file name of a BMI should end with ``.pcm``.
The file name of the BMI of a ``primary module interface unit`` should be ``module_name.pcm``.
The file name of the BMI of a ``module partition unit`` should be ``module_name-partition_name.pcm``.

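The BMI naming convention is simple enough to write down as a function. This is a minimal sketch
with a hypothetical helper name, ``bmiFileName``; an empty ``partitionName`` stands for the primary
module interface unit.

.. code-block:: c++

  #include <cassert>
  #include <string>

  // Hypothetical helper: the BMI file name expected by the convention above.
  std::string bmiFileName(const std::string &moduleName,
                          const std::string &partitionName = "") {
    assert(!moduleName.empty() && "a module name is required");
    if (partitionName.empty())
      return moduleName + ".pcm";                     // module_name.pcm
    return moduleName + "-" + partitionName + ".pcm"; // module_name-partition_name.pcm
  }

For the Quick Start example, ``bmiFileName("M", "interface_part")`` yields ``M-interface_part.pcm``,
which matches the ``-o`` argument used there.
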
If the file names use different extensions, Clang may fail to build the module.
For example, if the file name of an ``importable module unit`` ends with ``.cpp`` instead of ``.cppm``,
then we can't generate a BMI for the ``importable module unit`` with the ``--precompile`` option,
because ``--precompile`` then only runs the preprocessor, which is equivalent to ``-E``.
If we want the file name of an ``importable module unit`` to end with a suffix other than ``.cppm``,
we can put ``-x c++-module`` in front of the file. For example,

.. code-block:: c++

  // Hello.cpp
  module;
  #include <iostream>
  export module Hello;
  export void hello() {
    std::cout << "Hello World!\n";
  }

  // use.cpp
  import Hello;
  int main() {
    hello();
    return 0;
  }

Now that the file name of the ``module interface`` ends with ``.cpp`` instead of ``.cppm``,
we can't compile it with the original command lines. But we can still compile it with:

.. code-block:: console

  $ clang++ -std=c++20 -x c++-module Hello.cpp --precompile -o Hello.pcm
  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
  $ ./Hello.out
  Hello World!

Module name requirement
~~~~~~~~~~~~~~~~~~~~~~~

[module.unit]p1 says:

.. code-block:: text

  All module-names either beginning with an identifier consisting of std followed by zero
  or more digits or containing a reserved identifier ([lex.name]) are reserved and shall not
  be specified in a module-declaration; no diagnostic is required. If any identifier in a reserved
  module-name is a reserved identifier, the module name is reserved for use by C++ implementations;
  otherwise it is reserved for future standardization.

So none of the following names is valid by default:

.. code-block:: text

    std
    std1
    std.foo
    __test
    // and so on ...

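The quoted rule can be approximated in code. The sketch below is a simplified reading, not Clang's
actual check: it splits the module name on dots and treats an identifier as reserved if it contains
``__`` or starts with an underscore followed by an uppercase letter. That treatment of [lex.name] is
an assumption of this sketch, and all function names are hypothetical.

.. code-block:: c++

  #include <cctype>
  #include <string>
  #include <vector>

  // Split a module name such as "std.foo" into its dot-separated identifiers.
  std::vector<std::string> splitOnDots(const std::string &name) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
      std::size_t dot = name.find('.', start);
      parts.push_back(name.substr(start, dot - start));
      if (dot == std::string::npos)
        break;
      start = dot + 1;
    }
    return parts;
  }

  // True for "std", "std1", "std23", ... but not for "stdx".
  bool isStdFollowedByDigits(const std::string &id) {
    if (id.compare(0, 3, "std") != 0)
      return false;
    for (std::size_t i = 3; i < id.size(); ++i)
      if (!std::isdigit(static_cast<unsigned char>(id[i])))
        return false;
    return true;
  }

  // Simplified (assumed) notion of a reserved identifier.
  bool isReservedIdentifier(const std::string &id) {
    if (id.find("__") != std::string::npos)
      return true;
    return id.size() >= 2 && id[0] == '_' &&
           std::isupper(static_cast<unsigned char>(id[1]));
  }

  // Approximation of the [module.unit]p1 rule quoted above.
  bool isReservedModuleName(const std::string &moduleName) {
    const std::vector<std::string> ids = splitOnDots(moduleName);
    if (isStdFollowedByDigits(ids.front()))
      return true;
    for (const std::string &id : ids)
      if (isReservedIdentifier(id))
        return true;
    return false;
  }

With this sketch, the four names listed above are all reported as reserved, while a name such as
``foo.std`` is not, because only the first identifier is compared against ``std``.
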
If you still want to use a reserved module name for any reason, you can currently add a special line marker
in front of the module declaration, like:

.. code-block:: c++

  # __LINE_NUMBER__ __FILE__ 1 3
  export module std;

Here ``__LINE_NUMBER__`` is the actual line number of the corresponding line, and ``__FILE__`` is the file name
of the translation unit. The ``1`` means that the following is a new file, and the ``3`` means that this is a
system header/file, so certain warnings should be suppressed. You can find more details at:
https://gcc.gnu.org/onlinedocs/gcc-3.0.2/cpp_9.html.

How to specify the dependent BMIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are 3 methods to specify the dependent BMIs:

* (1) ``-fprebuilt-module-path=<path/to/directory>``.
* (2) ``-fmodule-file=<path/to/BMI>``.
* (3) ``-fmodule-file=<module-name>=<path/to/BMI>``.

The option ``-fprebuilt-module-path`` tells the compiler where to search for dependent BMIs.
It may be used multiple times, just like ``-I`` for specifying paths for header files. The lookup rule is:

* (1) When we import module ``M``, the compiler looks up ``M.pcm`` in the directories specified
  by ``-fprebuilt-module-path``.
* (2) When we import module partition unit ``M:P``, the compiler looks up ``M-P.pcm`` in the
  directories specified by ``-fprebuilt-module-path``.

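The lookup rule above can be sketched as a directory search. This is an illustration only, not
Clang's implementation: the function name ``findPrebuiltBmi`` is hypothetical, and the sketch assumes
that directories are tried in the order given and that the first match wins, which the text above
does not state explicitly.

.. code-block:: c++

  #include <filesystem>
  #include <optional>
  #include <string>
  #include <vector>

  namespace fs = std::filesystem;

  // Hypothetical model of the -fprebuilt-module-path lookup.
  //   prebuiltModulePaths : one entry per -fprebuilt-module-path option
  //   moduleName          : e.g. "M"
  //   partitionName       : e.g. "P" for M:P, empty for the module itself
  std::optional<fs::path>
  findPrebuiltBmi(const std::vector<fs::path> &prebuiltModulePaths,
                  const std::string &moduleName,
                  const std::string &partitionName = "") {
    // M -> M.pcm, M:P -> M-P.pcm
    const std::string fileName =
        partitionName.empty() ? moduleName + ".pcm"
                              : moduleName + "-" + partitionName + ".pcm";

    // Assumption: search in command line order, first existing file wins.
    for (const fs::path &dir : prebuiltModulePaths) {
      fs::path candidate = dir / fileName;
      if (fs::exists(candidate))
        return candidate;
    }
    return std::nullopt; // no BMI found in any search directory
  }

A call such as ``findPrebuiltBmi({"."}, "M", "interface_part")`` corresponds to resolving
``import :interface_part;`` in the Quick Start example with ``-fprebuilt-module-path=.``.
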
The option ``-fmodule-file=<path/to/BMI>`` tells the compiler to load the specified BMI directly.
The option ``-fmodule-file=<module-name>=<path/to/BMI>`` tells the compiler to load the specified BMI
for the module specified by ``<module-name>`` when necessary. The main difference is that
``-fmodule-file=<path/to/BMI>`` loads the BMI eagerly, whereas
``-fmodule-file=<module-name>=<path/to/BMI>`` only loads the BMI lazily, which is similar
to ``-fprebuilt-module-path``.

If ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>`` and
``-fmodule-file=<module-name>=<path/to/BMI>`` are all present, the ``-fmodule-file=<path/to/BMI>`` option
takes the highest precedence and ``-fmodule-file=<module-name>=<path/to/BMI>`` takes the second
highest precedence.

When we compile a ``module implementation unit``, we must specify the BMI of the corresponding
``primary module interface unit``,
because the language specification says that a module implementation unit implicitly imports
the primary module interface unit.

  [module.unit]p8

  A module-declaration that contains neither an export-keyword nor a module-partition implicitly
  imports the primary module interface unit of the module as if by a module-import-declaration.

All three options ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>``
and ``-fmodule-file=<module-name>=<path/to/BMI>`` may occur multiple times.
For example, the command line to compile ``M.cppm`` in
the above example could be rewritten as either of:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M-interface_part.pcm -fmodule-file=M-impl_part.pcm -o M.pcm
  $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M:interface_part=M-interface_part.pcm -fmodule-file=M:impl_part=M-impl_part.pcm -o M.pcm

``-fprebuilt-module-path`` is more convenient and ``-fmodule-file`` is faster since
it saves time for file lookup.

Remember that module units still have an object counterpart to the BMI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is easy to forget to compile BMIs to object files at first, since we may think of module interfaces as being like headers.
However, they are not.
Module units are translation units. We need to compile them to object files
and link the object files, as the example shows.

For example, the traditional compilation process for headers is like:

.. code-block:: text

  src1.cpp -+> clang++ src1.cpp --> src1.o ---,
  hdr1.h  --'                                 +-> clang++ src1.o src2.o ->  executable
  hdr2.h  --,                                 |
  src2.cpp -+> clang++ src2.cpp --> src2.o ---'

And the compilation process for module units is like:

.. code-block:: text

                src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -,
  (header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o ->  executable
                mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
                src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'

As the diagrams show, we need to compile the BMIs of module units to object files and link the object files.
(But we can't do this for the BMIs of header units. See the later section for the definition of header units.)

If we want to create a module library, we can't just ship the BMIs in an archive.
We must compile these BMIs (``*.pcm``) into object files (``*.o``) and add those object files to the archive instead.

Consistency Requirement
~~~~~~~~~~~~~~~~~~~~~~~

If we envision modules as a cache to speed up compilation, then, as with other caching techniques,
it is important to keep the cache consistent.
So **currently** Clang performs very strict consistency checks.

Options consistency
^^^^^^^^^^^^^^^^^^^

The language options of module units and their non-module-unit users should be consistent.
The following example is not allowed:

.. code-block:: c++

  // M.cppm
  export module M;

  // Use.cpp
  import M;

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ clang++ -std=c++2b Use.cpp -fprebuilt-module-path=.

The compiler rejects the example due to the inconsistent language options.
Not all options are language options.
For example, the following is allowed:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  # Inconsistent optimization level.
  $ clang++ -std=c++20 -O3 Use.cpp -fprebuilt-module-path=.
  # Inconsistent debugging level.
  $ clang++ -std=c++20 -g Use.cpp -fprebuilt-module-path=.

Although the two commands use optimization and debugging levels that are inconsistent with the BMI, both of them are accepted.

Note that **currently** the compiler doesn't consider inconsistent macro definitions a problem. For example:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  # Inconsistent optimization level and macro definition.
  $ clang++ -std=c++20 -O3 -DNDEBUG Use.cpp -fprebuilt-module-path=.

Currently Clang accepts the above example. But it may produce surprising results if the
debugging code depends on ``NDEBUG`` being used consistently across translation units.

Source content consistency
^^^^^^^^^^^^^^^^^^^^^^^^^^

When the compiler reads a BMI, it checks the consistency of the corresponding
source files. For example:

.. code-block:: c++

  // M.cppm
  export module M;
  export template <class T>
  T foo(T t) {
    return t;
  }

  // Use.cpp
  import M;
  void bar() {
    foo(5);
  }

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ rm M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

The compiler rejects the example because it fails to find the source file needed to check consistency.
The following example is rejected too:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ echo "int i=0;" >> M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

The compiler rejects it because it detects that the file has changed.

But it is OK to move the BMI as long as the source files remain:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ mkdir -p tmp
  $ mv M.pcm tmp/M.pcm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=tmp/M.pcm

The above example is accepted.

If the user can't follow the consistency requirement for some reason (e.g., when distributing BMIs),
they can try ``-Xclang -fmodules-embed-all-files`` when producing the BMI. For example:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -Xclang -fmodules-embed-all-files -o M.pcm
  $ rm M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

Now the compiler accepts the above example.
Important note: ``-Xclang`` options are intended for internal use by the compiler, and their semantics
are not guaranteed to be preserved in future versions.

The compiler also records the paths of the header files included in the global module fragment and compares the
headers when the module is imported. For example,

.. code-block:: c++

  // foo.h
  #include <iostream>
  void Hello() {
    std::cout << "Hello World.\n";
  }

  // foo.cppm
  module;
  #include "foo.h"
  export module foo;
  export using ::Hello;

  // Use.cpp
  import foo;
  int main() {
    Hello();
  }

It is then problematic if we remove ``foo.h`` before importing the ``foo`` module.

.. code-block:: console

  $ clang++ -std=c++20 foo.cppm --precompile -o foo.pcm
  $ mv foo.h foo.orig.h
  # The following one is rejected
  $ clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c

The above case is rejected. We can still work around it with the ``-Xclang -fmodules-embed-all-files`` option:

.. code-block:: console

  $ clang++ -std=c++20 foo.cppm --precompile -Xclang -fmodules-embed-all-files -o foo.pcm
  $ mv foo.h foo.orig.h
  $ clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c -o Use.o
  $ clang++ Use.o foo.pcm

567*12c85518SrobertABI Impacts
568*12c85518Srobert-----------
569*12c85518Srobert
570*12c85518SrobertThe declarations in a module unit which are not in the global module fragment have new linkage names.
571*12c85518Srobert
572*12c85518SrobertFor example,
573*12c85518Srobert
574*12c85518Srobert.. code-block:: c++
575*12c85518Srobert
576*12c85518Srobert  export module M;
577*12c85518Srobert  namespace NS {
578*12c85518Srobert    export int foo();
579*12c85518Srobert  }
580*12c85518Srobert
581*12c85518SrobertThe linkage name of ``NS::foo()`` would be ``_ZN2NSW1M3fooEv``.
582*12c85518SrobertThis couldn't be demangled by previous versions of the debugger or demangler.
583*12c85518SrobertAs of LLVM 15.x, users can utilize ``llvm-cxxfilt`` to demangle this:
584*12c85518Srobert
585*12c85518Srobert.. code-block:: console
586*12c85518Srobert
587*12c85518Srobert  $ llvm-cxxfilt _ZN2NSW1M3fooEv
588*12c85518Srobert
589*12c85518SrobertThe result would be ``NS::foo@M()``, which reads as ``NS::foo()`` in module ``M``.
590*12c85518Srobert
591*12c85518SrobertThe ABI implies that we can't declare something in a module unit and define it in a non-module unit (or vice-versa),
592*12c85518Srobertas this would result in linking errors.
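
The mismatch can be made concrete with a small sketch. The helper below is purely illustrative and is not part of Clang:
``mangle_simple`` is a hypothetical function that reproduces only the shape of the name shown above
(one enclosing namespace, a single-component module name, and a function taking no parameters).
It shows that the module-attached name and the non-module name of the same declaration differ,
which is why the linker cannot match a declaration in one kind of unit with a definition in the other.

.. code-block:: c++

  #include <iostream>
  #include <string>

  // Hypothetical helper, for illustration only: builds an Itanium-style name
  // for `ns::fn()` with no parameters. If `module_name` is non-empty, the
  // function is treated as attached to that (single-component) module.
  std::string mangle_simple(const std::string &ns, const std::string &module_name,
                            const std::string &fn) {
    auto source_name = [](const std::string &s) {
      return std::to_string(s.size()) + s;  // <length><identifier>
    };
    std::string out = "_ZN" + source_name(ns);
    if (!module_name.empty())
      out += "W" + source_name(module_name);  // module attachment marker
    out += source_name(fn) + "E" + "v";       // "v": empty parameter list
    return out;
  }

  int main() {
    // Declared in module M: the name shown above.
    std::cout << mangle_simple("NS", "M", "foo") << "\n";
    // Defined in a non-module unit: a different symbol.
    std::cout << mangle_simple("NS", "", "foo") << "\n";
  }

The two printed names are different symbols, so a reference to one is not satisfied by a definition of the other.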
593*12c85518Srobert
594*12c85518SrobertKnown Problems
595*12c85518Srobert--------------
596*12c85518Srobert
597*12c85518SrobertThe following describes issues in the current implementation of modules.
598*12c85518SrobertPlease see https://github.com/llvm/llvm-project/labels/clang%3Amodules for more issues
599*12c85518Srobertor file a new issue if you don't find an existing one.
600*12c85518SrobertIf you're going to create a new issue for standard C++ modules,
601*12c85518Srobertplease start the title with ``[C++20] [Modules]`` (or ``[C++2b] [Modules]``, etc)
602*12c85518Srobertand add the label ``clang:modules`` (if you have permissions for that).
603*12c85518Srobert
For the implementation status of individual proposals, see https://clang.llvm.org/cxx_status.html.
605*12c85518Srobert
606*12c85518SrobertSupport for clang-scan-deps
607*12c85518Srobert~~~~~~~~~~~~~~~~~~~~~~~~~~~
608*12c85518Srobert
Support for clang-scan-deps may be the most urgent problem for modules now.
Without it, build systems are hard to integrate with modules.
This means that users can only experiment with modules through makefiles or by writing a parser by hand.
This limits the adoption of modules, which in turn limits the defect reports and feature requests we receive.
613*12c85518Srobert
614*12c85518SrobertThis is tracked in: https://github.com/llvm/llvm-project/issues/51792.
615*12c85518Srobert
616*12c85518SrobertAmbiguous deduction guide
617*12c85518Srobert~~~~~~~~~~~~~~~~~~~~~~~~~
618*12c85518Srobert
Currently, when we use deduction guides declared in the global module fragment,
we may get an incorrect diagnostic message like ``ambiguous deduction``.

So if we're using a deduction guide from the global module fragment, we probably need to write:
623*12c85518Srobert
624*12c85518Srobert.. code-block:: c++
625*12c85518Srobert
626*12c85518Srobert  std::lock_guard<std::mutex> lk(mutex);
627*12c85518Srobert
628*12c85518Srobertinstead of
629*12c85518Srobert
630*12c85518Srobert.. code-block:: c++
631*12c85518Srobert
632*12c85518Srobert  std::lock_guard lk(mutex);
633*12c85518Srobert
634*12c85518SrobertThis is tracked in: https://github.com/llvm/llvm-project/issues/56916
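
As a minimal, self-contained sketch of the workaround in context, the following is written as an ordinary
translation unit so that it compiles without any module setup.
The names ``counter_mutex``, ``counter`` and ``increment`` are hypothetical and chosen only for this illustration.

.. code-block:: c++

  #include <mutex>

  std::mutex counter_mutex;  // hypothetical names, for illustration only
  int counter = 0;

  int increment() {
    // Explicit template argument: does not rely on a deduction guide.
    std::lock_guard<std::mutex> lk(counter_mutex);
    return ++counter;
  }

  int main() {
    increment();
    return increment() == 2 ? 0 : 1;
  }

Spelling out the template argument costs a few characters but avoids class template argument deduction entirely.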
635*12c85518Srobert
636*12c85518SrobertIgnored PreferredName Attribute
637*12c85518Srobert~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
638*12c85518Srobert
Due to a tricky problem, Clang ignores the ``preferred_name`` attribute, if any, when it writes BMIs.
This implies that the ``preferred_name`` won't show up in debuggers or in dumps.
641*12c85518Srobert
642*12c85518SrobertThis is tracked in: https://github.com/llvm/llvm-project/issues/56490
643*12c85518Srobert
644*12c85518SrobertDon't emit macros about module declaration
645*12c85518Srobert~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
646*12c85518Srobert
647*12c85518SrobertThis is covered by P1857R3. We mention it again here since users may abuse it before we implement it.
648*12c85518Srobert
Someone may want to write code that can be compiled both with and without modules.
A direct idea would be to use macros like:
651*12c85518Srobert
652*12c85518Srobert.. code-block:: c++
653*12c85518Srobert
654*12c85518Srobert  MODULE
655*12c85518Srobert  IMPORT header_name
656*12c85518Srobert  EXPORT_MODULE MODULE_NAME;
657*12c85518Srobert  IMPORT header_name
658*12c85518Srobert  EXPORT ...
659*12c85518Srobert
So this file could be treated as a module unit or a non-module unit depending on the definition
of some macros.
However, this kind of usage is forbidden by P1857R3, which we haven't implemented yet.
This means that it is possible to write illegal modules code now, and obviously this will stop working
once P1857R3 is implemented.
A simple suggestion would be "Don't play macro tricks with module declarations".
666*12c85518Srobert
667*12c85518SrobertThis is tracked in: https://github.com/llvm/llvm-project/issues/56917
668*12c85518Srobert
Inconsistent filename suffix requirement for importable module units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, Clang requires the file name of an ``importable module unit`` to end with ``.cppm``
(or ``.ccm``, ``.cxxm``, ``.c++m``). However, this behavior is inconsistent with other compilers.
674*12c85518Srobert
675*12c85518SrobertThis is tracked in: https://github.com/llvm/llvm-project/issues/57416
676*12c85518Srobert
677*12c85518SrobertHeader Units
678*12c85518Srobert============
679*12c85518Srobert
How to build projects using header units
----------------------------------------
682*12c85518Srobert
683*12c85518SrobertQuick Start
684*12c85518Srobert~~~~~~~~~~~
685*12c85518Srobert
686*12c85518SrobertFor the following example,
687*12c85518Srobert
688*12c85518Srobert.. code-block:: c++
689*12c85518Srobert
690*12c85518Srobert  import <iostream>;
691*12c85518Srobert  int main() {
692*12c85518Srobert    std::cout << "Hello World.\n";
693*12c85518Srobert  }
694*12c85518Srobert
695*12c85518Srobertwe could compile it as
696*12c85518Srobert
697*12c85518Srobert.. code-block:: console
698*12c85518Srobert
699*12c85518Srobert  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
700*12c85518Srobert  $ clang++ -std=c++20 -fmodule-file=iostream.pcm main.cpp
701*12c85518Srobert
702*12c85518SrobertHow to produce BMIs
703*12c85518Srobert~~~~~~~~~~~~~~~~~~~
704*12c85518Srobert
705*12c85518SrobertSimilar to named modules, we could use ``--precompile`` to produce the BMI.
But we need to specify that the input file is a header with ``-xc++-system-header`` or ``-xc++-user-header``.

We could also use the ``-fmodule-header={user,system}`` option to produce the BMI for header units
that have a suffix like ``.h`` or ``.hh``.
The value of ``-fmodule-header`` selects the user search path or the system search path.
The default value for ``-fmodule-header`` is ``user``.
712*12c85518SrobertFor example,
713*12c85518Srobert
714*12c85518Srobert.. code-block:: c++
715*12c85518Srobert
716*12c85518Srobert  // foo.h
717*12c85518Srobert  #include <iostream>
718*12c85518Srobert  void Hello() {
719*12c85518Srobert    std::cout << "Hello World.\n";
720*12c85518Srobert  }
721*12c85518Srobert
722*12c85518Srobert  // use.cpp
723*12c85518Srobert  import "foo.h";
724*12c85518Srobert  int main() {
725*12c85518Srobert    Hello();
726*12c85518Srobert  }
727*12c85518Srobert
728*12c85518SrobertWe could compile it as:
729*12c85518Srobert
730*12c85518Srobert.. code-block:: console
731*12c85518Srobert
732*12c85518Srobert  $ clang++ -std=c++20 -fmodule-header foo.h -o foo.pcm
733*12c85518Srobert  $ clang++ -std=c++20 -fmodule-file=foo.pcm use.cpp
734*12c85518Srobert
735*12c85518SrobertFor headers which don't have a suffix, we need to pass ``-xc++-header``
736*12c85518Srobert(or ``-xc++-system-header`` or ``-xc++-user-header``) to mark it as a header.
737*12c85518SrobertFor example,
738*12c85518Srobert
739*12c85518Srobert.. code-block:: c++
740*12c85518Srobert
  // use.cpp
  import <iostream>;
  int main() {
    std::cout << "Hello World.\n";
  }
746*12c85518Srobert
747*12c85518Srobert.. code-block:: console
748*12c85518Srobert
749*12c85518Srobert  $ clang++ -std=c++20 -fmodule-header=system -xc++-header iostream -o iostream.pcm
750*12c85518Srobert  $ clang++ -std=c++20 -fmodule-file=iostream.pcm use.cpp
751*12c85518Srobert
752*12c85518SrobertHow to specify the dependent BMIs
753*12c85518Srobert~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
754*12c85518Srobert
755*12c85518SrobertWe could use ``-fmodule-file`` to specify the BMIs, and this option may occur multiple times as well.
756*12c85518Srobert
With the existing implementation, ``-fprebuilt-module-path`` cannot be used for header units
(since they are nominally anonymous).
For header units, use ``-fmodule-file`` to include the relevant PCM file for each header unit.

This is expected to be solved in future versions of the compiler, either by the tooling finding and specifying
the ``-fmodule-file`` options or by the use of a module mapper that understands how to map header names to their PCMs.
763*12c85518Srobert
764*12c85518SrobertDon't compile the BMI
765*12c85518Srobert~~~~~~~~~~~~~~~~~~~~~
766*12c85518Srobert
Another difference from named modules is that we can't compile the BMI of a header unit into an object file.
768*12c85518SrobertFor example:
769*12c85518Srobert
770*12c85518Srobert.. code-block:: console
771*12c85518Srobert
772*12c85518Srobert  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
773*12c85518Srobert  # This is not allowed!
774*12c85518Srobert  $ clang++ iostream.pcm -c -o iostream.o
775*12c85518Srobert
This makes sense because header units have the same semantics as headers.
777*12c85518Srobert
778*12c85518SrobertInclude translation
779*12c85518Srobert~~~~~~~~~~~~~~~~~~~
780*12c85518Srobert
The C++ spec allows vendors to convert ``#include header-name`` to ``import header-name;`` when possible.
Currently, Clang does this translation for ``#include`` directives in the global module fragment.

For example, the following two module units are equivalent:
785*12c85518Srobert
786*12c85518Srobert.. code-block:: c++
787*12c85518Srobert
788*12c85518Srobert  module;
789*12c85518Srobert  import <iostream>;
790*12c85518Srobert  export module M;
791*12c85518Srobert  export void Hello() {
792*12c85518Srobert    std::cout << "Hello.\n";
793*12c85518Srobert  }
794*12c85518Srobert
and:
796*12c85518Srobert
797*12c85518Srobert.. code-block:: c++
798*12c85518Srobert
799*12c85518Srobert  module;
800*12c85518Srobert  #include <iostream>
801*12c85518Srobert  export module M;
802*12c85518Srobert  export void Hello() {
    std::cout << "Hello.\n";
804*12c85518Srobert  }
805*12c85518Srobert
806*12c85518Srobert.. code-block:: console
807*12c85518Srobert
808*12c85518Srobert  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm --precompile M.cppm -o M.pcm
810*12c85518Srobert
In the latter example, Clang can find the BMI for ``<iostream>``,
so it tries to replace ``#include <iostream>`` with ``import <iostream>;`` automatically.
813*12c85518Srobert
814*12c85518Srobert
815*12c85518SrobertRelationships between Clang modules
816*12c85518Srobert-----------------------------------
817*12c85518Srobert
Header units have semantics quite similar to those of Clang modules.
Both behave like headers.

In fact, we could even "mimic" the style of header units with Clang modules:
822*12c85518Srobert
823*12c85518Srobert.. code-block:: c++
824*12c85518Srobert
825*12c85518Srobert  module "iostream" {
826*12c85518Srobert    export *
827*12c85518Srobert    header "/path/to/libstdcxx/iostream"
828*12c85518Srobert  }
829*12c85518Srobert
830*12c85518Srobert.. code-block:: console
831*12c85518Srobert
832*12c85518Srobert  $ clang++ -std=c++20 -fimplicit-modules -fmodule-map-file=.modulemap main.cpp
833*12c85518Srobert
834*12c85518SrobertIt would be simpler if we are using libcxx:
835*12c85518Srobert
836*12c85518Srobert.. code-block:: console
837*12c85518Srobert
838*12c85518Srobert  $ clang++ -std=c++20 main.cpp -fimplicit-modules -fimplicit-module-maps
839*12c85518Srobert
This works because there is already a
`module map <https://github.com/llvm/llvm-project/blob/main/libcxx/include/module.modulemap.in>`_
in the libcxx source.
843*12c85518Srobert
This immediately leads to the question: why don't we implement header units through Clang header modules?

The main reason is that Clang modules have additional semantics, such as hierarchy and
wrapping multiple headers together as one big module.
However, these things are not part of Standard C++ header units,
and we want to avoid giving the impression that these additional semantics are Standard C++ behavior.
850*12c85518Srobert
851*12c85518SrobertAnother reason is that there are proposals to introduce module mappers to the C++ standard
852*12c85518Srobert(for example, https://wg21.link/p1184r2).
853*12c85518SrobertIf we decide to reuse Clang's modulemap, we may get in trouble once we need to introduce another module mapper.
854*12c85518Srobert
855*12c85518SrobertSo the final answer for why we don't reuse the interface of Clang modules for header units is that
856*12c85518Srobertthere are some differences between header units and Clang modules and that ignoring those
857*12c85518Srobertdifferences now would likely become a problem in the future.
858*12c85518Srobert
859*12c85518SrobertPossible Questions
860*12c85518Srobert==================
861*12c85518Srobert
862*12c85518SrobertHow modules speed up compilation
863*12c85518Srobert--------------------------------
864*12c85518Srobert
A classic explanation of why modules speed up compilation is:
if there are ``n`` headers and ``m`` source files and each header is included by each source file,
then the complexity of the compilation is ``O(n*m)``;
but if there are ``n`` module interfaces and ``m`` source files, the complexity of the compilation is
``O(n+m)``. So, using modules would be a big win when scaling.
Put simply, we can get rid of many redundant compilations by using modules.
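
To put numbers on this idealized model, the following sketch counts how many pieces of code the compiler has to process
under each scheme. The function names are hypothetical, and the model deliberately ignores the (nonzero) cost of
importing a BMI, so it should be read as arithmetic on the formulas above rather than as a measurement.

.. code-block:: c++

  #include <cstdint>
  #include <iostream>

  // Headers: every source file re-processes every header it includes.
  std::uint64_t units_with_headers(std::uint64_t n, std::uint64_t m) {
    return n * m;
  }

  // Modules: each interface is compiled once, then each source file once.
  std::uint64_t units_with_modules(std::uint64_t n, std::uint64_t m) {
    return n + m;
  }

  int main() {
    std::cout << units_with_headers(100, 1000) << "\n";  // 100000
    std::cout << units_with_modules(100, 1000) << "\n";  // 1100
  }

With 100 headers and 1000 source files, the model gives 100000 versus 1100 units of work.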
871*12c85518Srobert
872*12c85518SrobertRoughly, this theory is correct. But the problem is that it is too rough.
873*12c85518SrobertThe behavior depends on the optimization level, as we will illustrate below.
874*12c85518Srobert
First, consider ``O0``. The compilation process is described in the following graph.
876*12c85518Srobert
877*12c85518Srobert.. code-block:: none
878*12c85518Srobert
  ├-------------frontend----------┼-------------middle end----------------┼----backend----┤
  │                               │                                       │               │
  └---parsing----sema----codegen--┴----- transformations ---- codegen ----┴---- codegen --┘

  ┌---------------------------------------------------------------------------------------┐
  │                                                                                       │
  │                                     source file                                       │
  │                                                                                       │
  └---------------------------------------------------------------------------------------┘

              ┌--------┐
              │        │
              │imported│
              │        │
              │  code  │
              │        │
              └--------┘
896*12c85518Srobert
Here we can see that the source file (which could be a non-module unit or a module unit) gets processed by the
whole pipeline.
But the imported code only gets involved in semantic analysis, which is mainly about name lookup,
overload resolution and template instantiation.
All of these processes are fast relative to the whole compilation process.
More importantly, the imported code only needs to be processed once in frontend code generation,
as well as in the whole middle end and backend.
So we could get a big win for the compilation time at ``O0``.
905*12c85518Srobert
906*12c85518SrobertBut with optimizations, things are different:
907*12c85518Srobert
(We omit the ``code generation`` part of each stage due to limited space.)
909*12c85518Srobert
910*12c85518Srobert.. code-block:: none
911*12c85518Srobert
912*12c85518Srobert  ├-------- frontend ---------┼--------------- middle end --------------------┼------ backend ----┤
913*12c85518Srobert  │                           │                                               │                   │
914*12c85518Srobert  └--- parsing ---- sema -----┴--- optimizations --- IPO ---- optimizations---┴--- optimizations -┘
915*12c85518Srobert
916*12c85518Srobert  ┌-----------------------------------------------------------------------------------------------┐
917*12c85518Srobert  │                                                                                               │
918*12c85518Srobert  │                                         source file                                           │
919*12c85518Srobert  │                                                                                               │
920*12c85518Srobert  └-----------------------------------------------------------------------------------------------┘
921*12c85518Srobert                ┌---------------------------------------┐
922*12c85518Srobert                │                                       │
923*12c85518Srobert                │                                       │
924*12c85518Srobert                │            imported code              │
925*12c85518Srobert                │                                       │
926*12c85518Srobert                │                                       │
927*12c85518Srobert                └---------------------------------------┘
928*12c85518Srobert
It would be very unfortunate if we ended up with worse performance after using modules.
The main concern is that when we compile a source file, the compiler needs to see the function bodies
of imported module units so that it can perform IPO (InterProcedural Optimization, primarily inlining
in practice) to optimize functions in the current source file with the help of the information provided by
the imported module units.
In other words, the imported code is processed again and again in the importing units
by optimizations (including IPO itself).
The optimizations before IPO and the IPO itself are the most time-consuming parts of the whole compilation process.
So from this perspective, we might not be able to get the improvements described in the theory.
But we could still save the time spent on optimizations after IPO and on the whole backend.
939*12c85518Srobert
Overall, at ``O0`` the implementations of functions defined in a module will not impact module users,
but at higher optimization levels the definitions of such functions are provided to user compilations for the
purposes of optimization (but definitions of these functions are still not included in the user's object file).
This means the build speedup at higher optimization levels may be lower than expected given ``O0`` experience,
but it does provide more optimization opportunities.
945*12c85518Srobert
946*12c85518SrobertInteroperability with Clang Modules
947*12c85518Srobert-----------------------------------
948*12c85518Srobert
We **wish** to support Clang modules and standard C++ modules at the same time,
but mixing the two forms is not well used or tested yet.

Please file new GitHub issues as you find interoperability problems.