====================
Standard C++ Modules
====================

.. contents::
   :local:

Introduction
============

The term ``modules`` has many meanings. For users of Clang, modules may
refer to ``Objective-C Modules``, ``Clang C++ Modules`` (or ``Clang Header Modules``,
etc.) or ``Standard C++ Modules``. The implementations of all these kinds of modules in Clang
share a lot of code, but from the perspective of users, their semantics and
command line interfaces are very different. This document focuses on
how to use standard C++ modules in Clang.

There is already a detailed document about `Clang modules <Modules.html>`_. It
is helpful to read that document if you want to know
more about the general idea of modules. Since standard C++ modules have different semantics
(and workflows) from ``Clang modules``, this page describes the background and use of
Clang with standard C++ modules.

Modules exist in two forms in the C++ Language Specification. They can refer to
either "Named Modules" or to "Header Units". This document covers both forms.

Standard C++ Named modules
==========================

This document is intended to be a manual first and foremost. However, we consider it helpful to
introduce some language background here for readers who are not familiar with
the new language feature. This document is not intended to be a language
tutorial; it only introduces the concepts needed to understand the
structure and building of a project.

Background and terminology
--------------------------

Modules
~~~~~~~

In this document, the term ``Modules``/``modules`` refers to the standard C++ modules
feature unless it is qualified by ``Clang``.

Clang Modules
~~~~~~~~~~~~~

In this document, the term ``Clang Modules``/``Clang modules`` refers to the Clang
C++ modules extension. These are also known as ``Clang header modules``,
``Clang module map modules`` or ``Clang c++ modules``.

Module and module unit
~~~~~~~~~~~~~~~~~~~~~~

A module consists of one or more module units. A module unit is a special
translation unit. Every module unit must have a module declaration. The syntax
of the module declaration is:

.. code-block:: c++

  [export] module module_name[:partition_name];

Terms enclosed in ``[]`` are optional. The syntax of ``module_name`` and ``partition_name``
in regex form corresponds to ``[a-zA-Z_][a-zA-Z_0-9\.]*``. In particular, a literal dot ``.``
in the name has no semantic meaning (e.g. it does not imply a hierarchy).

In this document, module units are classified into:

* Primary module interface unit.

* Module implementation unit.

* Module interface partition unit.

* Internal module partition unit.

A primary module interface unit is a module unit whose module declaration is
``export module module_name;``. The ``module_name`` here denotes the name of the
module. A module should have one and only one primary module interface unit.

A module implementation unit is a module unit whose module declaration is
``module module_name;``. A module could have multiple module implementation
units with the same declaration.

A module interface partition unit is a module unit whose module declaration is
``export module module_name:partition_name;``. The ``partition_name`` should be
unique within any given module.

An internal module partition unit is a module unit whose module declaration
is ``module module_name:partition_name;``. The ``partition_name`` should be
unique within any given module.

In this document, we use the following umbrella terms:

* A ``module interface unit`` refers to either a ``primary module interface unit``
  or a ``module interface partition unit``.

* An ``importable module unit`` refers to either a ``module interface unit``
  or an ``internal module partition unit``.

* A ``module partition unit`` refers to either a ``module interface partition unit``
  or an ``internal module partition unit``.

Built Module Interface file
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A ``Built Module Interface file`` is the precompiled result of an importable module unit.
It is generally referred to by the acronym ``BMI``.

Global module fragment
~~~~~~~~~~~~~~~~~~~~~~

In a module unit, the section from ``module;`` to the module declaration is called the global module fragment.


How to build projects using modules
-----------------------------------

Quick Start
~~~~~~~~~~~

Let's see a "hello world" example that uses modules.

.. code-block:: c++

  // Hello.cppm
  module;
  #include <iostream>
  export module Hello;
  export void hello() {
    std::cout << "Hello World!\n";
  }

  // use.cpp
  import Hello;
  int main() {
    hello();
    return 0;
  }

Then we type:

.. code-block:: console

  $ clang++ -std=c++20 Hello.cppm --precompile -o Hello.pcm
  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
  $ ./Hello.out
  Hello World!

In this example, we make and use a simple module ``Hello`` which contains only a
primary module interface unit ``Hello.cppm``.

Now let's look at a slightly more complex "hello world" example which uses all 4 kinds of module units.

.. code-block:: c++

  // M.cppm
  export module M;
  export import :interface_part;
  import :impl_part;
  export void Hello();

  // interface_part.cppm
  export module M:interface_part;
  export void World();

  // impl_part.cppm
  module;
  #include <iostream>
  #include <string>
  module M:impl_part;
  import :interface_part;

  std::string W = "World.";
  void World() {
    std::cout << W << std::endl;
  }

  // Impl.cpp
  module;
  #include <iostream>
  module M;
  void Hello() {
    std::cout << "Hello ";
  }

  // User.cpp
  import M;
  int main() {
    Hello();
    World();
    return 0;
  }

Then we are able to compile the example with the following commands:

.. code-block:: console

  # Precompiling the module
  $ clang++ -std=c++20 interface_part.cppm --precompile -o M-interface_part.pcm
  $ clang++ -std=c++20 impl_part.cppm --precompile -fprebuilt-module-path=. -o M-impl_part.pcm
  $ clang++ -std=c++20 M.cppm --precompile -fprebuilt-module-path=. -o M.pcm
  $ clang++ -std=c++20 Impl.cpp -fmodule-file=M.pcm -c -o Impl.o

  # Compiling the user
  $ clang++ -std=c++20 User.cpp -fprebuilt-module-path=. -c -o User.o

  # Compiling the module and linking it together
  $ clang++ -std=c++20 M-interface_part.pcm -c -o M-interface_part.o
  $ clang++ -std=c++20 M-impl_part.pcm -c -o M-impl_part.o
  $ clang++ -std=c++20 M.pcm -c -o M.o
  $ clang++ User.o M-interface_part.o M-impl_part.o M.o Impl.o -o a.out

We explain the options in the following sections.

How to enable standard C++ modules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, standard C++ modules are enabled automatically
if the language standard is ``-std=c++20`` or newer.
The ``-fmodules-ts`` option is deprecated and is planned to be removed.


How to produce a BMI
~~~~~~~~~~~~~~~~~~~~

We can generate a BMI for an importable module unit with either the ``--precompile``
or the ``-fmodule-output`` flag.

The ``--precompile`` option generates the BMI as the output of the compilation, and the output path
can be specified using the ``-o`` option.

The ``-fmodule-output`` option generates the BMI as a by-product of the compilation.
If ``-fmodule-output=`` is specified, the BMI will be emitted to the specified location. If
``-fmodule-output`` and ``-c`` are specified, the BMI will be emitted in the directory of the
output file with the name of the input file with the new extension ``.pcm``. Otherwise, the BMI
will be emitted in the working directory with the name of the input file with the new extension
``.pcm``.

Generating BMIs with ``--precompile`` is called two-phase compilation since it takes
2 steps to compile a source file to an object file. Generating BMIs with ``-fmodule-output``
is correspondingly called one-phase compilation. The one-phase compilation model is simpler
for build systems to implement, and the two-phase compilation model has the potential to compile faster due
to higher parallelism. As an example, if there are two module units A and B, and B depends on A, the
one-phase compilation model would need to compile them serially, whereas the two-phase compilation
model may be able to compile them simultaneously if the compilation from A.pcm to A.o takes a long
time.

File name requirement
~~~~~~~~~~~~~~~~~~~~~

The file name of an ``importable module unit`` should end with ``.cppm``
(or ``.ccm``, ``.cxxm``, ``.c++m``). The file name of a ``module implementation unit``
should end with ``.cpp`` (or ``.cc``, ``.cxx``, ``.c++``).

The file name of BMIs should end with ``.pcm``.
The file name of the BMI of a ``primary module interface unit`` should be ``module_name.pcm``.
The file name of the BMI of a ``module partition unit`` should be ``module_name-partition_name.pcm``.

If the file names use different extensions, Clang may fail to build the module.
For example, if the filename of an ``importable module unit`` ends with ``.cpp`` instead of ``.cppm``,
then we can't generate a BMI for the ``importable module unit`` with the ``--precompile`` option,
since ``--precompile`` would then only run the preprocessor, which is equivalent to ``-E``.
If we want the filename of an ``importable module unit`` to end with a suffix other than ``.cppm``,
we can put ``-x c++-module`` in front of the file. For example,

.. code-block:: c++

  // Hello.cpp
  module;
  #include <iostream>
  export module Hello;
  export void hello() {
    std::cout << "Hello World!\n";
  }

  // use.cpp
  import Hello;
  int main() {
    hello();
    return 0;
  }

Now the filename of the ``module interface`` ends with ``.cpp`` instead of ``.cppm``, so
we can't compile it with the original command lines. But we are still able to do it with:

.. code-block:: console

  $ clang++ -std=c++20 -x c++-module Hello.cpp --precompile -o Hello.pcm
  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
  $ ./Hello.out
  Hello World!

Module name requirement
~~~~~~~~~~~~~~~~~~~~~~~

[module.unit]p1 says:

.. code-block:: text

  All module-names either beginning with an identifier consisting of std followed by zero
  or more digits or containing a reserved identifier ([lex.name]) are reserved and shall not
  be specified in a module-declaration; no diagnostic is required. If any identifier in a reserved
  module-name is a reserved identifier, the module name is reserved for use by C++ implementations;
  otherwise it is reserved for future standardization.

So none of the following names are valid by default:

.. code-block:: text

    std
    std1
    std.foo
    __test
    // and so on ...

If you still want to use a reserved module name for any reason, currently you can add a special line marker
in front of the module declaration like:

.. code-block:: c++

  # __LINE_NUMBER__ __FILE__ 1 3
  export module std;

Here ``__LINE_NUMBER__`` is the actual line number of the corresponding line. ``__FILE__`` is the filename
of the translation unit. The ``1`` means that what follows is a new file, and the ``3`` means that this is a
system header/file so certain warnings should be suppressed. You can find more details at:
https://gcc.gnu.org/onlinedocs/gcc-3.0.2/cpp_9.html.

How to specify the dependent BMIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are 3 methods to specify the dependent BMIs:

* (1) ``-fprebuilt-module-path=<path/to/directory>``.
* (2) ``-fmodule-file=<path/to/BMI>``.
* (3) ``-fmodule-file=<module-name>=<path/to/BMI>``.

The option ``-fprebuilt-module-path`` tells the compiler where to search for dependent BMIs.
It may be used multiple times, just like ``-I`` for specifying paths for header files. The lookup rule is:

* (1) When we import module M, the compiler looks up M.pcm in the directories specified
  by ``-fprebuilt-module-path``.
* (2) When we import the module partition unit M:P, the compiler looks up M-P.pcm in the
  directories specified by ``-fprebuilt-module-path``.

The option ``-fmodule-file=<path/to/BMI>`` tells the compiler to load the specified BMI directly.
The option ``-fmodule-file=<module-name>=<path/to/BMI>`` tells the compiler to load the specified BMI
for the module specified by ``<module-name>`` when necessary. The main difference is that
``-fmodule-file=<path/to/BMI>`` will load the BMI eagerly, whereas
``-fmodule-file=<module-name>=<path/to/BMI>`` will only load the BMI lazily, which is similar
to ``-fprebuilt-module-path``.

If ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>`` and
``-fmodule-file=<module-name>=<path/to/BMI>`` are all present, the ``-fmodule-file=<path/to/BMI>`` option
takes the highest precedence and ``-fmodule-file=<module-name>=<path/to/BMI>`` takes the second
highest precedence.

When we compile a ``module implementation unit``, we must specify the BMI of the corresponding
``primary module interface unit``,
since the language specification says that a module implementation unit implicitly imports
the primary module interface unit.

  [module.unit]p8

  A module-declaration that contains neither an export-keyword nor a module-partition implicitly
  imports the primary module interface unit of the module as if by a module-import-declaration.

All 3 options ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>``
and ``-fmodule-file=<module-name>=<path/to/BMI>`` may occur multiple times.
For example, the command line to compile ``M.cppm`` in
the above example could be rewritten as either of the following:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M-interface_part.pcm -fmodule-file=M-impl_part.pcm -o M.pcm
  $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M:interface_part=M-interface_part.pcm -fmodule-file=M:impl_part=M-impl_part.pcm -o M.pcm

``-fprebuilt-module-path`` is more convenient and ``-fmodule-file`` is faster since
it saves time for file lookup.

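
The lookup rule used by ``-fprebuilt-module-path`` can be sketched as a small helper. This is only an
illustration of the naming convention described in this section (``M`` maps to ``M.pcm`` and ``M:P`` maps to
``M-P.pcm``). The function name ``bmi_file_name`` is hypothetical, and the code is not how Clang
implements the lookup.

.. code-block:: c++

  #include <cassert>
  #include <string>

  // Hypothetical helper (not part of Clang): maps the name used in an import
  // to the BMI file name that is searched for in the directories given by
  // -fprebuilt-module-path.
  inline std::string bmi_file_name(const std::string &import_name) {
    assert(!import_name.empty() && "an import needs a module name");
    std::string file = import_name;
    // A partition is written as module_name:partition_name, and its BMI is
    // expected to be named module_name-partition_name.pcm.
    const std::string::size_type colon = file.find(':');
    if (colon != std::string::npos)
      file[colon] = '-';
    return file + ".pcm";
  }

With this sketch, ``bmi_file_name("M:interface_part")`` yields ``M-interface_part.pcm``, which matches the
file names used in the quick start example.
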

Remember that module units still have an object counterpart to the BMI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is easy to forget to compile BMIs at first since we may think of module interfaces as being like headers.
However, this is not true.
Module units are translation units. We need to compile them to object files
and link the object files, as the example shows.

For example, the traditional compilation process for headers looks like:

.. code-block:: text

  src1.cpp -+> clang++ src1.cpp --> src1.o ---,
  hdr1.h  --'                                 +-> clang++ src1.o src2.o -> executable
  hdr2.h  --,                                 |
  src2.cpp -+> clang++ src2.cpp --> src2.o ---'

And the compilation process for module units looks like:

.. code-block:: text

                src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -,
  (header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o -> executable
                mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
                src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'

As the diagrams show, we need to compile the BMIs of module units to object files and link the object files.
(But we can't do this for the BMIs of header units. See the later section for the definition of header units.)

If we want to create a module library, we can't just ship the BMIs in an archive.
We must compile these BMIs (``*.pcm``) into object files (``*.o``) and add those object files to the archive instead.

Consistency Requirement
~~~~~~~~~~~~~~~~~~~~~~~

If we think of modules as a cache to speed up compilation, then, as with other caching techniques,
it is important to keep the cache consistent.
So **currently** Clang performs very strict consistency checks.


Options consistency
^^^^^^^^^^^^^^^^^^^

The language options of module units and their non-module-unit users should be consistent.
The following example is not allowed:

.. code-block:: c++

  // M.cppm
  export module M;

  // Use.cpp
  import M;

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ clang++ -std=c++2b Use.cpp -fprebuilt-module-path=.

The compiler would reject the example due to the inconsistent language options.
Not all options are language options.
For example, the following example is allowed:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  # Inconsistent optimization level.
  $ clang++ -std=c++20 -O3 Use.cpp -fprebuilt-module-path=.
  # Inconsistent debugging level.
  $ clang++ -std=c++20 -g Use.cpp -fprebuilt-module-path=.

Although the two commands use inconsistent optimization and debugging levels, both of them are accepted.

Note that **currently** the compiler doesn't consider inconsistent macro definitions a problem. For example:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  # Inconsistent optimization level and macro definition.
  $ clang++ -std=c++20 -O3 -DNDEBUG Use.cpp -fprebuilt-module-path=.

Currently Clang would accept the above example. But it may produce surprising results if the
debugging code depends on consistent use of ``NDEBUG`` in other translation units as well.

Source content consistency
^^^^^^^^^^^^^^^^^^^^^^^^^^

When the compiler reads a BMI, it checks the consistency of the corresponding
source files. For example:

.. code-block:: c++

  // M.cppm
  export module M;
  export template <class T>
  T foo(T t) {
    return t;
  }

  // Use.cpp
  import M;
  void bar() {
    foo(5);
  }

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ rm M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

The compiler would reject the example since it fails to find the source file needed to check the consistency.
The following example would be rejected too.

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ echo "int i=0;" >> M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

The compiler would reject it since it detects that the file was changed.

But it is OK to move the BMI as long as the source files remain:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ mkdir -p tmp
  $ mv M.pcm tmp/M.pcm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=tmp/M.pcm

The above example would be accepted.

If the user doesn't want to follow the consistency requirement for some reason (e.g., distributing BMIs),
the user could try to use ``-Xclang -fmodules-embed-all-files`` when producing the BMI. For example:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -Xclang -fmodules-embed-all-files -o M.pcm
  $ rm M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

Now the compiler would accept the above example.
Important note: Xclang options are intended to be used by the compiler internally, and their semantics
are not guaranteed to be preserved in future versions.

The compiler also records the paths of the header files included in the global module fragment and compares the
headers when the module is imported. For example,

.. code-block:: c++

  // foo.h
  #include <iostream>
  void Hello() {
    std::cout << "Hello World.\n";
  }

  // foo.cppm
  module;
  #include "foo.h"
  export module foo;
  export using ::Hello;

  // Use.cpp
  import foo;
  int main() {
    Hello();
  }

Then it is problematic if we remove ``foo.h`` before importing the ``foo`` module.

.. code-block:: console

  $ clang++ -std=c++20 foo.cppm --precompile -o foo.pcm
  $ mv foo.h foo.orig.h
  # The following one is rejected
  $ clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c

The above case will be rejected. We're still able to work around it with the ``-Xclang -fmodules-embed-all-files`` option:

.. code-block:: console

  $ clang++ -std=c++20 foo.cppm --precompile -Xclang -fmodules-embed-all-files -o foo.pcm
  $ mv foo.h foo.orig.h
  $ clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c -o Use.o
  $ clang++ Use.o foo.pcm

ABI Impacts
-----------

The declarations in a module unit which are not in the global module fragment have new linkage names.

For example,

.. code-block:: c++

  export module M;
  namespace NS {
    export int foo();
  }

The linkage name of ``NS::foo()`` would be ``_ZN2NSW1M3fooEv``.
This couldn't be demangled by previous versions of the debugger or demangler.
As of LLVM 15.x, users can utilize ``llvm-cxxfilt`` to demangle this:

.. code-block:: console

  $ llvm-cxxfilt _ZN2NSW1M3fooEv

The result would be ``NS::foo@M()``, which reads as ``NS::foo()`` in module ``M``.

The ABI implies that we can't declare something in a module unit and define it in a non-module unit (or vice-versa),
as this would result in linking errors.

Known Problems
--------------

The following describes issues in the current implementation of modules.
Please see https://github.com/llvm/llvm-project/labels/clang%3Amodules for more issues
or file a new issue if you don't find an existing one.
If you're going to create a new issue for standard C++ modules,
please start the title with ``[C++20] [Modules]`` (or ``[C++2b] [Modules]``, etc)
and add the label ``clang:modules`` (if you have permissions for that).

For the higher-level status of the implementation of proposals, you can visit https://clang.llvm.org/cxx_status.html.

Support for clang-scan-deps
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The support for clang-scan-deps may be the most urgent problem for modules now.
Without the support for clang-scan-deps, it's hard to involve build systems.
This means that users can only play with modules through makefiles or by writing a parser by hand.
This blocks wider use of modules, which in turn blocks more defect reports and requirements.

This is tracked in: https://github.com/llvm/llvm-project/issues/51792.

Ambiguous deduction guide
~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, when we use deduction guides from the global module fragment,
we may get an incorrect diagnostic message like ``ambiguous deduction``.

So if we're using a deduction guide from the global module fragment, we probably need to write:

.. code-block:: c++

  std::lock_guard<std::mutex> lk(mutex);

instead of

.. code-block:: c++

  std::lock_guard lk(mutex);

This is tracked in: https://github.com/llvm/llvm-project/issues/56916

Ignored PreferredName Attribute
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Due to a tricky problem, when Clang writes BMIs, Clang will ignore the ``preferred_name`` attribute, if any.
This implies that the ``preferred_name`` wouldn't show up in the debugger or in dumps.

This is tracked in: https://github.com/llvm/llvm-project/issues/56490

Don't emit macros about module declaration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is covered by P1857R3. We mention it again here since users may abuse it before we implement it.

Someone may want to write code which can be compiled both with and without modules.
A direct idea would be to use macros like:

.. code-block:: c++

  MODULE
  IMPORT header_name
  EXPORT_MODULE MODULE_NAME;
  IMPORT header_name
  EXPORT ...

This way the file could be treated as a module unit or a non-module unit depending on the definition
of some macros.
However, this kind of usage is forbidden by P1857R3, which we haven't implemented yet.
This means that it is possible to write illegal modules code now, and obviously this will stop working
once P1857R3 is implemented.
A simple suggestion would be "Don't play macro tricks with module declarations".

This is tracked in: https://github.com/llvm/llvm-project/issues/56917

Inconsistent filename suffix requirement for importable module units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, Clang requires that the file name of an ``importable module unit`` end with ``.cppm``
(or ``.ccm``, ``.cxxm``, ``.c++m``). However, this behavior is inconsistent with other compilers.

This is tracked in: https://github.com/llvm/llvm-project/issues/57416

Header Units
============

How to build projects using header unit
---------------------------------------

Quick Start
~~~~~~~~~~~

For the following example,

.. code-block:: c++

  import <iostream>;
  int main() {
    std::cout << "Hello World.\n";
  }

we could compile it as

.. code-block:: console

  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm main.cpp

How to produce BMIs
~~~~~~~~~~~~~~~~~~~

Similar to named modules, we could use ``--precompile`` to produce the BMI.
But we need to specify that the input file is a header with ``-xc++-system-header`` or ``-xc++-user-header``.

We could also use the ``-fmodule-header={user,system}`` option to produce the BMI for header units
which have a suffix like ``.h`` or ``.hh``.
The value of ``-fmodule-header`` selects the user search path or the system search path.
The default value for ``-fmodule-header`` is ``user``.
For example,

.. code-block:: c++

  // foo.h
  #include <iostream>
  void Hello() {
    std::cout << "Hello World.\n";
  }

  // use.cpp
  import "foo.h";
  int main() {
    Hello();
  }

We could compile it as:

.. code-block:: console

  $ clang++ -std=c++20 -fmodule-header foo.h -o foo.pcm
  $ clang++ -std=c++20 -fmodule-file=foo.pcm use.cpp

For headers which don't have a suffix, we need to pass ``-xc++-header``
(or ``-xc++-system-header`` or ``-xc++-user-header``) to mark the input as a header.
For example,

.. code-block:: c++

  // use.cpp
  import <iostream>;
  int main() {
    std::cout << "Hello World.\n";
  }

.. code-block:: console

  $ clang++ -std=c++20 -fmodule-header=system -xc++-header iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm use.cpp

How to specify the dependent BMIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We could use ``-fmodule-file`` to specify the BMIs, and this option may occur multiple times as well.

With the existing implementation, ``-fprebuilt-module-path`` cannot be used for header units
(since they are nominally anonymous).
For header units, use ``-fmodule-file`` to include the relevant PCM file for each header unit.

This is expected to be solved in future editions of the compiler, either by the tooling finding and specifying
the ``-fmodule-file`` options or by the use of a module-mapper that understands how to map header names to their PCMs.


Don't compile the BMI
~~~~~~~~~~~~~~~~~~~~~

Another difference from named modules is that we can't compile the BMI of a header unit.
For example:

.. code-block:: console

  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  # This is not allowed!
  $ clang++ iostream.pcm -c -o iostream.o

This makes sense given the semantics of header units, which are just like headers.

Include translation
~~~~~~~~~~~~~~~~~~~

The C++ spec allows vendors to convert ``#include header-name`` to ``import header-name;`` when possible.
Currently, Clang does this translation for ``#include`` directives in the global module fragment.

For example, the following example:

.. code-block:: c++

  module;
  import <iostream>;
  export module M;
  export void Hello() {
    std::cout << "Hello.\n";
  }

is treated the same as this one:

.. code-block:: c++

  module;
  #include <iostream>
  export module M;
  export void Hello() {
    std::cout << "Hello.\n";
  }

.. code-block:: console

  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm --precompile M.cppm -o M.pcm

In the latter example, Clang can find the BMI for ``<iostream>``,
so it tries to replace ``#include <iostream>`` with ``import <iostream>;`` automatically.


Relationships between Clang modules
-----------------------------------

Header units have semantics that are quite similar to those of Clang modules.
The semantics of both are like those of headers.

In fact, we could even "mimic" the style of header units with Clang modules:

.. code-block:: c++

  module "iostream" {
    export *
    header "/path/to/libstdcxx/iostream"
  }

.. code-block:: console

  $ clang++ -std=c++20 -fimplicit-modules -fmodule-map-file=.modulemap main.cpp

It would be simpler if we were using libcxx:

.. code-block:: console

  $ clang++ -std=c++20 main.cpp -fimplicit-modules -fimplicit-module-maps

since there is already a
`module map <https://github.com/llvm/llvm-project/blob/main/libcxx/include/module.modulemap.in>`_
in the source of libcxx.

This immediately leads to the question: why don't we implement header units through Clang header modules?

The main reason is that Clang modules have additional semantics, like hierarchy or
wrapping multiple headers together as one big module.
However, these things are not part of Standard C++ Header units,
and we want to avoid the impression that these additional semantics are Standard C++ behavior.

Another reason is that there are proposals to introduce module mappers to the C++ standard
(for example, https://wg21.link/p1184r2).
If we decide to reuse Clang's modulemap, we may get in trouble once we need to introduce another module mapper.

So the final answer for why we don't reuse the interface of Clang modules for header units is that
there are some differences between header units and Clang modules and that ignoring those
differences now would likely become a problem in the future.

Possible Questions
==================

How modules speed up compilation
--------------------------------

A classic theory for why modules speed up compilation is:
if there are ``n`` headers and ``m`` source files and each header is included by each source file,
then the complexity of the compilation is ``O(n*m)``.
But if there are ``n`` module interfaces and ``m`` source files, the complexity of the compilation is
``O(n+m)``. So, using modules would be a big win when scaling.
Put simply, we could get rid of many redundant compilations by using modules.

Roughly, this theory is correct. But the problem is that it is too rough.
The behavior depends on the optimization level, as we will illustrate below.

First is ``O0``. The compilation process is described in the following graph.

.. code-block:: none

  ├-------------frontend----------┼-------------middle end----------------┼----backend----┤
  │                               │                                       │               │
  └---parsing----sema----codegen--┴----- transformations ---- codegen ----┴---- codegen --┘

  ┌---------------------------------------------------------------------------------------┐
  │                                                                                       │
  │                                      source file                                      │
  │                                                                                       │
  └---------------------------------------------------------------------------------------┘

              ┌--------┐
              │        │
              │imported│
              │        │
              │  code  │
              │        │
              └--------┘

Here we can see that the source file (which could be a non-module unit or a module unit) gets processed by the
whole pipeline.
But the imported code only gets involved in semantic analysis, which is mainly about name lookup,
overload resolution and template instantiation.
All of these processes are fast relative to the whole compilation process.
More importantly, the imported code only needs to be processed once in frontend code generation,
as well as the whole middle end and backend.
So we could get a big win for the compilation time in O0.

But with optimizations, things are different:

(we omit the ``code generation`` part for each end due to the limited space)

.. code-block:: none

  ├-------- frontend ---------┼--------------- middle end --------------------┼------ backend ----┤
  │                           │                                               │                   │
  └--- parsing ---- sema -----┴--- optimizations --- IPO ---- optimizations---┴--- optimizations -┘

  ┌-----------------------------------------------------------------------------------------------┐
  │                                                                                               │
  │                                          source file                                          │
  │                                                                                               │
  └-----------------------------------------------------------------------------------------------┘
                  ┌---------------------------------------┐
                  │                                       │
                  │                                       │
                  │             imported code             │
                  │                                       │
                  │                                       │
                  └---------------------------------------┘

It would be very unfortunate if we ended up with worse performance after using modules.
The main concern is that when we compile a source file, the compiler needs to see the function bodies
of imported module units so that it can perform IPO (InterProcedural Optimization, primarily inlining
in practice) to optimize functions in the current source file with the help of the information provided by
the imported module units.
In other words, the imported code would be processed again and again in the importing units
by optimizations (including IPO itself).
The optimizations before IPO and the IPO itself are the most time-consuming parts of the whole compilation process.
So from this perspective, we might not be able to get the improvements described in the theory.
But we could still save the time for optimizations after IPO and the whole backend.

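
To make the classic ``O(n*m)`` versus ``O(n+m)`` estimate from the start of this section concrete, the
following toy model counts how many times an interface (a header or a module interface) has to be
processed. The function names ``header_work`` and ``module_work`` are hypothetical, and the model
deliberately ignores the optimization effects discussed above. It is a sketch of the theory, not a measurement.

.. code-block:: c++

  #include <cstdint>

  // Toy model (hypothetical names): n headers, each included by each of the
  // m source files, so every inclusion processes the header text again.
  constexpr std::uint64_t header_work(std::uint64_t n, std::uint64_t m) {
    return n * m;
  }

  // Toy model: n module interfaces are each compiled once, and each of the
  // m source files is compiled once.
  constexpr std::uint64_t module_work(std::uint64_t n, std::uint64_t m) {
    return n + m;
  }

  // With 100 interfaces and 1000 source files the two counts differ greatly.
  static_assert(header_work(100, 1000) == 100000, "n * m units of work");
  static_assert(module_work(100, 1000) == 1100, "n + m units of work");

As the preceding paragraphs explain, real builds with optimizations enabled fall somewhere between these two
counts, since imported function bodies are still visited by the optimizer in each importing unit.
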

Overall, at ``O0`` the implementations of functions defined in a module will not impact module users,
but at higher optimization levels the definitions of such functions are provided to user compilations for the
purposes of optimization (although the definitions of these functions are still not included in the user's object file).
This means the build speedup at higher optimization levels may be lower than expected given the ``O0`` experience,
but it does provide more optimization opportunities.

Interoperability with Clang Modules
-----------------------------------

We **wish** to support Clang modules and standard C++ modules at the same time,
but mixing the two is not well used or tested yet.

Please file new GitHub issues as you find interoperability problems.