<appendix xmlns="http://docbook.org/ns/docbook" version="5.0"
          xml:id="appendix.contrib" xreflabel="Contributing">
<?dbhtml filename="appendix_contributing.html"?>

<info><title>
  Contributing
  <indexterm>
    <primary>Appendix</primary>
    <secondary>Contributing</secondary>
  </indexterm>
</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>library</keyword>
  </keywordset>
</info>



<para>
  The GNU C++ Library is part of GCC and follows the same development model,
  so the general rules for
  <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://gcc.gnu.org/contribute.html">contributing
  to GCC</link> apply. Active
  contributors are assigned maintainership responsibility, and given
  write access to the source repository. First-time contributors
  should follow this procedure:
</para>

<section xml:id="contrib.list" xreflabel="Contributor Checklist"><info><title>Contributor Checklist</title></info>


  <section xml:id="list.reading"><info><title>Reading</title></info>


    <itemizedlist>
      <listitem>
        <para>
          Get and read the relevant sections of the C++ language
          specification. Copies of the full ISO 14882 standard are
          available online via the ISO mirror site for committee
          members. Non-members, or those who have not paid for the
          privilege of sitting on the committee and sustained their
          two-meeting commitment for voting rights, may get a copy of
          the standard from their respective national standards
          organization. In the USA, this national standards
          organization is
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.ansi.org">ANSI</link>.
          (And if you've already registered with them you can <link
          xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:href="https://webstore.ansi.org/Standards/ISO/ISOIEC148822014">buy
          the standard online</link>.)
        </para>
      </listitem>

      <listitem>
        <para>
          The library working group issues list, including known
          defects, can be obtained here:
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</link>
        </para>
      </listitem>

      <listitem>
        <para>
          Peruse
          the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.gnu.org/prep/standards/">GNU
          Coding Standards</link>, and chuckle when you hit the part
          about <quote>Using Languages Other Than C</quote>.
        </para>
      </listitem>

      <listitem>
        <para>
          Be familiar with the extensions that take precedence over these
          general GNU rules. These style issues for libstdc++ can be
          found in <link linkend="contrib.coding_style">Coding Style</link>.
        </para>
      </listitem>

      <listitem>
        <para>
          And last but certainly not least, read the
          library-specific information found in
          <link linkend="appendix.porting">Porting and Maintenance</link>.
        </para>
      </listitem>
    </itemizedlist>

  </section>
  <section xml:id="list.copyright"><info><title>Assignment</title></info>

    <para>
      See the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://gcc.gnu.org/contribute.html#legal">legal prerequisites</link> for all GCC contributions.
    </para>

    <para>
      Historically, the libstdc++ assignment form added the following
      question:
    </para>

    <para>
      <quote>
        Which Belgian comic book character is better, Tintin or Asterix, and
        why?
      </quote>
    </para>

    <para>
      While not strictly necessary, humoring the maintainers and answering
      this question would be appreciated.
    </para>

    <para>
      Please contact
      Paolo Carlini at <email>paolo.carlini@oracle.com</email>
      or
      Jonathan Wakely at <email>jwakely+assign@redhat.com</email>
      if you are confused about the assignment or have general licensing
      questions.
      When requesting an assignment form from
      <email>assign@gnu.org</email>, please CC the libstdc++
      maintainers above so that progress can be monitored.
    </para>
  </section>

  <section xml:id="list.getting"><info><title>Getting Sources</title></info>

    <para>
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://gcc.gnu.org/gitwrite.html">Getting write access
        (look for "Write after approval")</link>
    </para>
  </section>

  <section xml:id="list.patches"><info><title>Submitting Patches</title></info>


    <para>
      Every patch must have several pieces of information before it can be
      properly evaluated. Ideally (and to ensure the fastest possible
      response from the maintainers) it would have all of these pieces:
    </para>

    <itemizedlist>
      <listitem>
        <para>
          A description of the bug and how your patch fixes it.
          For new features, a description of the feature and your
          implementation.
        </para>
      </listitem>

      <listitem>
        <para>
          A ChangeLog entry as part of the Git commit message. Check
          some recent commits for format and content. The
          <filename>contrib/mklog.py</filename> script can be used to
          generate a ChangeLog template for commit messages. See
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://gcc.gnu.org/gitwrite.html">Read-write Git access</link>
          for scripts and aliases that are useful here.
        </para>
      </listitem>

      <listitem>
        <para>
          A testsuite submission or sample program that will
          easily and simply show the existing error or test new
          functionality.
        </para>
      </listitem>

      <listitem>
        <para>
          The patch itself. If you are using the Git repository use
          <command>git show</command> or <command>git format-patch</command>
          to produce a patch;
          otherwise, use <command>diff -cp OLD NEW</command>.
          If your
          version of diff does not support these options, then get the
          latest version of GNU diff.
        </para>
      </listitem>

      <listitem>
        <para>
          When you have all these pieces, bundle them up in a
          mail message and send it to libstdc++@gcc.gnu.org. All
          patches and related discussion should be sent to the
          libstdc++ mailing list. In common with the rest of GCC,
          patches should also be sent to the gcc-patches mailing list.
          For example, you could address your email
          To: libstdc++@gcc.gnu.org and Cc: gcc-patches@gcc.gnu.org.
        </para>
      </listitem>
    </itemizedlist>

  </section>

</section>

<section xml:id="contrib.organization" xreflabel="Source Organization"><info><title>Directory Layout and Source Conventions</title></info>
  <?dbhtml filename="source_organization.html"?>


  <para>
    The <filename class="directory">libstdc++-v3</filename> directory in the
    GCC sources contains the files needed to create the GNU C++ Library.
  </para>

<para>
It has subdirectories:
</para>

<variablelist>
  <varlistentry>
    <term><filename class="directory">doc</filename></term>
    <listitem>
      Files in HTML and text format that document usage, quirks of the
      implementation, and contributor checklists.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">include</filename></term>
    <listitem>
      All header files for the C++ library are within this directory,
      except for specific runtime-related files that are in the libsupc++
      directory.

      <variablelist>
        <varlistentry>
          <term><filename class="directory">include/std</filename></term>
          <listitem>
            Files meant to be found by <code>#include &lt;name&gt;</code> directives
            in standard-conforming user programs.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c</filename></term>
          <listitem>
            Headers intended to directly include standard C headers.
            [NB: this can be enabled via <option>--enable-cheaders=c</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c_global</filename></term>
          <listitem>
            Headers intended to include standard C headers in
            the global namespace, and put select names into the <code>std::</code>
            namespace. [NB: this is the default, and is the same as
            <option>--enable-cheaders=c_global</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c_std</filename></term>
          <listitem>
            Headers intended to include standard C headers
            already in namespace std, and put select names into the <code>std::</code>
            namespace. [NB: this is the same as
            <option>--enable-cheaders=c_std</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/bits</filename></term>
          <listitem>
            Files included by standard headers and by other files in
            the bits directory.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/backward</filename></term>
          <listitem>
            Headers provided for backward compatibility, such as
            <filename class="headerfile">&lt;backward/hash_map&gt;</filename>.
            They are not used in this library.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/ext</filename></term>
          <listitem>
            Headers that define extensions to the standard library. No
            standard header refers to any of them, in theory (there are some
            exceptions).
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>
            <filename class="directory">include/debug</filename> and
            <filename class="directory">include/parallel</filename>
          </term>
          <listitem>
            Headers that implement the Debug Mode and Parallel Mode extensions.
          </listitem>
        </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">scripts</filename></term>
    <listitem>
      Scripts that are used during the configure, build, make, or test
      process.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">src</filename></term>
    <listitem>
      Files that are used in constructing the library, but are not
      installed.

      <variablelist>
        <varlistentry>
          <term><filename class="directory">src/c++98</filename></term>
          <listitem>
            Source files compiled using <option>-std=gnu++98</option>.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/c++11</filename></term>
          <listitem>
            Source files compiled using <option>-std=gnu++11</option>.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/filesystem</filename></term>
          <listitem>
            Source files for the Filesystem TS.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/shared</filename></term>
          <listitem>
            Source code included by other files under both
            <filename class="directory">src/c++98</filename> and
            <filename class="directory">src/c++11</filename>.
          </listitem>
        </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">testsuite/[backward, demangle, ext, performance, thread, 17_* to 30_*]</filename></term>
    <listitem>
      Test programs are here, and may be used to begin to exercise the
      library.
      Support for "make check" and "make check-install" is
      complete; each target runs through all the subdirectories here when
      it is issued from the build directory. Please note that
      "make check" requires DejaGnu 1.4 or later to be installed,
      or for extra <link linkend="test.run.permutations">permutations</link>
      DejaGnu 1.5.3 or later.
    </listitem>
  </varlistentry>
</variablelist>

<para>
Other subdirectories contain variant versions of certain files
that are meant to be copied or linked by the configure script.
Currently these are:
<literallayout><filename class="directory">config/abi</filename>
<filename class="directory">config/allocator</filename>
<filename class="directory">config/cpu</filename>
<filename class="directory">config/io</filename>
<filename class="directory">config/locale</filename>
<filename class="directory">config/os</filename>
</literallayout>
</para>

<para>
In addition, a subdirectory holds the convenience library libsupc++.
</para>

<variablelist>
<varlistentry>
  <term><filename class="directory">libsupc++</filename></term>
  <listitem>
    Contains the runtime library for C++, including exception
    handling and memory allocation and deallocation, RTTI, terminate
    handlers, etc.
  </listitem>
</varlistentry>
</variablelist>

<para>
Note that glibc also has a <filename class="directory">bits/</filename>
subdirectory. We need to be careful not to collide with names in its
<filename class="directory">bits/</filename> directory. For example
<filename class="headerfile">&lt;bits/std_mutex.h&gt;</filename> has to be
renamed from <filename class="headerfile">&lt;bits/mutex.h&gt;</filename>.
Another solution would be to rename <filename class="directory">bits</filename>
to (e.g.) <filename class="directory">cppbits</filename>.
</para>

<para>
In files throughout the system, lines marked with an "XXX" indicate
a bug or incompletely-implemented feature. Lines marked "XXX MT"
indicate a place that may require attention for multi-thread safety.
</para>

</section>

<section xml:id="contrib.coding_style" xreflabel="Coding Style"><info><title>Coding Style</title></info>
  <?dbhtml filename="source_code_style.html"?>

  <para>
  </para>

  <section xml:id="coding_style.bad_identifiers"><info><title>Bad Identifiers</title></info> <!-- BADNAMES -->

    <para>
      Identifiers that conflict and should be avoided.
    </para>

    <literallayout class="normal">
      This is the list of names <quote>reserved to the
      implementation</quote> that have been claimed by certain
      compilers and system headers of interest, and should not be used
      in the library. It will grow, of course. We generally are
      interested in names that are not all-caps, except for those like
      "_T"

      For Solaris:
      _B
      _C
      _L
      _N
      _P
      _S
      _U
      _X
      _E1
      ..
      _E24

      Irix adds:
      _A
      _G

      MS adds:
      _T
      __deref

      BSD adds:
      __used
      __unused
      __inline
      _Complex
      __istype
      __maskrune
      __tolower
      __toupper
      __wchar_t
      __wint_t
      _res
      _res_ext
      __tg_*

      VxWorks adds:
      _C2

      For GCC:

      [Note that this list is out of date. It applies to the old
      name-mangling; in G++ 3.0 and higher a different name-mangling is
      used. In addition, many of the bugs relating to G++ interpreting
      these names as operators have been fixed.]

      The full set of __* identifiers (combined from gcc/cp/lex.c and
      gcc/cplus-dem.c) that are either old or new, but are definitely
      recognized by the demangler, is:

      __aa
      __aad
      __ad
      __addr
      __adv
      __aer
      __als
      __alshift
      __amd
      __ami
      __aml
      __amu
      __aor
      __apl
      __array
      __ars
      __arshift
      __as
      __bit_and
      __bit_ior
      __bit_not
      __bit_xor
      __call
      __cl
      __cm
      __cn
      __co
      __component
      __compound
      __cond
      __convert
      __delete
      __dl
      __dv
      __eq
      __er
      __ge
      __gt
      __indirect
      __le
      __ls
      __lt
      __max
      __md
      __method_call
      __mi
      __min
      __minus
      __ml
      __mm
      __mn
      __mult
      __mx
      __ne
      __negate
      __new
      __nop
      __nt
      __nw
      __oo
      __op
      __or
      __pl
      __plus
      __postdecrement
      __postincrement
      __pp
      __pt
      __rf
      __rm
      __rs
      __sz
      __trunc_div
      __trunc_mod
      __truth_andif
      __truth_not
      __truth_orif
      __vc
      __vd
      __vn

      SGI badnames:
      __builtin_alloca
      __builtin_fsqrt
      __builtin_sqrt
      __builtin_fabs
      __builtin_dabs
      __builtin_cast_f2i
      __builtin_cast_i2f
      __builtin_cast_d2ll
      __builtin_cast_ll2d
      __builtin_copy_dhi2i
      __builtin_copy_i2dhi
      __builtin_copy_dlo2i
      __builtin_copy_i2dlo
      __add_and_fetch
      __sub_and_fetch
      __or_and_fetch
      __xor_and_fetch
      __and_and_fetch
      __nand_and_fetch
      __mpy_and_fetch
      __min_and_fetch
      __max_and_fetch
      __fetch_and_add
      __fetch_and_sub
      __fetch_and_or
      __fetch_and_xor
      __fetch_and_and
      __fetch_and_nand
      __fetch_and_mpy
      __fetch_and_min
      __fetch_and_max
      __lock_test_and_set
      __lock_release
      __lock_acquire
      __compare_and_swap
      __synchronize
      __high_multiply
      __unix
      __sgi
      __linux__
      __i386__
      __i486__
      __cplusplus
      __embedded_cplusplus
      // long double conversion members
      // mangled as __opr
      // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
      __opr
    </literallayout>
  </section>

  <section xml:id="coding_style.example"><info><title>By Example</title></info>

    <literallayout class="normal">
      This library is written to appropriate C++ coding standards. As such,
      it is intended to take precedence over the recommendations of the
      GNU Coding Standard, which can be referenced in full here:

      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.gnu.org/prep/standards/standards.html#Formatting">https://www.gnu.org/prep/standards/standards.html#Formatting</link>

      The rest of this is also interesting reading, but skip the "Design
      Advice" part.

      The GCC coding conventions are here, and are also useful:
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://gcc.gnu.org/codingconventions.html">https://gcc.gnu.org/codingconventions.html</link>

      In addition, because it doesn't seem to be stated explicitly anywhere
      else, there is an 80 column source limit.

      <filename>ChangeLog</filename> entries for member functions should use the
      classname::member function name syntax as follows:

<code>
1999-04-15  Dennis Ritchie  &lt;dr@att.com&gt;

        * src/basic_file.cc (__basic_file::open): Fix thinko in
        _G_HAVE_IO_FILE_OPEN bits.
</code>

      Notable areas of divergence from what may be previous local practice
      (particularly for GNU C) include:

      01. Pointers and references
      <code>
        char* p = "flop";
        char&amp; c = *p;
          -NOT-
        char *p = "flop";  // wrong
        char &amp;c = *p;      // wrong
      </code>

        Reason: In C++, definitions are mixed with executable code. Here,
        <code>p</code> is being initialized, not <code>*p</code>. This is near-universal
        practice among C++ programmers; it is normal for C hackers
        to switch spontaneously as they gain experience.

      02. Operator names and parentheses
      <code>
        operator==(type)
          -NOT-
        operator == (type)  // wrong
      </code>

        Reason: The <code>==</code> is part of the function name. Separating
        it makes the declaration look like an expression.

      03. Function names and parentheses
      <code>
        void mangle()
          -NOT-
        void mangle ()  // wrong
      </code>

        Reason: no space before parentheses (except after a control-flow
        keyword) is near-universal practice for C++. It identifies the
        parentheses as the function-call operator or declarator, as
        opposed to an expression or other overloaded use of parentheses.

      04. Template function indentation
      <code>
        template&lt;typename T&gt;
          void
          template_function(args)
          { }
          -NOT-
        template&lt;class T&gt;
        void template_function(args) {};
      </code>

        Reason: In class definitions, without indentation whitespace is
        needed both above and below the declaration to distinguish
        it visually from other members. (Also, re: "typename"
        rather than "class".) <code>T</code> often could be <code>int</code>, which is
        not a class. ("class", here, is an anachronism.)

      05. Template class indentation
      <code>
        template&lt;typename _CharT, typename _Traits&gt;
          class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template&lt;class _CharT, class _Traits&gt;
        class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template&lt;class _CharT, class _Traits&gt;
          class basic_ios : public ios_base
        {
          public:
            // Types:
        };
      </code>

      06. Enumerators
      <code>
        enum
        {
          space = _ISspace,
          print = _ISprint,
          cntrl = _IScntrl
        };
          -NOT-
        enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
      </code>

      07. Member initialization lists
      All one line, separate from class name.

      <code>
        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
          -NOT-
        gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
      </code>

      08. Try/Catch blocks
      <code>
        try
          {
            //
          }
        catch (...)
          {
            //
          }
          -NOT-
        try {
          //
        } catch(...) {
          //
        }
      </code>

      09. Member function declarations and definitions
      Keywords such as extern, static, export, explicit, inline, etc.
      go on the line above the function name. Thus

      <code>
        virtual int
        foo()
          -NOT-
        virtual int foo()
      </code>

        Reason: GNU coding conventions dictate that, for definitions,
        the return type of a function goes on a separate line from the
        function name and parameter list. For C++, where we have member
        functions that can be either inline definitions or declarations,
        keeping to this standard allows all member function names for a
        given class to be aligned to the same margin, increasing readability.


      10. Invocation of member functions with "this-&gt;"
      For non-uglified names, use <code>this-&gt;name</code> to call the function.

      <code>
        this-&gt;sync()
          -NOT-
        sync()
      </code>

        Reason: Koenig lookup.

      11. Namespaces
      <code>
        namespace std
        {
          blah blah blah;
        } // namespace std

          -NOT-

        namespace std {
          blah blah blah;
        } // namespace std
      </code>

      12. Spacing under protected and private in class declarations:
      space above, none below
      i.e.

      <code>
        public:
          int foo;

          -NOT-
        public:

          int foo;
      </code>

      13. Spacing WRT return statements.
      no extra spacing before returns, no parentheses
      i.e.

      <code>
        }
        return __ret;

          -NOT-
        }

        return __ret;

          -NOT-

        }
        return (__ret);
      </code>


      14. Location of global variables.
      All global variables of class type, whether in the "user visible"
      space (e.g., <code>cin</code>) or the implementation namespace, must be defined
      as a character array with the appropriate alignment and then later
      re-initialized to the correct value.

      This is due to startup issues on certain platforms, such as AIX.
      For more explanation and examples, see <filename>src/globals.cc</filename>. All such
      variables should be contained in that file, for simplicity.

      15. Exception abstractions
      Use the exception abstractions found in <filename class="headerfile">functexcept.h</filename>, which allow
      C++ programmers to use this library with <literal>-fno-exceptions</literal>. (Even if
      that is rarely advisable, it's a necessary evil for backwards
      compatibility.)

      16. Exception error messages
      All start with the name of the function where the exception is
      thrown, and then (optional) descriptive text is added. Example:

      <code>
        __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
      </code>

        Reason: The verbose terminate handler prints out <code>exception::what()</code>,
        as well as the typeinfo for the thrown exception. As this is the
        default terminate handler, by putting location info into the
        exception string, a very useful error message is printed out for
        uncaught exceptions. So useful, in fact, that non-programmers can
        give useful error messages, and programmers can intelligently
        speculate what went wrong without even using a debugger.

      17. The doxygen style guide to comments is a separate document,
      see index.

      The library currently has a mixture of GNU-C and modern C++ coding
      styles. The GNU C usages will be combed out gradually.

      Name patterns:

      For nonstandard names appearing in Standard headers, we are constrained
      to use names that begin with underscores. This is called "uglification".
      The convention is:

      Local and argument names: <literal>__[a-z].*</literal>

        Examples: <code>__count __ix __s1</code>

      Type names and template formal-argument names: <literal>_[A-Z][^_].*</literal>

        Examples: <code>_Helper _CharT _N</code>

      Member data and function names: <literal>_M_.*</literal>

        Examples: <code>_M_num_elements _M_initialize()</code>

      Static data members, constants, and enumerations: <literal>_S_.*</literal>

        Examples: <code>_S_max_elements _S_default_value</code>

      Don't use names in the same scope that differ only in the prefix,
      e.g. _S_top and _M_top. See <link linkend="coding_style.bad_identifiers">BADNAMES</link> for a list of forbidden names.
      (The most tempting of these seem to be "_T" and "__sz".)

      Names must never have "__" internally; it would confuse name
      unmanglers on some targets. Also, never use "__[0-9]", same reason.

      --------------------------

      [BY EXAMPLE]
      <code>

      #ifndef _HEADER_
      #define _HEADER_ 1

      namespace std
      {
        class gribble
        {
        public:
          gribble() throw();

          gribble(const gribble&amp;);

          explicit
          gribble(int __howmany);

          gribble&amp;
          operator=(const gribble&amp;);

          virtual
          ~gribble() throw();

          // Start with a capital letter, end with a period.
          inline void
          public_member(const char* __arg) const;

          // In-class function definitions should be restricted to one-liners.
          int
          one_line() { return 0; }

          int
          two_lines(const char* arg)
          { return strchr(arg, 'a') != 0; }

          inline int
          three_lines();  // inline, but defined below.

          // Note indentation.
          template&lt;typename _Formal_argument&gt;
            void
            public_template() const throw();

          template&lt;typename _Iterator&gt;
            void
            other_template();

        private:
          class _Helper;

          int _M_private_data;
          int _M_more_stuff;
          _Helper* _M_helper;
          int _M_private_function();

          enum _Enum
            {
              _S_one,
              _S_two
            };

          static void
          _S_initialize_library();
        };

        // More-or-less-standard language features described by lack, not presence.
      # ifndef _G_NO_LONGLONG
        extern long long _G_global_with_a_good_long_name;  // avoid globals!
      # endif

        // Avoid in-class inline definitions, define separately;
        // likewise for member class definitions:
        inline int
        gribble::public_member() const
        { int __local = 0; return __local; }

        class gribble::_Helper
        {
          int _M_stuff;

          friend class gribble;
        };
      }

      // Names beginning with "__": only for arguments and
      //   local variables; never use "__" in a type name, or
      //   within any name; never use "__[0-9]".

      #endif /* _HEADER_ */


      namespace std
      {
        template&lt;typename T&gt;  // notice: "typename", not "class", no space
          long_return_value_type&lt;with_many, args&gt;
          function_name(char* pointer,               // "char *pointer" is wrong.
                        char* argument,
                        const Reference&amp; ref)
          {
            // int a_local;  /* wrong; see below. */
            if (test)
            {
              nested code
            }

            int a_local = 0;  // declare variable at first use.

            //  char a, b, *p;   /* wrong */
            char a = 'a';
            char b = a + 1;
            char* c = "abc";  // each variable goes on its own line, always.

            // except maybe here...
            for (unsigned i = 0, mask = 1; mask; ++i, mask &lt;&lt;= 1) {
              // ...
            }
          }

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

        int
        gribble::three_lines()
        {
          // doesn't fit in one line.
        }
      } // namespace std
      </code>
    </literallayout>
  </section>
</section>

<section xml:id="contrib.design_notes" xreflabel="Design Notes"><info><title>Design Notes</title></info>
  <?dbhtml filename="source_design_notes.html"?>

  <para>
  </para>

  <literallayout class="normal">

    The Library
    -----------

    This paper covers two major areas:

    - Features and policies not mentioned in the standard that
    the quality of the library implementation depends on, including
    extensions and "implementation-defined" features;

    - Plans for required but unimplemented library features and
    optimizations to them.

    Overhead
    --------

    The standard defines a large library, much larger than the standard
    C library. A naive implementation would suffer substantial overhead
    in compile time, executable size, and speed, rendering it unusable
    in many (particularly embedded) applications. The alternative demands
    care in construction, and some compiler support, but there is no
    need for library subsets.

    What are the sources of this overhead? There are four main causes:

    - The library is specified almost entirely as templates, which
    with current compilers must be included in-line, resulting in
    very slow builds as tens or hundreds of thousands of lines
    of function definitions are read for each user source file.
    Indeed, the entire SGI STL, as well as the dos Reis valarray,
    are provided purely as header files, largely for simplicity in
    porting. Iostream/locale is (or will be) as large again.

    - The library is very flexible, specifying a multitude of hooks
    where users can insert their own code in place of defaults.
    When these hooks are not used, any time and code expended to
    support that flexibility is wasted.

    - Templates are often described as causing "code bloat". In
    practice, this refers (when it refers to anything real) to several
    independent processes. First, when a class template is manually
    instantiated in its entirety, current compilers place the definitions
    for all members in a single object file, so that a program linking
    to one member gets definitions of all. Second, template functions
    which do not actually depend on the template argument are, under
    current compilers, generated anew for each instantiation, rather
    than being shared with other instantiations. Third, some of the
    flexibility mentioned above comes from virtual functions (both in
    regular classes and template classes) which current linkers add
    to the executable file even when they manifestly cannot be called.

    - The library is specified to use a language feature, exceptions,
    which in the current gcc compiler ABI imposes a run time and
    code space cost to handle the possibility of exceptions even when
    they are not used. Under the new ABI (accessed with -fnew-abi),
    there is a space overhead and a small reduction in code efficiency
    resulting from lost optimization opportunities associated with
    non-local branches associated with exceptions.

    What can be done to eliminate this overhead? A variety of coding
    techniques, and compiler, linker and library improvements and
    extensions may be used, as covered below. Most are not difficult,
    and some are already implemented in varying degrees.

    Overhead: Compilation Time
    --------------------------

    Providing "ready-instantiated" template code in object code archives
    allows us to avoid generating and optimizing template instantiations
    in each compilation unit which uses them.
    However, the number of such
    instantiations that are useful to provide is limited, and anyway this
    is not enough, by itself, to minimize compilation time. In particular,
    it does not reduce time spent parsing conforming headers.

    Quicker header parsing will depend on library extensions and compiler
    improvements. One approach is some variation on the techniques
    previously marketed as "pre-compiled headers", now standardized as
    support for the "export" keyword. "Exported" template definitions
    can be placed (once) in a "repository" -- really just a library, but
    of template definitions rather than object code -- to be drawn upon
    at link time when an instantiation is needed, rather than placed in
    header files to be parsed along with every compilation unit.

    Until "export" is implemented we can put some of the lengthy template
    definitions in #if guards or alternative headers so that users can skip
    over the full definitions when they need only the ready-instantiated
    specializations.

    To be precise, this means that certain headers which define
    templates which users normally use only for certain arguments
    can be instrumented to avoid exposing the template definitions
    to the compiler unless a macro is defined. For example, in
    <string>, we might have:

      template <class _CharT, ... > class basic_string {
        ... // member declarations
      };
      ... // operator declarations

      #ifdef _STRICT_ISO_
      # if _G_NO_TEMPLATE_EXPORT
      #   include <bits/std_locale.h> // headers needed by definitions
      #   ...
      #   include <bits/string.tcc> // member and global template definitions.
      # endif
      #endif

    Users who compile without specifying a strict-ISO-conforming flag
    would not see many of the template definitions they now see, and would
    rely instead on ready-instantiated specializations in the library.
    This
    technique would be useful for the following substantial components:
    string, locale/iostreams, valarray. It would *not* be useful or
    usable with the following: containers, algorithms, iterators,
    allocator. Since these constitute a large (though decreasing)
    fraction of the library, the benefit the technique offers is
    limited.

    The language specifies the semantics of the "export" keyword, but
    the gcc compiler does not yet support it. When it does, problems
    with large template inclusions can largely disappear, given some
    minor library reorganization, along with the need for the apparatus
    described above.

    Overhead: Flexibility Cost
    --------------------------

    The library offers many places where users can specify operations
    to be performed by the library in place of defaults. Sometimes
    this seems to require that the library use a more-roundabout, and
    possibly slower, way to accomplish the default requirements than
    would be used otherwise.

    The primary protection against this overhead is thorough compiler
    optimization, to crush out layers of inline function interfaces.
    Kuck & Associates has demonstrated the practicality of this kind
    of optimization.

    The second line of defense against this overhead is explicit
    specialization. By defining helper function templates, and writing
    specialized code for the default case, overhead can be eliminated
    for that case without sacrificing flexibility. This takes full
    advantage of any ability of the optimizer to crush out degenerate
    code.

    The library specifies many virtual functions which current linkers
    load even when they cannot be called. Some minor improvements to the
    compiler and to ld would eliminate any such overhead by simply
    omitting virtual functions that the complete program does not call.
    A prototype of this work has already been done. For targets where
    GNU ld is not used, a "pre-linker" could do the same job.

    The main areas in the standard interface where user flexibility
    can result in overhead are:

    - Allocators: Containers are specified to use user-definable
      allocator types and objects, making tuning for the container
      characteristics tricky.

    - Locales: the standard specifies locale objects used to implement
      iostream operations, involving many virtual functions which use
      streambuf iterators.

    - Algorithms and containers: these may be instantiated on any type,
      frequently duplicating code for identical operations.

    - Iostreams and strings: users are permitted to use these on their
      own types, and specify the operations the stream must use on these
      types.

    Note that these sources of overhead are _avoidable_. The techniques
    to avoid them are covered below.

    Code Bloat
    ----------

    In the SGI STL, and in some other headers, many templates that
    should not be inline are defined "inline" -- either explicitly or
    by their placement in class definitions. This is a
    source of code bloat. Matt had remarked that he was relying on
    the compiler to recognize what was too big to benefit from inlining,
    and generate it out-of-line automatically. However, this also can
    result in code bloat except where the linker can eliminate the extra
    copies.

    Fixing these cases will require an audit of all inline functions
    defined in the library to determine which merit inlining, and moving
    the rest out of line. This is an issue mainly in clauses 23, 25, and
    27. Of course it can be done incrementally, and we should generally
    accept patches that move large functions out of line and into ".tcc"
    files, which can later be pulled into a repository.
    Compiler/linker
    improvements to recognize very large inline functions and move them
    out-of-line, but shared among compilation units, could make this
    work unnecessary.

    Pre-instantiating template specializations currently produces large
    amounts of dead code which bloats statically linked programs. The
    current state of the static library, libstdc++.a, is intolerable on
    this account, and will fuel further confused speculation about a need
    for a library "subset". A compiler improvement that treats each
    instantiated function as a separate object file, for linking purposes,
    would be one solution to this problem. An alternative would be to
    split up the manual instantiation files into dozens upon dozens of
    little files, each compiled separately, but an abortive attempt at
    this was done for <string> and, though it is far from complete, it
    is already a nuisance. A better interim solution (just until we have
    "export") is badly needed.

    When building a shared library, the current compiler/linker cannot
    automatically generate the instantiations needed. This creates a
    miserable situation; it means any time something is changed in the
    library, before a shared library can be built someone must manually
    copy the declarations of all templates that are needed by other parts
    of the library to an "instantiation" file, and add it to the build
    system to be compiled and linked to the library. This process is
    readily automated, and should be automated as soon as possible.
    Users building their own shared libraries experience identical
    frustrations.

    Sharing common aspects of template definitions among instantiations
    can radically reduce code bloat.
    The compiler could help a great
    deal here by recognizing when a function depends on nothing about
    a template parameter, or only on its size, and giving the resulting
    function a link-name "equate" that allows it to be shared with other
    instantiations. Implementation code could take advantage of the
    capability by factoring out code that does not depend on the template
    argument into separate functions to be merged by the compiler.

    Until such a compiler optimization is implemented, much can be done
    manually (if tediously) in this direction. One such optimization is
    to derive class templates from non-template classes, and move as much
    implementation as possible into the base class. Another is to
    partial-specialize certain common instantiations, such as vector<T*>,
    to share code for instantiations on all types T. While these
    techniques work, they are far from the complete solution that a
    compiler improvement would afford.

    Overhead: Expensive Language Features
    -------------------------------------

    The main "expensive" language feature used in the standard library
    is exception support, which requires compiling in cleanup code with
    static table data to locate it, and linking in library code to use
    the table. For small embedded programs the amount of such library
    code and table data is assumed by some to be excessive. Under the
    "new" ABI this perception is generally exaggerated, although in some
    cases it may actually be excessive.

    To implement a library which does not use exceptions directly is
    not difficult given minor compiler support (to "turn off" exceptions
    and ignore exception constructs), and results in no great library
    maintenance difficulties. To be precise, given "-fno-exceptions",
    the compiler should treat "try" blocks as ordinary blocks, and
    "catch" blocks as dead code to ignore or eliminate.
    Compiler
    support is not strictly necessary, except in the case of "function
    try blocks"; otherwise the following macros almost suffice:

      #define throw(X)
      #define try if (true)
      #define catch(X) else if (false)

    However, there may be a need to use function try blocks in the
    library implementation, and use of macros in this way can make
    correct diagnostics impossible. Furthermore, use of this scheme
    would require the library to call a function to re-throw exceptions
    from a try block. Implementing the above semantics in the compiler
    is preferable.

    Given the support above (however implemented) it only remains to
    replace code that "throws" with a call to a well-documented "handler"
    function in a separate compilation unit which may be replaced by
    the user. The main source of exceptions that would be difficult
    for users to avoid is memory allocation failures, but users can
    define their own memory allocation primitives that never throw.
    Otherwise, the complete list of such handlers, and which library
    functions may call them, would be needed for users to be able to
    implement the necessary substitutes. (Fortunately, they have the
    source code.)

    Opportunities
    -------------

    The template capabilities of C++ offer enormous opportunities for
    optimizing common library operations, well beyond what would be
    considered "eliminating overhead". In particular, many operations
    done in Glibc with macros that depend on proprietary language
    extensions can be implemented in pristine Standard C++. For example,
    the chapter 25 algorithms, and even C library functions such as strchr,
    can be specialized for the case of static arrays of known (small) size.

    Detailed optimization opportunities are identified below where
    the component where they would appear is discussed.
    Of course new
    opportunities will be identified during implementation.

    Unimplemented Required Library Features
    ---------------------------------------

    The standard specifies hundreds of components, grouped broadly by
    chapter. These are listed in excruciating detail in the CHECKLIST
    file.

      17 general
      18 support
      19 diagnostics
      20 utilities
      21 string
      22 locale
      23 containers
      24 iterators
      25 algorithms
      26 numerics
      27 iostreams
      Annex D backward compatibility

    Anyone participating in implementation of the library should obtain
    a copy of the standard, ISO 14882. People in the U.S. can obtain an
    electronic copy for US$18 from ANSI's web site. Those from other
    countries should visit http://www.iso.org/ to find out the location
    of their country's representation in ISO, in order to know who can
    sell them a copy.

    The emphasis in the following sections is on unimplemented features
    and optimization opportunities.

    Chapter 17  General
    -------------------

    Chapter 17 concerns overall library requirements.

    The standard doesn't mention threads. A multi-thread (MT) extension
    primarily affects operators new and delete (18), allocator (20),
    string (21), locale (22), and iostreams (27). The common underlying
    support needed for this is discussed under chapter 20.

    The standard requirements on names from the C headers create a
    lot of work, mostly done. Names in the C headers must be visible
    in the std:: and sometimes the global namespace; the names in the
    two scopes must refer to the same object. More stringent is that
    Koenig lookup implies that any types specified as defined in std::
    really are defined in std::. Names optionally implemented as
    macros in C cannot be macros in C++. (An overview may be read at
    <http://www.cantrip.org/cheaders.html>).
    The scripts "inclosure"
    and "mkcshadow", and the directories shadow/ and cshadow/, are the
    beginning of an effort to conform in this area.

    A correct conforming definition of C header names based on underlying
    C library headers, and practical linking of conforming namespaced
    customer code with third-party C libraries depend ultimately on
    an ABI change, allowing namespaced C type names to be mangled into
    type names as if they were global, somewhat as C function names in a
    namespace, or C++ global variable names, are left unmangled. Perhaps
    another "extern" mode, such as 'extern "C-global"' would be an
    appropriate place for such type definitions. Such a type would
    affect mangling as follows:

      namespace A {
        struct X {};
        extern "C-global" { // or maybe just 'extern "C"'
          struct Y {};
        };
      }
      void f(A::X*); // mangles to f__FPQ21A1X
      void f(A::Y*); // mangles to f__FP1Y

    (It may be that this is really the appropriate semantics for regular
    'extern "C"', and 'extern "C-global"', as an extension, would not be
    necessary.) This would allow functions declared in non-standard C headers
    (and thus fixable by neither us nor users) to link properly with functions
    declared using C types defined in properly-namespaced headers. The
    problem this solves is that C headers (which C++ programmers do persist
    in using) frequently forward-declare C struct tags without including
    the header where the type is defined, as in

      struct tm;
      void munge(tm*);

    Without some compiler accommodation, munge cannot be called by correct
    C++ code using a pointer to a correctly-scoped tm value.

    The current C headers use the preprocessor extension "#include_next",
    which the compiler complains about when run with "-pedantic".
    (Incidentally, it appears that "-fpedantic" is currently ignored,
    probably a bug.)
    The solution in the C compiler is to use
    "-isystem" rather than "-I", but unfortunately in g++ this seems
    also to wrap the whole header in an 'extern "C"' block, so it's
    unusable for C++ headers. The correct solution appears to be to
    allow the various special include-directory options, if not given
    an argument, to affect subsequent include-directory options additively,
    so that if one said

      -pedantic -iprefix $(prefix) \
      -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
      -iwithprefix -I g++-v3/ext

    the compiler would search $(prefix)/g++-v3 and not report
    pedantic warnings for files found there, but treat files in
    $(prefix)/g++-v3/ext pedantically. (The undocumented semantics
    of "-isystem" in g++ stink. Can they be rescinded? If not it
    must be replaced with something more rationally behaved.)

    All the C headers need the treatment above; in the standard these
    headers are mentioned in various clauses. Below, I have only
    mentioned those that present interesting implementation issues.

    The components identified as "mostly complete", below, have not been
    audited for conformance. In many cases where the library passes
    conformance tests we have non-conforming extensions that must be
    wrapped in #if guards for "pedantic" use, and in some cases renamed
    in a conforming way for continued use in the implementation regardless
    of conformance flags.

    The STL portion of the library still depends on a header
    stl/bits/stl_config.h full of #ifdef clauses. This apparatus
    should be replaced with autoconf/automake machinery.

    The SGI STL defines a type_traits<> template, specialized for
    many types in their code including the built-in numeric and
    pointer types and some library types, to direct optimizations of
    standard functions.
    The SGI compiler has been extended to generate
    specializations of this template automatically for user types,
    so that use of STL templates on user types can take advantage of
    these optimizations. Specializations for other, non-STL, types
    would make more optimizations possible, but extending the gcc
    compiler in the same way would be much better. The next round of
    standardization will probably ratify this, though with changes,
    so it should be renamed to place it in the
    implementation namespace.

    The SGI STL also defines a large number of extensions visible in
    standard headers. (Other extensions that appear in separate headers
    have been sequestered in subdirectories ext/ and backward/.) All
    these extensions should be moved to other headers where possible,
    and in any case wrapped in a namespace (not std!), and (where kept
    in a standard header) girded about with macro guards. Some cannot be
    moved out of standard headers because they are used to implement
    standard features. The canonical method for accommodating these
    is to use a protected name, aliased in macro guards to a user-space
    name. Unfortunately C++ offers no satisfactory template typedef
    mechanism, so very ad-hoc and unsatisfactory aliasing must be used
    instead.

    Implementation of a template typedef mechanism should have the highest
    priority among possible extensions, on the same level as implementation
    of the template "export" feature.

    Chapter 18  Language support
    ----------------------------

    Headers: <limits> <new> <typeinfo> <exception>
    C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
               <ctime> <csignal> <cstdlib> (also 21, 25, 26)

    This defines the built-in exceptions, rtti, numeric_limits<>,
    operator new and delete. Much of this is provided by the
    compiler in its static runtime library.

    Work to do includes defining numeric_limits<> specializations in
    separate files for all target architectures. Values for integer types
    except for bool and wchar_t are readily obtained from the C header
    <limits.h>, but values for the remaining numeric types (bool, wchar_t,
    float, double, long double) must be entered manually. This is
    largely dog work except for those members whose values are not
    easily deduced from available documentation. Also, this involves
    some work in target configuration to identify the correct choice of
    file to build against and to install.

    The definitions of the various operators new and delete must be
    made thread-safe, which depends on a portable exclusion mechanism,
    discussed under chapter 20. Of course there is always plenty of
    room for improvements to the speed of operators new and delete.

    <cstdarg>, in Glibc, defines some macros that gcc does not allow to
    be wrapped into an inline function. Probably this header will demand
    attention whenever a new target is chosen. The functions atexit(),
    exit(), and abort() in cstdlib have different semantics in C++, so
    must be re-implemented for C++.

    Chapter 19  Diagnostics
    -----------------------

    Headers: <stdexcept>
    C headers: <cassert> <cerrno>

    This defines the standard exception objects, which are "mostly complete".
    Cygnus has a version, and now SGI provides a slightly different one.
    It makes little difference which we use.

    The C global name "errno", which C allows to be a variable or a macro,
    is required in C++ to be a macro. For MT it must typically result in
    a function call.

    Chapter 20  Utilities
    ---------------------
    Headers: <utility> <functional> <memory>
    C header: <ctime> (also in 18)

    SGI STL provides "mostly complete" versions of all the components
    defined in this chapter.
    However, the auto_ptr<> implementation
    is known to be wrong. Furthermore, the standard definition of it
    is known to be unimplementable as written. A minor change to the
    standard would fix it, and auto_ptr<> should be adjusted to match.

    Multi-threading affects the allocator implementation, and there must
    be configuration/installation choices for different users' MT
    requirements. Anyway, users will want to tune allocator options
    to support different target conditions, MT or not.

    The primitives used for MT implementation should be exposed, as an
    extension, for users' own work. We need cross-CPU "mutex" support,
    multi-processor shared-memory atomic integer operations, and
    single-processor uninterruptible integer operations, and all three
    configurable to be stubbed out for non-MT use, or to use an
    appropriately-loaded dynamic library for the actual runtime
    environment, or statically compiled in for cases where the target
    architecture is known.

    Chapter 21  String
    ------------------
    Headers: <string>
    C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
               <cstdlib> (also in 18, 25, 26)

    We have "mostly-complete" char_traits<> implementations. Many of the
    char_traits<char> operations might be optimized further using existing
    proprietary language extensions.

    We have a "mostly-complete" basic_string<> implementation. The work
    to manually instantiate char and wchar_t specializations in object
    files to improve link-time behavior is extremely unsatisfactory,
    literally tripling library-build time with no commensurate improvement
    in static program link sizes. It must be redone. (Similar work is
    needed for some components in clauses 22 and 27.)

    Other work needed for strings is MT-safety, as discussed under the
    chapter 20 heading.

    The standard C type mbstate_t from <cwchar> and used in char_traits<>
    must be different in C++ than in C, because in C++ the default constructor
    value mbstate_t() must be the "base" or "ground" sequence state.
    (According to the likely resolution of a recently raised Core issue,
    this may become unnecessary. However, there are other reasons to
    use a state type not as limited as whatever the C library provides.)
    If we might want to provide conversions from (e.g.)
    internally-represented EUC-wide to externally-represented Unicode,
    or vice-versa, the mbstate_t we choose will need to be more
    accommodating than what might be provided by an underlying C library.

    There remain some basic_string template-member functions which do
    not overload properly with their non-template brethren. The infamous
    hack akin to what was done in vector<> is needed, to conform to
    23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
    or incomplete, are so marked for this reason.

    Replacing the string iterators, which currently are simple character
    pointers, with class objects would greatly increase the safety of the
    client interface, and also permit a "debug" mode in which range,
    ownership, and validity are rigorously checked. The current use of
    raw pointers as string iterators is evil. vector<> iterators need the
    same treatment. Note that the current implementation freely mixes
    pointers and iterators, and that must be fixed before safer iterators
    can be introduced.

    Some of the functions in <cstring> differ from the C versions: they
    are generally overloaded on const and non-const argument pointers. For
    example, in <cstring> strchr is overloaded. The functions isupper
    etc. in <cctype>, typically implemented as macros in C, are functions
    in C++, because they are overloaded with others of the same name
    defined in <locale>.
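
The const overloading described for strchr can be illustrated with a short sketch. The namespace sketch and the wrapper name find_char are hypothetical, chosen only for this example; the code assumes nothing beyond std::strchr from <cstring>.

```cpp
#include <cassert>
#include <cstring>

namespace sketch {
  // Hypothetical wrappers mirroring the two C++ signatures of strchr.

  // Searching a const string yields a pointer that cannot be written through.
  inline const char* find_char(const char* s, int c)
  { return std::strchr(s, c); }

  // Searching a mutable buffer yields a writable pointer.
  inline char* find_char(char* s, int c)
  { return std::strchr(s, c); }
}
```

Overload resolution picks the version that matches the constness of the argument, so const-correctness is preserved without a cast. The single C signature cannot express this, because it accepts a const pointer and returns a non-const one.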

    Many of the functions required in <cwctype> and <cwchar> cannot be
    implemented using underlying C facilities on intended targets because
    such facilities only partly exist.

    Chapter 22  Locale
    ------------------
    Headers: <locale>
    C headers: <clocale>

    We have a "mostly complete" class locale, with the exception of
    code for constructing, and handling the names of, named locales.
    The ways that locales are named (particularly when categories
    (e.g. LC_TIME, LC_COLLATE) are different) vary among all target
    environments. This code must be written in various versions and
    chosen by configuration parameters.

    Members of many of the facets defined in <locale> are stubs. Generally,
    there are two sets of facets: the base class facets (which are supposed
    to implement the "C" locale) and the "byname" facets, which are supposed
    to read files to determine their behavior. The base ctype<>, collate<>,
    and numpunct<> facets are "mostly complete", except that the table of
    bitmask values used for "is" operations, and corresponding mask values,
    are still defined in libio and just included/linked. (We will need to
    implement these tables independently, soon, but should take advantage
    of libio where possible.) The num_put<>::put members for integer types
    are "mostly complete".

    A complete list of what has and has not been implemented may be
    found in CHECKLIST. However, note that the current definition of
    codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
    out the raw bytes representing the wide characters, rather than
    trying to convert each to a corresponding single "char" value.

    Some of the facets are more important than others.
    Specifically,
    the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
    are used by other library facilities defined in <string>, <istream>,
    and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
    in <fstream>, so a conforming iostream implementation depends on
    these.

    The "long long" type eventually must be supported, but code mentioning
    it should be wrapped in #if guards to allow pedantic-mode compiling.

    Performance of num_put<> and num_get<> depends critically on
    caching computed values in ios_base objects, and on extensions
    to the interface with streambufs.

    Specifically: retrieving a copy of the locale object, extracting
    the needed facets, and gathering data from them, for each call to
    (e.g.) operator<< would be prohibitively slow. To cache format
    data for use by num_put<> and num_get<> we have a _Format_cache<>
    object stored in the ios_base::pword() array. This is constructed
    and initialized lazily, and is organized purely for utility. It
    is discarded when a new locale with different facets is imbued.

    Using only the public interfaces of the iterator arguments to the
    facet functions would limit performance by forbidding "vector-style"
    character operations. The streambuf iterator optimizations are
    described under chapter 24, but facets can also bypass the streambuf
    iterators via explicit specializations and operate directly on the
    streambufs, and use extended interfaces to get direct access to the
    streambuf internal buffer arrays. These extensions are mentioned
    under chapter 27. These optimizations are particularly important
    for input parsing.

    Unused virtual members of locale facets can be omitted, as mentioned
    above, by a smart linker.
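
The lazily built format cache kept in the ios_base::pword() array, as described in this chapter, can be sketched with the standard xalloc(), pword(), iword() and register_callback() members. This is a minimal illustration under stated assumptions, not the library's actual _Format_cache<>: the names format_cache, cache_slot, cache_event and get_cache are hypothetical, and only two numpunct<char> values are cached.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

namespace sketch {
  // Hypothetical cache record; a real cache would hold far more.
  struct format_cache {
    char decimal_point;
    char thousands_sep;
  };

  // One slot index, shared by every stream, used for pword and iword.
  inline int cache_slot()
  {
    static const int slot = std::ios_base::xalloc();
    return slot;
  }

  // Discards the cache when a locale is imbued or the stream is destroyed.
  inline void cache_event(std::ios_base::event ev, std::ios_base& io, int slot)
  {
    if (ev != std::ios_base::copyfmt_event)  // imbue_event or erase_event
      delete static_cast<format_cache*>(io.pword(slot));
    // After copyfmt the slot holds the source stream's pointer, which
    // this stream does not own, so it is dropped without deleting.
    io.pword(slot) = 0;
  }

  // Returns the stream's cache, building it on first use.
  inline const format_cache& get_cache(std::ios_base& io)
  {
    const int slot = cache_slot();
    if (io.pword(slot) == 0) {
      if (io.iword(slot) == 0) {     // register the callback only once
        io.register_callback(&cache_event, slot);
        io.iword(slot) = 1;
      }
      const std::numpunct<char>& np =
        std::use_facet<std::numpunct<char> >(io.getloc());
      format_cache* c = new format_cache;
      c->decimal_point = np.decimal_point();
      c->thousands_sep = np.thousands_sep();
      io.pword(slot) = c;
    }
    return *static_cast<format_cache*>(io.pword(slot));
  }
}
```

A formatting routine would call get_cache(stream) on each operation and pay for the facet lookup only on the first call after construction or after imbue(). This sketch discards the cache on every imbue rather than comparing facets, which is simpler but slightly more conservative than the behavior described above.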

    Chapter 23  Containers
    ----------------------
    Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>

    All the components in chapter 23 are implemented in the SGI STL.
    They are "mostly complete"; they include a large number of
    nonconforming extensions which must be wrapped. Some of these
    are used internally and must be renamed or duplicated.

    The SGI components are optimized for large-memory environments. For
    embedded targets, different criteria might be more appropriate. Users
    will want to be able to tune this behavior. We should provide
    ways for users to compile the library with different memory usage
    characteristics.

    A lot more work is needed on factoring out common code from different
    specializations to reduce code size here and in chapter 25. The
    easiest fix for this would be a compiler/ABI improvement that allows
    the compiler to recognize when a specialization depends only on the
    size (or other gross quality) of a template argument, and allow the
    linker to share the code with similar specializations. In its
    absence, many of the algorithms and containers can be
    partial-specialized, at least for the case of pointers, but this
    only solves a small part of the problem. Use of a type_traits-style
    template allows a few more optimization opportunities, more if the
    compiler can generate the specializations automatically.

    As an optimization, containers can specialize on the default allocator
    and bypass it, or take advantage of details of its implementation
    after it has been improved upon.

    Replacing the vector iterators, which currently are simple element
    pointers, with class objects would greatly increase the safety of the
    client interface, and also permit a "debug" mode in which range,
    ownership, and validity are rigorously checked.
    The current use of
    pointers for iterators is evil.

    As mentioned for chapter 24, the deque iterator is a good example of
    an opportunity to implement a "staged" iterator that would benefit
    from specializations of some algorithms.

    Chapter 24  Iterators
    ---------------------
    Headers: <iterator>

    Standard iterators are "mostly complete", with the exception of
    the stream iterators, which are not yet templatized on the
    stream type. Also, the base class template iterator<> appears
    to be wrong, so everything derived from it must also be wrong,
    currently.

    The streambuf iterators (currently located in stl/bits/std_iterator.h,
    but should be under bits/) can be rewritten to take advantage of
    friendship with the streambuf implementation.

    Matt Austern has identified opportunities where certain iterator
    types, particularly including streambuf iterators and deque
    iterators, have a "two-stage" quality, such that an intermediate
    limit can be checked much more quickly than the true limit on
    range operations. If identified with a member of iterator_traits,
    algorithms may be specialized for this case. Of course the
    iterators that have this quality can be identified by specializing
    a traits class.

    Many of the algorithms must be specialized for the streambuf
    iterators, to take advantage of block-mode operations, in order
    to allow iostream/locale operations' performance not to suffer.
    It may be that they could be treated as staged iterators and
    take advantage of those optimizations.

    Chapter 25  Algorithms
    ----------------------
    Headers: <algorithm>
    C headers: <cstdlib> (also in 18, 21, 26)

    The algorithms are "mostly complete". As mentioned above, they
    are optimized for speed at the expense of code and data size.

Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.

The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.

Chapter 26  Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also 18, 21, 25)

Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate. It
is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.

The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.
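
To show how little code such a re-implementation of div() and
ldiv() needs, here is a minimal sketch. The namespace sketch and
its div_t and ldiv_t types are hypothetical stand-ins for the
library's own declarations; this is not the code the library uses.
It assumes that integer division truncates toward zero, which is
guaranteed from C++11 onward and was implementation-defined for
negative operands before that.

```cpp
#include <cassert>

namespace sketch {  // hypothetical namespace, not part of the library

// Result types declared separately from the C library's ::div_t and
// ::ldiv_t, mirroring the "different type" requirement noted above.
struct div_t {
    int quot;
    int rem;
};

struct ldiv_t {
    long quot;
    long rem;
};

// Quotient and remainder in one call, returned by value.
inline div_t div(int numer, int denom) {
    assert(denom != 0);  // division by zero is undefined, as in C
    div_t result;
    result.quot = numer / denom;
    result.rem = numer % denom;
    return result;
}

inline ldiv_t ldiv(long numer, long denom) {
    assert(denom != 0);
    ldiv_t result;
    result.quot = numer / denom;
    result.rem = numer % denom;
    return result;
}

}  // namespace sketch
```

For any nonzero denominator the result satisfies
quot * denom + rem == numer, as it does for the C functions.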

Chapter 27  Iostreams
---------------------
Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
<iomanip> <sstream> <fstream>
C headers: <cstdio> <cwchar> (also in 21)

Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream>, and <fstream> have been
started; basic_filebuf<> "write" functions have been implemented just
enough to do "hello, world".

Most of the istream and ostream operators << and >> (with the exception
of the op<<(integer) ones) have not been changed to use locale primitives,
sentry objects, or char_traits members.

All these templates should be manually instantiated for char and
wchar_t in a way that links only used members into user programs.

Streambuf is fertile ground for optimization extensions. An extended
interface giving iterator access to its internal buffer would be very
useful for other library components.

Iostream operations (primarily operators << and >>) can take advantage
of the case where user code has not specified a locale, and bypass locale
operations entirely. The current implementation of op<</num_put<>::put,
for the integer types, demonstrates how they can cache encoding details
from the locale on each operation. There is much more room for
optimization in this area.

The definition of the relationship between the standard streams
cout et al. and stdout et al. requires something like a "stdiobuf".
The SGI solution of using double-indirection to actually use a
stdio FILE object for buffering is unsatisfactory, because it
interferes with peephole loop optimizations.

The <sstream> header work has begun.
stringbuf can benefit from friendship with basic_string<> and
basic_string<>::_Rep to use those objects directly as buffers,
and avoid allocating and making copies.

The basic_filebuf<> template is a complex beast. It is specified to
use the locale facet codecvt<> to translate characters between native
files and the locale character encoding. In general this involves
two buffers, one of "char" representing the file and another of
"char_type", for the stream, with codecvt<> translating. The process
is complicated by the variable-length nature of the translation, and
the need to seek to corresponding places in the two representations.
For the case of basic_filebuf<char>, when no translation is needed,
a single buffer suffices. A specialized filebuf can be used to reduce
code space overhead when no locale has been imbued. Matt Austern's
work at SGI will be useful, perhaps directly as a source of code, or
at least as an example to draw on.

Filebuf, almost uniquely (cf. operator new), depends heavily on
underlying environmental facilities. In current releases iostream
depends fairly heavily on libio constant definitions, but it should
be made independent. It also depends on operating system primitives
for file operations. There is immense room for optimizations using
(e.g.) mmap for reading. The shadow/ directory wraps, besides the
standard C headers, the libio.h and unistd.h headers, for use mainly
by filebuf. These wrappings have not been completed, though there
is scaffolding in place.

The encapsulation of certain C header <cstdio> names presents an
interesting problem. It is possible to define an inline std::fprintf()
implemented in terms of the 'extern "C"' vfprintf(), but there is no
standard vfscanf() to use to implement std::fscanf().
It appears that vfscanf must be re-implemented in C++ for targets
where no vfscanf extension has been defined. This is interesting in
that it seems to be the only significant case in the C library where
this kind of rewriting is necessary. (Of course Glibc provides the
vfscanf() extension.) (The functions related to exit() must be
rewritten for other reasons.)


Annex D
-------
Headers: <strstream>

Annex D defines many non-library features, and many minor
modifications to various headers, and a complete header.
It is "mostly done", except that the libstdc++-2 <strstream>
header has not been adopted into the library, or checked to
verify that it matches the draft in those details that were
clarified by the committee. Certainly it must at least be
moved into the std namespace.

We still need to wrap all the deprecated features in #if guards
so that pedantic compile modes can detect their use.

Nonstandard Extensions
----------------------
Headers: <iostream.h> <strstream.h> <hash> <rbtree>
<pthread_alloc> <stdiobuf> (etc.)

User code has come to depend on a variety of nonstandard components
that we must not omit. Much of this code can be adopted from
libstdc++-v2 or from the SGI STL. This particularly includes
<iostream.h>, <strstream.h>, and various SGI extensions such
as <hash_map.h>. Many of these are already placed in the
subdirectories ext/ and backward/. (Note that it is better to
include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
to search the subdirectory itself via a "-I" directive.)
</literallayout>
</section>

</appendix>