<appendix xmlns="http://docbook.org/ns/docbook" version="5.0"
          xml:id="appendix.contrib" xreflabel="Contributing">
<?dbhtml filename="appendix_contributing.html"?>

<info><title>
  Contributing
  <indexterm>
    <primary>Appendix</primary>
    <secondary>Contributing</secondary>
  </indexterm>
</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>library</keyword>
  </keywordset>
</info>



<para>
  The GNU C++ Library is part of GCC and follows the same development model,
  so the general rules for
  <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html">contributing
  to GCC</link> apply. Active
  contributors are assigned maintainership responsibility, and given
  write access to the source repository. First-time contributors
  should follow this procedure:
</para>

<section xml:id="contrib.list" xreflabel="Contributor Checklist"><info><title>Contributor Checklist</title></info>


  <section xml:id="list.reading"><info><title>Reading</title></info>


    <itemizedlist>
      <listitem>
        <para>
          Get and read the relevant sections of the C++ language
          specification. Copies of the full ISO 14882 standard are
          available online via the ISO mirror site for committee
          members. Non-members, or those who have not paid for the
          privilege of sitting on the committee and sustained their
          two-meeting commitment for voting rights, may get a copy of
          the standard from their respective national standards
          organization. In the USA, this national standards
          organization is
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.ansi.org">ANSI</link>.
          (And if you've already registered with them you can <link
          xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:href="https://webstore.ansi.org/Standards/ISO/ISOIEC148822014">buy
          the standard online</link>.)
        </para>
      </listitem>

      <listitem>
        <para>
          The library working group's issue lists, including known
          defects, can be obtained here:
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</link>
        </para>
      </listitem>

      <listitem>
        <para>
          Peruse
          the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/">GNU
          Coding Standards</link>, and chuckle when you hit the part
          about <quote>Using Languages Other Than C</quote>.
        </para>
      </listitem>

      <listitem>
        <para>
          Be familiar with the extensions that take precedence over these
          general GNU rules. These style issues for libstdc++ can be
          found in <link linkend="contrib.coding_style">Coding Style</link>.
        </para>
      </listitem>

      <listitem>
        <para>
          And last but certainly not least, read the
          library-specific information found in
          <link linkend="appendix.porting">Porting and Maintenance</link>.
        </para>
      </listitem>
    </itemizedlist>

  </section>
  <section xml:id="list.copyright"><info><title>Assignment</title></info>

    <para>
      See the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html#legal">legal prerequisites</link> for all GCC contributions.
    </para>

    <para>
      Historically, the libstdc++ assignment form added the following
      question:
    </para>

    <para>
      <quote>
        Which Belgian comic book character is better, Tintin or Asterix, and
        why?
      </quote>
    </para>

    <para>
      While not strictly necessary, humoring the maintainers and answering
      this question would be appreciated.
    </para>

    <para>
      Please contact
      Paolo Carlini at <email>paolo.carlini@oracle.com</email>
      or
      Jonathan Wakely at <email>jwakely+assign@redhat.com</email>
      if you are confused about the assignment or have general licensing
      questions.
      When requesting an assignment form from
      <email>assign@gnu.org</email>, please CC the libstdc++
      maintainers above so that progress can be monitored.
    </para>
  </section>

  <section xml:id="list.getting"><info><title>Getting Sources</title></info>

    <para>
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://gcc.gnu.org/gitwrite.html">Getting write access
        (look for "Write after approval")</link>
    </para>
  </section>

  <section xml:id="list.patches"><info><title>Submitting Patches</title></info>


    <para>
      Every patch must have several pieces of information before it can be
      properly evaluated. Ideally (and to ensure the fastest possible
      response from the maintainers) it would have all of these pieces:
    </para>

    <itemizedlist>
      <listitem>
        <para>
          A description of the bug and how your patch fixes it.
          For new features, a description of the feature and your
          implementation.
        </para>
      </listitem>

      <listitem>
        <para>
          A ChangeLog entry as plain text; see the various
          ChangeLog files for format and content. If you are
          using Emacs as your editor, simply position the insertion
          point at the beginning of your change and type C-x 4 a to bring
          up the appropriate ChangeLog entry. See--magic! Similar
          functionality also exists for vi.
        </para>
      </listitem>

      <listitem>
        <para>
          A testsuite submission or sample program that will
          easily and simply show the existing error or test new
          functionality.
        </para>
      </listitem>

      <listitem>
        <para>
          The patch itself. If you are using the Git repository, use
          <command>git diff</command> or <command>git format-patch</command>
          to produce a patch;
          otherwise, use <command>diff -cp OLD NEW</command>. If your
          version of diff does not support these options, then get the
          latest version of GNU diff.
        </para>
      </listitem>

      <listitem>
        <para>
          When you have all these pieces, bundle them up in a
          mail message and send it to libstdc++@gcc.gnu.org. All
          patches and related discussion should be sent to the
          libstdc++ mailing list. In common with the rest of GCC,
          patches should also be sent to the gcc-patches mailing list.
        </para>
      </listitem>
    </itemizedlist>

  </section>

</section>

<section xml:id="contrib.organization" xreflabel="Source Organization"><info><title>Directory Layout and Source Conventions</title></info>
  <?dbhtml filename="source_organization.html"?>


  <para>
    The <filename class="directory">libstdc++-v3</filename> directory in the
    GCC sources contains the files needed to create the GNU C++ Library.
  </para>

<para>
It has the following subdirectories:
</para>

<variablelist>
  <varlistentry>
    <term><filename class="directory">doc</filename></term>
    <listitem>
      Files in HTML and text format that document usage, quirks of the
      implementation, and contributor checklists.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">include</filename></term>
    <listitem>
      All header files for the C++ library are within this directory,
      modulo specific runtime-related files that are in the libsupc++
      directory.

      <variablelist>
        <varlistentry>
          <term><filename class="directory">include/std</filename></term>
          <listitem>
            Files meant to be found by <code>#include &lt;name&gt;</code> directives
            in standard-conforming user programs.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c</filename></term>
          <listitem>
            Headers intended to directly include standard C headers.
            [NB: this can be enabled via <option>--enable-cheaders=c</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c_global</filename></term>
          <listitem>
            Headers intended to include standard C headers in
            the global namespace, and put select names into the <code>std::</code>
            namespace. [NB: this is the default, and is the same as
            <option>--enable-cheaders=c_global</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c_std</filename></term>
          <listitem>
            Headers intended to include standard C headers
            already in namespace std, and put select names into the <code>std::</code>
            namespace. [NB: this is the same as
            <option>--enable-cheaders=c_std</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/bits</filename></term>
          <listitem>
            Files included by standard headers and by other files in
            the bits directory.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/backward</filename></term>
          <listitem>
            Headers provided for backward compatibility, such as
            <filename class="headerfile">&lt;backward/hash_map&gt;</filename>.
            They are not used in this library.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/ext</filename></term>
          <listitem>
            Headers that define extensions to the standard library. No
            standard header refers to any of them, in theory (there are some
            exceptions).
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>
            <filename class="directory">include/debug</filename> and
            <filename class="directory">include/parallel</filename>
          </term>
          <listitem>
            Headers that implement the Debug Mode and Parallel Mode extensions.
          </listitem>
        </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">scripts</filename></term>
    <listitem>
      Scripts that are used during the configure, build, make, or test
      process.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">src</filename></term>
    <listitem>
      Files that are used in constructing the library, but are not
      installed.

      <variablelist>
        <varlistentry>
          <term><filename class="directory">src/c++98</filename></term>
          <listitem>
            Source files compiled using <option>-std=gnu++98</option>.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/c++11</filename></term>
          <listitem>
            Source files compiled using <option>-std=gnu++11</option>.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/filesystem</filename></term>
          <listitem>
            Source files for the Filesystem TS.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/shared</filename></term>
          <listitem>
            Source code included by other files under both
            <filename class="directory">src/c++98</filename> and
            <filename class="directory">src/c++11</filename>.
          </listitem>
        </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">testsuite/[backward, demangle, ext, performance, thread, 17_* to 30_*]</filename></term>
    <listitem>
      Test programs are here, and may be used to begin to exercise the
      library. Support for "make check" and "make check-install" is
      complete, and runs through all the subdirectories here when this
      command is issued from the build directory.
      Please note that
      "make check" requires DejaGnu 1.4 or later to be installed,
      or for extra <link linkend="test.run.permutations">permutations</link>
      DejaGnu 1.5.3 or later.
    </listitem>
  </varlistentry>
</variablelist>

<para>
Other subdirectories contain variant versions of certain files
that are meant to be copied or linked by the configure script.
Currently these are:
<literallayout><filename class="directory">config/abi</filename>
<filename class="directory">config/allocator</filename>
<filename class="directory">config/cpu</filename>
<filename class="directory">config/io</filename>
<filename class="directory">config/locale</filename>
<filename class="directory">config/os</filename>
</literallayout>
</para>

<para>
In addition, a subdirectory holds the convenience library libsupc++.
</para>

<variablelist>
<varlistentry>
  <term><filename class="directory">libsupc++</filename></term>
  <listitem>
    Contains the runtime library for C++, including exception
    handling and memory allocation and deallocation, RTTI, terminate
    handlers, etc.
  </listitem>
</varlistentry>
</variablelist>

<para>
Note that glibc also has a <filename class="directory">bits/</filename>
subdirectory. We need to be careful not to collide with names in its
<filename class="directory">bits/</filename> directory. For example,
<filename class="headerfile">&lt;bits/std_mutex.h&gt;</filename> has to be
renamed from <filename class="headerfile">&lt;bits/mutex.h&gt;</filename>.
Another solution would be to rename <filename class="directory">bits</filename>
to (e.g.) <filename class="directory">cppbits</filename>.
</para>

<para>
In files throughout the system, lines marked with an "XXX" indicate
a bug or incompletely-implemented feature. Lines marked "XXX MT"
indicate a place that may require attention for multi-thread safety.
</para>

</section>

<section xml:id="contrib.coding_style" xreflabel="Coding Style"><info><title>Coding Style</title></info>
  <?dbhtml filename="source_code_style.html"?>

  <para>
  </para>

  <section xml:id="coding_style.bad_identifiers"><info><title>Bad Identifiers</title></info> <!-- BADNAMES -->

    <para>
      Identifiers that conflict and should be avoided.
    </para>

    <literallayout class="normal">
      This is the list of names <quote>reserved to the
      implementation</quote> that have been claimed by certain
      compilers and system headers of interest, and should not be used
      in the library. It will grow, of course. We generally are
      interested in names that are not all-caps, except for those like
      "_T"

      For Solaris:
      _B
      _C
      _L
      _N
      _P
      _S
      _U
      _X
      _E1
      ..
      _E24

      Irix adds:
      _A
      _G

      MS adds:
      _T
      __deref

      BSD adds:
      __used
      __unused
      __inline
      _Complex
      __istype
      __maskrune
      __tolower
      __toupper
      __wchar_t
      __wint_t
      _res
      _res_ext
      __tg_*

      VxWorks adds:
      _C2

      For GCC:

      [Note that this list is out of date. It applies to the old
      name-mangling; in G++ 3.0 and higher a different name-mangling is
      used. In addition, many of the bugs relating to G++ interpreting
      these names as operators have been fixed.]

      The full set of __* identifiers (combined from gcc/cp/lex.c and
      gcc/cplus-dem.c) that are either old or new, but are definitely
      recognized by the demangler, is:

      __aa
      __aad
      __ad
      __addr
      __adv
      __aer
      __als
      __alshift
      __amd
      __ami
      __aml
      __amu
      __aor
      __apl
      __array
      __ars
      __arshift
      __as
      __bit_and
      __bit_ior
      __bit_not
      __bit_xor
      __call
      __cl
      __cm
      __cn
      __co
      __component
      __compound
      __cond
      __convert
      __delete
      __dl
      __dv
      __eq
      __er
      __ge
      __gt
      __indirect
      __le
      __ls
      __lt
      __max
      __md
      __method_call
      __mi
      __min
      __minus
      __ml
      __mm
      __mn
      __mult
      __mx
      __ne
      __negate
      __new
      __nop
      __nt
      __nw
      __oo
      __op
      __or
      __pl
      __plus
      __postdecrement
      __postincrement
      __pp
      __pt
      __rf
      __rm
      __rs
      __sz
      __trunc_div
      __trunc_mod
      __truth_andif
      __truth_not
      __truth_orif
      __vc
      __vd
      __vn

      SGI badnames:
      __builtin_alloca
      __builtin_fsqrt
      __builtin_sqrt
      __builtin_fabs
      __builtin_dabs
      __builtin_cast_f2i
      __builtin_cast_i2f
      __builtin_cast_d2ll
      __builtin_cast_ll2d
      __builtin_copy_dhi2i
      __builtin_copy_i2dhi
      __builtin_copy_dlo2i
      __builtin_copy_i2dlo
      __add_and_fetch
      __sub_and_fetch
      __or_and_fetch
      __xor_and_fetch
      __and_and_fetch
      __nand_and_fetch
      __mpy_and_fetch
      __min_and_fetch
      __max_and_fetch
      __fetch_and_add
      __fetch_and_sub
      __fetch_and_or
      __fetch_and_xor
      __fetch_and_and
      __fetch_and_nand
      __fetch_and_mpy
      __fetch_and_min
      __fetch_and_max
      __lock_test_and_set
      __lock_release
      __lock_acquire
      __compare_and_swap
      __synchronize
      __high_multiply
      __unix
      __sgi
      __linux__
      __i386__
      __i486__
      __cplusplus
      __embedded_cplusplus
      // long double conversion members mangled as __opr
      // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
      __opr
    </literallayout>
  </section>

  <section xml:id="coding_style.example"><info><title>By Example</title></info>

    <literallayout class="normal">
      This library is written to appropriate C++ coding standards. As such,
      it is intended to take precedence over the recommendations of the GNU
      Coding Standard, which can be referenced in full here:

      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/standards.html#Formatting">http://www.gnu.org/prep/standards/standards.html#Formatting</link>

      The rest of this is also interesting reading, but skip the "Design
      Advice" part.

      The GCC coding conventions are here, and are also useful:
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/codingconventions.html">http://gcc.gnu.org/codingconventions.html</link>

      In addition, because it doesn't seem to be stated explicitly anywhere
      else, there is an 80 column source limit.

      <filename>ChangeLog</filename> entries for member functions should use the
      classname::member function name syntax as follows:

<code>
1999-04-15  Dennis Ritchie  &lt;dr@att.com&gt;

      * src/basic_file.cc (__basic_file::open): Fix thinko in
      _G_HAVE_IO_FILE_OPEN bits.
</code>

      Notable areas of divergence from what may be previous local practice
      (particularly for GNU C) include:

      01. Pointers and references
      <code>
        char* p = "flop";
        char&amp; c = *p;
          -NOT-
        char *p = "flop";  // wrong
        char &amp;c = *p;      // wrong
      </code>

      Reason: In C++, definitions are mixed with executable code. Here,
      <code>p</code> is being initialized, not <code>*p</code>. This is near-universal
      practice among C++ programmers; it is normal for C hackers
      to switch spontaneously as they gain experience.

      02. Operator names and parentheses
      <code>
        operator==(type)
          -NOT-
        operator == (type)  // wrong
      </code>

      Reason: The <code>==</code> is part of the function name. Separating
      it makes the declaration look like an expression.

      03. Function names and parentheses
      <code>
        void mangle()
          -NOT-
        void mangle ()  // wrong
      </code>

      Reason: no space before parentheses (except after a control-flow
      keyword) is near-universal practice for C++. It identifies the
      parentheses as the function-call operator or declarator, as
      opposed to an expression or other overloaded use of parentheses.

      04. Template function indentation
      <code>
        template&lt;typename T&gt;
          void
          template_function(args)
          { }
          -NOT-
        template&lt;class T&gt;
        void template_function(args) {};
      </code>

      Reason: In class definitions, without indentation, whitespace is
      needed both above and below the declaration to distinguish
      it visually from other members. (Also, re: "typename"
      rather than "class".) <code>T</code> often could be <code>int</code>, which is
      not a class. ("class", here, is an anachronism.)

      05. Template class indentation
      <code>
        template&lt;typename _CharT, typename _Traits&gt;
          class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template&lt;class _CharT, class _Traits&gt;
        class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template&lt;class _CharT, class _Traits&gt;
          class basic_ios : public ios_base
        {
          public:
            // Types:
        };
      </code>

      06. Enumerators
      <code>
        enum
        {
          space = _ISspace,
          print = _ISprint,
          cntrl = _IScntrl
        };
          -NOT-
        enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
      </code>

      07. Member initialization lists
      All one line, separate from class name.

      <code>
        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
          -NOT-
        gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
      </code>

      08. Try/Catch blocks
      <code>
        try
          {
            //
          }
        catch (...)
          {
            //
          }
          -NOT-
        try {
          //
        } catch(...) {
          //
        }
      </code>

      09. Member function declarations and definitions
      Keywords such as extern, static, export, explicit, inline, etc.
      go on the line above the function name. Thus

      <code>
        virtual int
        foo()
          -NOT-
        virtual int foo()
      </code>

      Reason: GNU coding conventions dictate that return types for functions
      are on a separate line from the function name and parameter list
      for definitions. For C++, where we have member functions that can
      be either inline definitions or declarations, keeping to this
      standard allows all member function names for a given class to be
      aligned to the same margin, increasing readability.


      10. Invocation of member functions with "this->"
      For non-uglified names, use <code>this->name</code> to call the function.

      <code>
        this->sync()
          -NOT-
        sync()
      </code>

      Reason: Koenig lookup.

      11. Namespaces
      <code>
        namespace std
        {
          blah blah blah;
        } // namespace std

          -NOT-

        namespace std {
          blah blah blah;
        } // namespace std
      </code>

      12. Spacing under protected and private in class declarations:
      space above, none below
      i.e.

      <code>
        public:
          int foo;

          -NOT-
        public:

          int foo;
      </code>

      13. Spacing WRT return statements.
      no extra spacing before returns, no parentheses
      i.e.

      <code>
        }
        return __ret;

          -NOT-
        }

        return __ret;

          -NOT-

        }
        return (__ret);
      </code>


      14. Location of global variables.
      All global variables of class type, whether in the "user visible"
      space (e.g., <code>cin</code>) or the implementation namespace, must be defined
      as a character array with the appropriate alignment and then later
      re-initialized to the correct value.

      This is due to startup issues on certain platforms, such as AIX.
      For more explanation and examples, see <filename>src/globals.cc</filename>. All such
      variables should be contained in that file, for simplicity.

      15. Exception abstractions
      Use the exception abstractions found in <filename class="headerfile">functexcept.h</filename>, which allow
      C++ programmers to use this library with <literal>-fno-exceptions</literal>. (Even if
      that is rarely advisable, it's a necessary evil for backwards
      compatibility.)

      16. Exception error messages
      All start with the name of the function where the exception is
      thrown, and then (optional) descriptive text is added. Example:

      <code>
        __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
      </code>

      Reason: The verbose terminate handler prints out <code>exception::what()</code>,
      as well as the typeinfo for the thrown exception. As this is the
      default terminate handler, by putting location info into the
      exception string, a very useful error message is printed out for
      uncaught exceptions. So useful, in fact, that non-programmers can
      give useful error messages, and programmers can intelligently
      speculate what went wrong without even using a debugger.

      17. The doxygen style guide to comments is a separate document,
      see index.

      The library currently has a mixture of GNU-C and modern C++ coding
      styles. The GNU C usages will be combed out gradually.

      Name patterns:

      For nonstandard names appearing in Standard headers, we are constrained
      to use names that begin with underscores. This is called "uglification".
      The convention is:

      Local and argument names:  <literal>__[a-z].*</literal>

      Examples:  <code>__count  __ix  __s1</code>

      Type names and template formal-argument names: <literal>_[A-Z][^_].*</literal>

      Examples:  <code>_Helper  _CharT  _N</code>

      Member data and function names: <literal>_M_.*</literal>

      Examples:  <code>_M_num_elements  _M_initialize()</code>

      Static data members, constants, and enumerations: <literal>_S_.*</literal>

      Examples: <code>_S_max_elements  _S_default_value</code>

      Don't use names in the same scope that differ only in the prefix,
      e.g. _S_top and _M_top. See <link linkend="coding_style.bad_identifiers">BADNAMES</link> for a list of forbidden names.
      (The most tempting of these seem to be "_T" and "__sz".)

      Names must never have "__" internally; it would confuse name
      unmanglers on some targets. Also, never use "__[0-9]", same reason.

      --------------------------

      [BY EXAMPLE]
      <code>

      #ifndef  _HEADER_
      #define  _HEADER_ 1

      namespace std
      {
        class gribble
        {
        public:
          gribble() throw();

          gribble(const gribble&amp;);

          explicit
          gribble(int __howmany);

          gribble&amp;
          operator=(const gribble&amp;);

          virtual
          ~gribble() throw();

          // Start with a capital letter, end with a period.
          inline void
          public_member(const char* __arg) const;

          // In-class function definitions should be restricted to one-liners.
          int
          one_line() { return 0; }

          int
          two_lines(const char* arg)
          { return strchr(arg, 'a'); }

          inline int
          three_lines();  // inline, but defined below.

          // Note indentation.
          template&lt;typename _Formal_argument&gt;
            void
            public_template() const throw();

          template&lt;typename _Iterator&gt;
            void
            other_template();

        private:
          class _Helper;

          int _M_private_data;
          int _M_more_stuff;
          _Helper* _M_helper;
          int _M_private_function();

          enum _Enum
            {
              _S_one,
              _S_two
            };

          static void
          _S_initialize_library();
        };

        // More-or-less-standard language features described by lack, not presence.
      # ifndef _G_NO_LONGLONG
        extern long long _G_global_with_a_good_long_name;  // avoid globals!
      # endif

        // Avoid in-class inline definitions, define separately;
        // likewise for member class definitions:
        inline int
        gribble::public_member() const
        { int __local = 0; return __local; }

        class gribble::_Helper
        {
          int _M_stuff;

          friend class gribble;
        };
      }

      // Names beginning with "__": only for arguments and
      //   local variables; never use "__" in a type name, or
      //   within any name; never use "__[0-9]".

      #endif /* _HEADER_ */


      namespace std
      {
        template&lt;typename T&gt;  // notice: "typename", not "class", no space
          long_return_value_type&lt;with_many, args&gt;
          function_name(char* pointer,               // "char *pointer" is wrong.
                        char* argument,
                        const Reference&amp; ref)
          {
            // int a_local;  /* wrong; see below. */
            if (test)
            {
              nested code
            }

            int a_local = 0;  // declare variable at first use.

            //  char a, b, *p;   /* wrong */
            char a = 'a';
            char b = a + 1;
            char* c = "abc";  // each variable goes on its own line, always.

            // except maybe here...
            for (unsigned i = 0, mask = 1; mask; ++i, mask &lt;&lt;= 1) {
              // ...
            }
          }

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

        int
        gribble::three_lines()
        {
          // doesn't fit in one line.
        }
      } // namespace std
      </code>
    </literallayout>
  </section>
</section>

<section xml:id="contrib.design_notes" xreflabel="Design Notes"><info><title>Design Notes</title></info>
  <?dbhtml filename="source_design_notes.html"?>

  <para>
  </para>

  <literallayout class="normal">

    The Library
    -----------

    This paper covers two major areas:

    - Features and policies not mentioned in the standard that
    the quality of the library implementation depends on, including
    extensions and "implementation-defined" features;

    - Plans for required but unimplemented library features and
    optimizations to them.

    Overhead
    --------

    The standard defines a large library, much larger than the standard
    C library. A naive implementation would suffer substantial overhead
    in compile time, executable size, and speed, rendering it unusable
    in many (particularly embedded) applications. The alternative demands
    care in construction, and some compiler support, but there is no
    need for library subsets.

    What are the sources of this overhead?  There are four main causes:

    - The library is specified almost entirely as templates, which
    with current compilers must be included in-line, resulting in
    very slow builds as tens or hundreds of thousands of lines
    of function definitions are read for each user source file.
    Indeed, the entire SGI STL, as well as the dos Reis valarray,
    are provided purely as header files, largely for simplicity in
    porting. Iostream/locale is (or will be) as large again.

    - The library is very flexible, specifying a multitude of hooks
    where users can insert their own code in place of defaults.
    When these hooks are not used, any time and code expended to
    support that flexibility is wasted.
1087 1088 - Templates are often described as causing to "code bloat". In 1089 practice, this refers (when it refers to anything real) to several 1090 independent processes. First, when a class template is manually 1091 instantiated in its entirely, current compilers place the definitions 1092 for all members in a single object file, so that a program linking 1093 to one member gets definitions of all. Second, template functions 1094 which do not actually depend on the template argument are, under 1095 current compilers, generated anew for each instantiation, rather 1096 than being shared with other instantiations. Third, some of the 1097 flexibility mentioned above comes from virtual functions (both in 1098 regular classes and template classes) which current linkers add 1099 to the executable file even when they manifestly cannot be called. 1100 1101 - The library is specified to use a language feature, exceptions, 1102 which in the current gcc compiler ABI imposes a run time and 1103 code space cost to handle the possibility of exceptions even when 1104 they are not used. Under the new ABI (accessed with -fnew-abi), 1105 there is a space overhead and a small reduction in code efficiency 1106 resulting from lost optimization opportunities associated with 1107 non-local branches associated with exceptions. 1108 1109 What can be done to eliminate this overhead? A variety of coding 1110 techniques, and compiler, linker and library improvements and 1111 extensions may be used, as covered below. Most are not difficult, 1112 and some are already implemented in varying degrees. 1113 1114 Overhead: Compilation Time 1115 -------------------------- 1116 1117 Providing "ready-instantiated" template code in object code archives 1118 allows us to avoid generating and optimizing template instantiations 1119 in each compilation unit which uses them. 
    However, the number of such
    instantiations that are useful to provide is limited, and anyway this
    is not enough, by itself, to minimize compilation time. In particular,
    it does not reduce time spent parsing conforming headers.

    Quicker header parsing will depend on library extensions and compiler
    improvements. One approach is some variation on the techniques
    previously marketed as "pre-compiled headers", now standardized as
    support for the "export" keyword. "Exported" template definitions
    can be placed (once) in a "repository" -- really just a library, but
    of template definitions rather than object code -- to be drawn upon
    at link time when an instantiation is needed, rather than placed in
    header files to be parsed along with every compilation unit.

    Until "export" is implemented we can put some of the lengthy template
    definitions in #if guards or alternative headers so that users can skip
    over the full definitions when they need only the ready-instantiated
    specializations.

    To be precise, this means that certain headers which define
    templates which users normally use only for certain arguments
    can be instrumented to avoid exposing the template definitions
    to the compiler unless a macro is defined. For example, in
    <string>, we might have:

      template <class _CharT, ... > class basic_string {
        ... // member declarations
      };
      ... // operator declarations

      #ifdef _STRICT_ISO_
      # if _G_NO_TEMPLATE_EXPORT
      #   include <bits/std_locale.h>  // headers needed by definitions
      #   ...
      #   include <bits/string.tcc>  // member and global template definitions.
      # endif
      #endif

    Users who compile without specifying a strict-ISO-conforming flag
    would not see many of the template definitions they now see, and rely
    instead on ready-instantiated specializations in the library.
    This technique would be useful for the following substantial
    components: string, locale/iostreams, valarray. It would *not* be
    useful or usable with the following: containers, algorithms, iterators,
    allocator. Since these constitute a large (though decreasing)
    fraction of the library, the benefit the technique offers is
    limited.

    The language specifies the semantics of the "export" keyword, but
    the gcc compiler does not yet support it. When it does, problems
    with large template inclusions can largely disappear, given some
    minor library reorganization, along with the need for the apparatus
    described above.

    Overhead: Flexibility Cost
    --------------------------

    The library offers many places where users can specify operations
    to be performed by the library in place of defaults. Sometimes
    this seems to require that the library use a more-roundabout, and
    possibly slower, way to accomplish the default requirements than
    would be used otherwise.

    The primary protection against this overhead is thorough compiler
    optimization, to crush out layers of inline function interfaces.
    Kuck & Associates has demonstrated the practicality of this kind
    of optimization.

    The second line of defense against this overhead is explicit
    specialization. By defining helper function templates, and writing
    specialized code for the default case, overhead can be eliminated
    for that case without sacrificing flexibility. This takes full
    advantage of any ability of the optimizer to crush out degenerate
    code.

    The library specifies many virtual functions which current linkers
    load even when they cannot be called. Some minor improvements to the
    compiler and to ld would eliminate any such overhead by simply
    omitting virtual functions that the complete program does not call.
    A prototype of this work has already been done. For targets where
    GNU ld is not used, a "pre-linker" could do the same job.

    The main areas in the standard interface where user flexibility
    can result in overhead are:

      - Allocators: Containers are specified to use user-definable
        allocator types and objects, making tuning for the container
        characteristics tricky.

      - Locales: the standard specifies locale objects used to implement
        iostream operations, involving many virtual functions which use
        streambuf iterators.

      - Algorithms and containers: these may be instantiated on any type,
        frequently duplicating code for identical operations.

      - Iostreams and strings: users are permitted to use these on their
        own types, and specify the operations the stream must use on these
        types.

    Note that these sources of overhead are _avoidable_. The techniques
    to avoid them are covered below.

    Code Bloat
    ----------

    In the SGI STL, and in some other headers, many of the templates
    are defined "inline" -- either explicitly or by their placement
    in class definitions -- which should not be inline. This is a
    source of code bloat. Matt had remarked that he was relying on
    the compiler to recognize what was too big to benefit from inlining,
    and generate it out-of-line automatically. However, this also can
    result in code bloat except where the linker can eliminate the extra
    copies.

    Fixing these cases will require an audit of all inline functions
    defined in the library to determine which merit inlining, and moving
    the rest out of line. This is an issue mainly in clauses 23, 25, and
    27. Of course it can be done incrementally, and we should generally
    accept patches that move large functions out of line and into ".tcc"
    files, which can later be pulled into a repository.
    Compiler/linker improvements to recognize very large inline
    functions and move them out-of-line, but shared among compilation
    units, could make this work unnecessary.

    Pre-instantiating template specializations currently produces large
    amounts of dead code which bloats statically linked programs. The
    current state of the static library, libstdc++.a, is intolerable on
    this account, and will fuel further confused speculation about a need
    for a library "subset". A compiler improvement that treats each
    instantiated function as a separate object file, for linking purposes,
    would be one solution to this problem. An alternative would be to
    split up the manual instantiation files into dozens upon dozens of
    little files, each compiled separately, but an abortive attempt at
    this was done for <string> and, though it is far from complete, it
    is already a nuisance. A better interim solution (just until we have
    "export") is badly needed.

    When building a shared library, the current compiler/linker cannot
    automatically generate the instantiations needed. This creates a
    miserable situation; it means any time something is changed in the
    library, before a shared library can be built someone must manually
    copy the declarations of all templates that are needed by other parts
    of the library to an "instantiation" file, and add it to the build
    system to be compiled and linked to the library. This process is
    readily automated, and should be automated as soon as possible.
    Users building their own shared libraries experience identical
    frustrations.

    Sharing common aspects of template definitions among instantiations
    can radically reduce code bloat.
    The compiler could help a great
    deal here by recognizing when a function depends on nothing about
    a template parameter, or only on its size, and giving the resulting
    function a link-name "equate" that allows it to be shared with other
    instantiations. Implementation code could take advantage of the
    capability by factoring out code that does not depend on the template
    argument into separate functions to be merged by the compiler.

    Until such a compiler optimization is implemented, much can be done
    manually (if tediously) in this direction. One such optimization is
    to derive class templates from non-template classes, and move as much
    implementation as possible into the base class. Another is to
    partial-specialize certain common instantiations, such as vector<T*>,
    to share code for instantiations on all types T. While these
    techniques work, they are far from the complete solution that a
    compiler improvement would afford.

    Overhead: Expensive Language Features
    -------------------------------------

    The main "expensive" language feature used in the standard library
    is exception support, which requires compiling in cleanup code with
    static table data to locate it, and linking in library code to use
    the table. For small embedded programs the amount of such library
    code and table data is assumed by some to be excessive. Under the
    "new" ABI this perception is generally exaggerated, although in some
    cases it may actually be excessive.

    To implement a library which does not use exceptions directly is
    not difficult given minor compiler support (to "turn off" exceptions
    and ignore exception constructs), and results in no great library
    maintenance difficulties. To be precise, given "-fno-exceptions",
    the compiler should treat "try" blocks as ordinary blocks, and
    "catch" blocks as dead code to ignore or eliminate.
    Compiler support is not strictly necessary, except in the case of
    "function try blocks"; otherwise the following macros almost suffice:

      #define throw(X)
      #define try      if (true)
      #define catch(X) else if (false)

    However, there may be a need to use function try blocks in the
    library implementation, and use of macros in this way can make
    correct diagnostics impossible. Furthermore, use of this scheme
    would require the library to call a function to re-throw exceptions
    from a try block. Implementing the above semantics in the compiler
    is preferable.

    Given the support above (however implemented) it only remains to
    replace code that "throws" with a call to a well-documented "handler"
    function in a separate compilation unit which may be replaced by
    the user. The main source of exceptions that would be difficult
    for users to avoid is memory allocation failures, but users can
    define their own memory allocation primitives that never throw.
    Otherwise, the complete list of such handlers, and which library
    functions may call them, would be needed for users to be able to
    implement the necessary substitutes. (Fortunately, they have the
    source code.)

    Opportunities
    -------------

    The template capabilities of C++ offer enormous opportunities for
    optimizing common library operations, well beyond what would be
    considered "eliminating overhead". In particular, many operations
    done in Glibc with macros that depend on proprietary language
    extensions can be implemented in pristine Standard C++. For example,
    the chapter 25 algorithms, and even C library functions such as strchr,
    can be specialized for the case of static arrays of known (small) size.

    Detailed optimization opportunities are identified below where
    the component where they would appear is discussed.
    Of course new
    opportunities will be identified during implementation.

    Unimplemented Required Library Features
    ---------------------------------------

    The standard specifies hundreds of components, grouped broadly by
    chapter. These are listed in excruciating detail in the CHECKLIST
    file.

      17 general
      18 support
      19 diagnostics
      20 utilities
      21 string
      22 locale
      23 containers
      24 iterators
      25 algorithms
      26 numerics
      27 iostreams
      Annex D  backward compatibility

    Anyone participating in implementation of the library should obtain
    a copy of the standard, ISO 14882. People in the U.S. can obtain an
    electronic copy for US$18 from ANSI's web site. Those from other
    countries should visit http://www.iso.org/ to find out the location
    of their country's representation in ISO, in order to know who can
    sell them a copy.

    The emphasis in the following sections is on unimplemented features
    and optimization opportunities.

    Chapter 17  General
    -------------------

    Chapter 17 concerns overall library requirements.

    The standard doesn't mention threads. A multi-thread (MT) extension
    primarily affects operators new and delete (18), allocator (20),
    string (21), locale (22), and iostreams (27). The common underlying
    support needed for this is discussed under chapter 20.

    The standard requirements on names from the C headers create a
    lot of work, mostly done. Names in the C headers must be visible
    in the std:: and sometimes the global namespace; the names in the
    two scopes must refer to the same object. More stringent is that
    Koenig lookup implies that any types specified as defined in std::
    really are defined in std::. Names optionally implemented as
    macros in C cannot be macros in C++. (An overview may be read at
    <http://www.cantrip.org/cheaders.html>).
    The scripts "inclosure"
    and "mkcshadow", and the directories shadow/ and cshadow/, are the
    beginning of an effort to conform in this area.

    A correct conforming definition of C header names based on underlying
    C library headers, and practical linking of conforming namespaced
    customer code with third-party C libraries depends ultimately on
    an ABI change, allowing namespaced C type names to be mangled into
    type names as if they were global, somewhat as C function names in a
    namespace, or C++ global variable names, are left unmangled. Perhaps
    another "extern" mode, such as 'extern "C-global"' would be an
    appropriate place for such type definitions. Such a type would
    affect mangling as follows:

      namespace A {
        struct X {};
        extern "C-global" {  // or maybe just 'extern "C"'
          struct Y {};
        };
      }
      void f(A::X*);  // mangles to f__FPQ21A1X
      void f(A::Y*);  // mangles to f__FP1Y

    (It may be that this is really the appropriate semantics for regular
    'extern "C"', and 'extern "C-global"', as an extension, would not be
    necessary.) This would allow functions declared in non-standard C headers
    (and thus fixable by neither us nor users) to link properly with functions
    declared using C types defined in properly-namespaced headers. The
    problem this solves is that C headers (which C++ programmers do persist
    in using) frequently forward-declare C struct tags without including
    the header where the type is defined, as in

      struct tm;
      void munge(tm*);

    Without some compiler accommodation, munge cannot be called by correct
    C++ code using a pointer to a correctly-scoped tm* value.

    The current C headers use the preprocessor extension "#include_next",
    which the compiler complains about when run "-pedantic".
    (Incidentally, it appears that "-fpedantic" is currently ignored,
    probably a bug.)
    The solution in the C compiler is to use
    "-isystem" rather than "-I", but unfortunately in g++ this seems
    also to wrap the whole header in an 'extern "C"' block, so it's
    unusable for C++ headers. The correct solution appears to be to
    allow the various special include-directory options, if not given
    an argument, to affect subsequent include-directory options additively,
    so that if one said

      -pedantic -iprefix $(prefix) \
        -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
        -iwithprefix -I g++-v3/ext

    the compiler would search $(prefix)/g++-v3 and not report
    pedantic warnings for files found there, but treat files in
    $(prefix)/g++-v3/ext pedantically. (The undocumented semantics
    of "-isystem" in g++ stink. Can they be rescinded? If not it
    must be replaced with something more rationally behaved.)

    All the C headers need the treatment above; in the standard these
    headers are mentioned in various clauses. Below, I have only
    mentioned those that present interesting implementation issues.

    The components identified as "mostly complete", below, have not been
    audited for conformance. In many cases where the library passes
    conformance tests we have non-conforming extensions that must be
    wrapped in #if guards for "pedantic" use, and in some cases renamed
    in a conforming way for continued use in the implementation regardless
    of conformance flags.

    The STL portion of the library still depends on a header
    stl/bits/stl_config.h full of #ifdef clauses. This apparatus
    should be replaced with autoconf/automake machinery.

    The SGI STL defines a type_traits<> template, specialized for
    many types in their code including the built-in numeric and
    pointer types and some library types, to direct optimizations of
    standard functions.
    The SGI compiler has been extended to generate
    specializations of this template automatically for user types,
    so that use of STL templates on user types can take advantage of
    these optimizations. Specializations for other, non-STL, types
    would make more optimizations possible, but extending the gcc
    compiler in the same way would be much better. Probably the next
    round of standardization will ratify this, but probably with
    changes, so it probably should be renamed to place it in the
    implementation namespace.

    The SGI STL also defines a large number of extensions visible in
    standard headers. (Other extensions that appear in separate headers
    have been sequestered in subdirectories ext/ and backward/.) All
    these extensions should be moved to other headers where possible,
    and in any case wrapped in a namespace (not std!), and (where kept
    in a standard header) girded about with macro guards. Some cannot be
    moved out of standard headers because they are used to implement
    standard features. The canonical method for accommodating these
    is to use a protected name, aliased in macro guards to a user-space
    name. Unfortunately C++ offers no satisfactory template typedef
    mechanism, so very ad-hoc and unsatisfactory aliasing must be used
    instead.

    Implementation of a template typedef mechanism should have the highest
    priority among possible extensions, on the same level as implementation
    of the template "export" feature.

    Chapter 18  Language support
    ----------------------------

    Headers: <limits> <new> <typeinfo> <exception>
    C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
               <ctime> <csignal> <cstdlib> (also 21, 25, 26)

    This defines the built-in exceptions, rtti, numeric_limits<>,
    operator new and delete. Much of this is provided by the
    compiler in its static runtime library.

    Work to do includes defining numeric_limits<> specializations in
    separate files for all target architectures. Values for integer types
    except for bool and wchar_t are readily obtained from the C header
    <limits.h>, but values for the remaining numeric types (bool, wchar_t,
    float, double, long double) must be entered manually. This is
    largely dog work except for those members whose values are not
    easily deduced from available documentation. Also, this involves
    some work in target configuration to identify the correct choice of
    file to build against and to install.

    The definitions of the various operators new and delete must be
    made thread-safe, which depends on a portable exclusion mechanism,
    discussed under chapter 20. Of course there is always plenty of
    room for improvements to the speed of operators new and delete.

    <cstdarg>, in Glibc, defines some macros that gcc does not allow to
    be wrapped into an inline function. Probably this header will demand
    attention whenever a new target is chosen. The functions atexit(),
    exit(), and abort() in cstdlib have different semantics in C++, so
    must be re-implemented for C++.

    Chapter 19  Diagnostics
    -----------------------

    Headers: <stdexcept>
    C headers: <cassert> <cerrno>

    This defines the standard exception objects, which are "mostly complete".
    Cygnus has a version, and now SGI provides a slightly different one.
    It makes little difference which we use.

    The C global name "errno", which C allows to be a variable or a macro,
    is required in C++ to be a macro. For MT it must typically result in
    a function call.

    Chapter 20  Utilities
    ---------------------
    Headers: <utility> <functional> <memory>
    C header: <ctime> (also in 18)

    SGI STL provides "mostly complete" versions of all the components
    defined in this chapter.
    However, the auto_ptr<> implementation
    is known to be wrong. Furthermore, the standard definition of it
    is known to be unimplementable as written. A minor change to the
    standard would fix it, and auto_ptr<> should be adjusted to match.

    Multi-threading affects the allocator implementation, and there must
    be configuration/installation choices for different users' MT
    requirements. Anyway, users will want to tune allocator options
    to support different target conditions, MT or no.

    The primitives used for MT implementation should be exposed, as an
    extension, for users' own work. We need cross-CPU "mutex" support,
    multi-processor shared-memory atomic integer operations, and
    single-processor uninterruptible integer operations, and all three
    configurable to be stubbed out for non-MT use, or to use an
    appropriately-loaded dynamic library for the actual runtime
    environment, or statically compiled in for cases where the target
    architecture is known.

    Chapter 21  String
    ------------------
    Headers: <string>
    C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
               <cstdlib> (also in 18, 25, 26)

    We have "mostly-complete" char_traits<> implementations. Many of the
    char_traits<char> operations might be optimized further using existing
    proprietary language extensions.

    We have a "mostly-complete" basic_string<> implementation. The work
    to manually instantiate char and wchar_t specializations in object
    files to improve link-time behavior is extremely unsatisfactory,
    literally tripling library-build time with no commensurate improvement
    in static program link sizes. It must be redone. (Similar work is
    needed for some components in clauses 22 and 27.)

    Other work needed for strings is MT-safety, as discussed under the
    chapter 20 heading.
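
    The manual instantiation of char and wchar_t specializations
    mentioned above for basic_string<> can be illustrated with a small
    sketch. It does not use the library's own sources: the class
    template text_buffer and all of its members are hypothetical
    stand-ins. The two explicit instantiation definitions at the end
    make the compiler emit every member for char and wchar_t in the one
    translation unit that contains them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class template standing in for a large library template.
template <class CharT>
class text_buffer {
public:
    text_buffer() : size_(0) { data_[0] = CharT(); }

    // Appends c; returns false when the fixed-size buffer is full.
    bool push_back(CharT c);

    std::size_t size() const { return size_; }
    const CharT* c_str() const { return data_; }

private:
    enum { capacity = 15 };
    CharT data_[capacity + 1];  // one extra slot for the terminator
    std::size_t size_;
};

template <class CharT>
bool text_buffer<CharT>::push_back(CharT c)
{
    assert(size_ <= static_cast<std::size_t>(capacity));
    if (size_ == static_cast<std::size_t>(capacity))
        return false;
    data_[size_++] = c;
    data_[size_] = CharT();  // keep the contents terminated
    return true;
}

// Explicit instantiation definitions: every member of both
// specializations is generated here, once, whether or not it is used.
template class text_buffer<char>;
template class text_buffer<wchar_t>;
```

    Because an explicit instantiation emits all members together, a
    program that links against the resulting object file may pull in
    members it never calls, which is the link-size concern raised above.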

    The standard C type mbstate_t from <cwchar> and used in char_traits<>
    must be different in C++ than in C, because in C++ the default constructor
    value mbstate_t() must be the "base" or "ground" sequence state.
    (According to the likely resolution of a recently raised Core issue,
    this may become unnecessary. However, there are other reasons to
    use a state type not as limited as whatever the C library provides.)
    If we might want to provide conversions from (e.g.)
    internally-represented EUC-wide to externally-represented Unicode, or
    vice-versa, the mbstate_t we choose will need to be more accommodating
    than what might be provided by an underlying C library.

    There remain some basic_string template-member functions which do
    not overload properly with their non-template brethren. The infamous
    hack akin to what was done in vector<> is needed, to conform to
    23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
    or incomplete, are so marked for this reason.

    Replacing the string iterators, which currently are simple character
    pointers, with class objects would greatly increase the safety of the
    client interface, and also permit a "debug" mode in which range,
    ownership, and validity are rigorously checked. The current use of
    raw pointers as string iterators is evil. vector<> iterators need the
    same treatment. Note that the current implementation freely mixes
    pointers and iterators, and that must be fixed before safer iterators
    can be introduced.

    Some of the functions in <cstring> are different from the C version,
    generally overloaded on const and non-const argument pointers. For
    example, in <cstring> strchr is overloaded. The functions isupper
    etc. in <cctype> typically implemented as macros in C are functions
    in C++, because they are overloaded with others of the same name
    defined in <locale>.
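
    The const and non-const overloading of strchr described above can
    be seen in a short sketch. The helper names contains and
    replace_first are hypothetical and exist only for illustration; the
    only library call is std::strchr. With a const char* argument the
    result is a const char*, and with a char* argument the result is a
    char* that may be written through.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when ch occurs in the read-only string s.
inline bool contains(const char* s, char ch)
{
    assert(s != 0);
    // A const char* argument selects the overload returning const char*.
    const char* hit = std::strchr(s, ch);
    return hit != 0;
}

// Hypothetical helper: replaces the first occurrence of from with to.
inline bool replace_first(char* s, char from, char to)
{
    assert(s != 0);
    // A char* argument selects the overload returning char*, so the
    // result can be modified without a cast.
    char* hit = std::strchr(s, from);
    if (hit == 0)
        return false;
    *hit = to;
    return true;
}
```

    In C the single strchr takes a const char* and returns a plain
    char*, so constness can be lost silently; the C++ overloads preserve
    it.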

    Many of the functions required in <cwctype> and <cwchar> cannot be
    implemented using underlying C facilities on intended targets because
    such facilities only partly exist.

    Chapter 22  Locale
    ------------------
    Headers: <locale>
    C headers: <clocale>

    We have a "mostly complete" class locale, with the exception of
    code for constructing, and handling the names of, named locales.
    The ways that locales are named (particularly when categories
    (e.g. LC_TIME, LC_COLLATE) are different) vary among all target
    environments. This code must be written in various versions and
    chosen by configuration parameters.

    Members of many of the facets defined in <locale> are stubs. Generally,
    there are two sets of facets: the base class facets (which are supposed
    to implement the "C" locale) and the "byname" facets, which are supposed
    to read files to determine their behavior. The base ctype<>, collate<>,
    and numpunct<> facets are "mostly complete", except that the table of
    bitmask values used for "is" operations, and corresponding mask values,
    are still defined in libio and just included/linked. (We will need to
    implement these tables independently, soon, but should take advantage
    of libio where possible.) The num_put<>::put members for integer types
    are "mostly complete".

    A complete list of what has and has not been implemented may be
    found in CHECKLIST. However, note that the current definition of
    codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
    out the raw bytes representing the wide characters, rather than
    trying to convert each to a corresponding single "char" value.

    Some of the facets are more important than others.
    Specifically,
    the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
    are used by other library facilities defined in <string>, <istream>,
    and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
    in <fstream>, so a conforming iostream implementation depends on
    these.

    The "long long" type eventually must be supported, but code mentioning
    it should be wrapped in #if guards to allow pedantic-mode compiling.

    Performance of num_put<> and num_get<> depends critically on
    caching computed values in ios_base objects, and on extensions
    to the interface with streambufs.

    Specifically: retrieving a copy of the locale object, extracting
    the needed facets, and gathering data from them, for each call to
    (e.g.) operator<< would be prohibitively slow. To cache format
    data for use by num_put<> and num_get<> we have a _Format_cache<>
    object stored in the ios_base::pword() array. This is constructed
    and initialized lazily, and is organized purely for utility. It
    is discarded when a new locale with different facets is imbued.

    Using only the public interfaces of the iterator arguments to the
    facet functions would limit performance by forbidding "vector-style"
    character operations. The streambuf iterator optimizations are
    described under chapter 24, but facets can also bypass the streambuf
    iterators via explicit specializations and operate directly on the
    streambufs, and use extended interfaces to get direct access to the
    streambuf internal buffer arrays. These extensions are mentioned
    under chapter 27. These optimizations are particularly important
    for input parsing.

    Unused virtual members of locale facets can be omitted, as mentioned
    above, by a smart linker.
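
    The lazy caching of format data in the ios_base::pword() array,
    described earlier in this chapter, can be sketched using only the
    public ios_base interface (xalloc, pword, iword, and
    register_callback). This is an illustration under stated
    assumptions, not the library's _Format_cache<>: the names
    format_cache, cache_index, cache_callback, and get_cache are
    hypothetical, and only two numpunct<char> values are cached.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical per-stream cache of data gathered from locale facets.
struct format_cache {
    char decimal_point;
    char thousands_sep;
};

// One pword()/iword() index, reserved once for all streams.
inline int cache_index()
{
    static const int index = std::ios_base::xalloc();
    return index;
}

// Discards the cache when the stream is destroyed or a locale is imbued.
inline void cache_callback(std::ios_base::event ev, std::ios_base& ios,
                           int index)
{
    void*& slot = ios.pword(index);
    if (ev == std::ios_base::copyfmt_event) {
        slot = 0;  // copyfmt copied another stream's pointer; do not own it
        return;
    }
    delete static_cast<format_cache*>(slot);  // erase_event or imbue_event
    slot = 0;
}

// Returns the cache for ios, building it on first use.
inline const format_cache& get_cache(std::ios_base& ios)
{
    const int index = cache_index();
    if (ios.pword(index) == 0) {
        if (ios.iword(index) == 0) {  // callback not yet registered
            ios.register_callback(cache_callback, index);
            ios.iword(index) = 1;
        }
        const std::numpunct<char>& np =
            std::use_facet<std::numpunct<char> >(ios.getloc());
        format_cache* cache = new format_cache;
        cache->decimal_point = np.decimal_point();
        cache->thousands_sep = np.thousands_sep();
        ios.pword(index) = cache;
    }
    return *static_cast<format_cache*>(ios.pword(index));
}
```

    The sketch re-reads pword() after each other ios_base call because
    the references it returns may be invalidated by later operations on
    the same object. An implementation inside the library would have
    more direct ways to reach its own stream state.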

    Chapter 23  Containers
    ----------------------
    Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>

    All the components in chapter 23 are implemented in the SGI STL.
    They are "mostly complete"; they include a large number of
    nonconforming extensions which must be wrapped. Some of these
    are used internally and must be renamed or duplicated.

    The SGI components are optimized for large-memory environments. For
    embedded targets, different criteria might be more appropriate. Users
    will want to be able to tune this behavior. We should provide
    ways for users to compile the library with different memory usage
    characteristics.

    A lot more work is needed on factoring out common code from different
    specializations to reduce code size here and in chapter 25. The
    easiest fix for this would be a compiler/ABI improvement that allows
    the compiler to recognize when a specialization depends only on the
    size (or other gross quality) of a template argument, and allow the
    linker to share the code with similar specializations. In its
    absence, many of the algorithms and containers can be
    partial-specialized, at least for the case of pointers, but this only
    solves a small part of the problem. Use of a type_traits-style template
    allows a few more optimization opportunities, more if the compiler
    can generate the specializations automatically.

    As an optimization, containers can specialize on the default allocator
    and bypass it, or take advantage of details of its implementation
    after it has been improved upon.

    Replacing the vector iterators, which currently are simple element
    pointers, with class objects would greatly increase the safety of the
    client interface, and also permit a "debug" mode in which range,
    ownership, and validity are rigorously checked.
    The current use of
    pointers for iterators is evil.

    As mentioned for chapter 24, the deque iterator is a good example of
    an opportunity to implement a "staged" iterator that would benefit
    from specializations of some algorithms.

    Chapter 24  Iterators
    ---------------------
    Headers: <iterator>

    Standard iterators are "mostly complete", with the exception of
    the stream iterators, which are not yet templatized on the
    stream type. Also, the base class template iterator<> appears
    to be wrong, so everything derived from it must also be wrong,
    currently.

    The streambuf iterators (currently located in stl/bits/std_iterator.h,
    but should be under bits/) can be rewritten to take advantage of
    friendship with the streambuf implementation.

    Matt Austern has identified opportunities where certain iterator
    types, particularly including streambuf iterators and deque
    iterators, have a "two-stage" quality, such that an intermediate
    limit can be checked much more quickly than the true limit on
    range operations. If identified with a member of iterator_traits,
    algorithms may be specialized for this case. Of course the
    iterators that have this quality can be identified by specializing
    a traits class.

    Many of the algorithms must be specialized for the streambuf
    iterators, to take advantage of block-mode operations, in order
    to allow iostream/locale operations' performance not to suffer.
    It may be that they could be treated as staged iterators and
    take advantage of those optimizations.

    Chapter 25  Algorithms
    ----------------------
    Headers: <algorithm>
    C headers: <cstdlib> (also in 18, 21, 26)

    The algorithms are "mostly complete". As mentioned above, they
    are optimized for speed at the expense of code and data size.

Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.

The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.

Chapter 26 Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also 18, 21, 25)

Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate. It
is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.

The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.
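
A minimal sketch of such a re-implementation, assuming (as the text
does) that the C++ result types are distinct from the C ones. The
namespace sketch is a hypothetical stand-in for std, and this is not
the library's code. It relies on integer division truncating toward
zero, which C++ guarantees from C++11 onward.

```cpp
#include <cassert>

namespace sketch  // hypothetical stand-in for namespace std
{
  // Distinct C++ result types, unrelated to the C library's ::div_t
  // and ::ldiv_t.
  struct div_t  { int  quot; int  rem; };
  struct ldiv_t { long quot; long rem; };

  // The quotient truncates toward zero and the remainder takes the
  // sign of the numerator, so quot * denom + rem == numer.
  inline div_t div(int numer, int denom)
  {
    div_t r;
    r.quot = numer / denom;
    r.rem  = numer % denom;
    return r;
  }

  inline ldiv_t ldiv(long numer, long denom)
  {
    ldiv_t r;
    r.quot = numer / denom;
    r.rem  = numer % denom;
    return r;
  }
}
```

For example, sketch::div(-7, 2) yields quot == -3 and rem == -1. As
with the C functions, a zero denominator is undefined behavior.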

Chapter 27 Iostreams
---------------------
Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
<iomanip> <sstream> <fstream>
C headers: <cstdio> <cwchar> (also in 21)

Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream> and <fstream> have been
started; basic_filebuf<> "write" functions have been implemented just
enough to do "hello, world".

Most of the istream and ostream operators << and >> (with the exception
of the op<<(integer) ones) have not been changed to use locale primitives,
sentry objects, or char_traits members.

All these templates should be manually instantiated for char and
wchar_t in a way that links only used members into user programs.

Streambuf is fertile ground for optimization extensions. An extended
interface giving iterator access to its internal buffer would be very
useful for other library components.

Iostream operations (primarily operators << and >>) can take advantage
of the case where user code has not specified a locale, and bypass locale
operations entirely. The current implementation of op<</num_put<>::put,
for the integer types, demonstrates how they can cache encoding details
from the locale on each operation. There is lots more room for
optimization in this area.

The definition of the relationship between the standard streams
cout et al. and stdout et al. requires something like a "stdiobuf".
The SGI solution of using double-indirection to actually use a
stdio FILE object for buffering is unsatisfactory, because it
interferes with peephole loop optimizations.

The <sstream> header work has begun.
stringbuf can benefit from
friendship with basic_string<> and basic_string<>::_Rep to use
those objects directly as buffers, and avoid allocating and making
copies.

The basic_filebuf<> template is a complex beast. It is specified to
use the locale facet codecvt<> to translate characters between native
files and the locale character encoding. In general this involves
two buffers, one of "char" representing the file and another of
"char_type", for the stream, with codecvt<> translating. The process
is complicated by the variable-length nature of the translation, and
the need to seek to corresponding places in the two representations.
For the case of basic_filebuf<char>, when no translation is needed,
a single buffer suffices. A specialized filebuf can be used to reduce
code space overhead when no locale has been imbued. Matt Austern's
work at SGI will be useful, perhaps directly as a source of code, or
at least as an example to draw on.

Filebuf, almost uniquely (cf. operator new), depends heavily on
underlying environmental facilities. In current releases iostream
depends fairly heavily on libio constant definitions, but it should
be made independent. It also depends on operating system primitives
for file operations. There is immense room for optimizations using
(e.g.) mmap for reading. The shadow/ directory wraps, besides the
standard C headers, the libio.h and unistd.h headers, for use mainly
by filebuf. These wrappings have not been completed, though there
is scaffolding in place.

The encapsulation of certain C header <cstdio> names presents an
interesting problem. It is possible to define an inline std::fprintf()
implemented in terms of the 'extern "C"' vfprintf(), but there is no
standard vfscanf() to use to implement std::fscanf().
It appears that
vfscanf must be re-implemented in C++ for targets where no vfscanf
extension has been defined. This is interesting in that it seems
to be the only significant case in the C library where this kind of
rewriting is necessary. (Of course Glibc provides the vfscanf()
extension.) (The functions related to exit() must be rewritten
for other reasons.)


Annex D
-------
Headers: <strstream>

Annex D defines many non-library features, and many minor
modifications to various headers, and a complete header.
It is "mostly done", except that the libstdc++-2 <strstream>
header has not been adopted into the library, or checked to
verify that it matches the draft in those details that were
clarified by the committee. Certainly it must at least be
moved into the std namespace.

We still need to wrap all the deprecated features in #if guards
so that pedantic compile modes can detect their use.

Nonstandard Extensions
----------------------
Headers: <iostream.h> <strstream.h> <hash> <rbtree>
<pthread_alloc> <stdiobuf> (etc.)

User code has come to depend on a variety of nonstandard components
that we must not omit. Much of this code can be adopted from
libstdc++-v2 or from the SGI STL. This particularly includes
<iostream.h>, <strstream.h>, and various SGI extensions such
as <hash_map.h>. Many of these are already placed in the
subdirectories ext/ and backward/. (Note that it is better to
include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
to search the subdirectory itself via a "-I" directive.)
</literallayout>
</section>

</appendix>