<appendix xmlns="http://docbook.org/ns/docbook" version="5.0"
          xml:id="appendix.contrib" xreflabel="Contributing">
<?dbhtml filename="appendix_contributing.html"?>

<info><title>
  Contributing
  <indexterm>
    <primary>Appendix</primary>
    <secondary>Contributing</secondary>
  </indexterm>
</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>library</keyword>
  </keywordset>
</info>

<para>
  The GNU C++ Library is part of GCC and follows the same development model,
  so the general rules for
  <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html">contributing
  to GCC</link> apply. Active
  contributors are assigned maintainership responsibility, and given
  write access to the source repository. First-time contributors
  should follow this procedure:
</para>

<section xml:id="contrib.list" xreflabel="Contributor Checklist"><info><title>Contributor Checklist</title></info>

  <section xml:id="list.reading"><info><title>Reading</title></info>

    <itemizedlist>
      <listitem>
        <para>
          Get and read the relevant sections of the C++ language
          specification. Copies of the full ISO 14882 standard are
          available online via the ISO mirror site for committee
          members. Non-members, or those who have not paid for the
          privilege of sitting on the committee and sustained their
          two-meeting commitment for voting rights, may get a copy of
          the standard from their respective national standards
          organization. In the USA, this national standards
          organization is
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.ansi.org">ANSI</link>.
          (And if you've already registered with them you can
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://webstore.ansi.org/RecordDetail.aspx?sku=INCITS%2fISO%2fIEC+14882-2012">buy the standard online</link>.)
        </para>
      </listitem>

      <listitem>
        <para>
          The library working group bugs, and known defects, can
          be obtained here:
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</link>
        </para>
      </listitem>

      <listitem>
        <para>
          Peruse
          the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/">GNU
          Coding Standards</link>, and chuckle when you hit the part
          about <quote>Using Languages Other Than C</quote>.
        </para>
      </listitem>

      <listitem>
        <para>
          Be familiar with the extensions that preceded these
          general GNU rules. These style issues for libstdc++ can be
          found in <link linkend="contrib.coding_style">Coding Style</link>.
        </para>
      </listitem>

      <listitem>
        <para>
          And last but certainly not least, read the
          library-specific information found in
          <link linkend="appendix.porting">Porting and Maintenance</link>.
        </para>
      </listitem>
    </itemizedlist>

  </section>
  <section xml:id="list.copyright"><info><title>Assignment</title></info>

    <para>
      See the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html#legal">legal prerequisites</link> for all GCC contributions.
    </para>

    <para>
      Historically, the libstdc++ assignment form added the following
      question:
    </para>

    <para>
      <quote>
        Which Belgian comic book character is better, Tintin or Asterix, and
        why?
      </quote>
    </para>

    <para>
      While not strictly necessary, humoring the maintainers and answering
      this question would be appreciated.
    </para>

    <para>
      Please contact
      Paolo Carlini at <email>paolo.carlini@oracle.com</email>
      or
      Jonathan Wakely at <email>jwakely+assign@redhat.com</email>
      if you are confused about the assignment or have general licensing
      questions.
      When requesting an assignment form from
      <email>assign@gnu.org</email>, please CC the libstdc++
      maintainers above so that progress can be monitored.
    </para>
  </section>

  <section xml:id="list.getting"><info><title>Getting Sources</title></info>

    <para>
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/svnwrite.html">Getting write access
        (look for "Write after approval")</link>
    </para>
  </section>

  <section xml:id="list.patches"><info><title>Submitting Patches</title></info>

    <para>
      Every patch must have several pieces of information before it can be
      properly evaluated. Ideally (and to ensure the fastest possible
      response from the maintainers) it would have all of these pieces:
    </para>

    <itemizedlist>
      <listitem>
        <para>
          A description of the bug and how your patch fixes this
          bug. For new features, a description of the feature and your
          implementation.
        </para>
      </listitem>

      <listitem>
        <para>
          A ChangeLog entry as plain text; see the various
          ChangeLog files for format and content. If you are
          using Emacs as your editor, simply position the insertion
          point at the beginning of your change and hit C-x 4 a to bring
          up the appropriate ChangeLog entry. See--magic! Similar
          functionality also exists for vi.
        </para>
      </listitem>

      <listitem>
        <para>
          A testsuite submission or sample program that will
          easily and simply show the existing error or test new
          functionality.
        </para>
      </listitem>

      <listitem>
        <para>
          The patch itself. If you are accessing the SVN
          repository use <command>svn update; svn diff NEW</command>;
          else, use <command>diff -cp OLD NEW</command> ... If your
          version of diff does not support these options, then get the
          latest version of GNU
          diff.
          The <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/wiki/SvnTricks">SVN
          Tricks</link> wiki page has information on customising the
          output of <code>svn diff</code>.
        </para>
      </listitem>

      <listitem>
        <para>
          When you have all these pieces, bundle them up in a
          mail message and send it to libstdc++@gcc.gnu.org. All
          patches and related discussion should be sent to the
          libstdc++ mailing list.
        </para>
      </listitem>
    </itemizedlist>

  </section>

</section>

<section xml:id="contrib.organization" xreflabel="Source Organization"><info><title>Directory Layout and Source Conventions</title></info>
  <?dbhtml filename="source_organization.html"?>

  <para>
    The <filename class="directory">libstdc++-v3</filename> directory in the
    GCC sources contains the files needed to create the GNU C++ Library.
  </para>

<para>
It has subdirectories:
</para>

<variablelist>
  <varlistentry>
    <term><filename class="directory">doc</filename></term>
    <listitem>
      Files in HTML and text format that document usage, quirks of the
      implementation, and contributor checklists.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">include</filename></term>
    <listitem>
      All header files for the C++ library are within this directory,
      modulo specific runtime-related files that are in the libsupc++
      directory.

      <variablelist>
      <varlistentry>
        <term><filename class="directory">include/std</filename></term>
        <listitem>
          Files meant to be found by <code>#include <name></code> directives
          in standard-conforming user programs.
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">include/c</filename></term>
        <listitem>
          Headers intended to directly include standard C headers.
          [NB: this can be enabled via <option>--enable-cheaders=c</option>]
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">include/c_global</filename></term>
        <listitem>
          Headers intended to include standard C headers in
          the global namespace, and put select names into the <code>std::</code>
          namespace. [NB: this is the default, and is the same as
          <option>--enable-cheaders=c_global</option>]
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">include/c_std</filename></term>
        <listitem>
          Headers intended to include standard C headers
          already in namespace std, and put select names into the <code>std::</code>
          namespace. [NB: this is the same as
          <option>--enable-cheaders=c_std</option>]
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">include/bits</filename></term>
        <listitem>
          Files included by standard headers and by other files in
          the bits directory.
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">include/backward</filename></term>
        <listitem>
          Headers provided for backward compatibility, such as
          <filename class="headerfile"><backward/hash_map></filename>.
          They are not used in this library.
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">include/ext</filename></term>
        <listitem>
          Headers that define extensions to the standard library. No
          standard header refers to any of them, in theory (there are some
          exceptions).
        </listitem>
      </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">scripts</filename></term>
    <listitem>
      Scripts that are used during the configure, build, make, or test
      process.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">src</filename></term>
    <listitem>
      Files that are used in constructing the library, but are not
      installed.

      <variablelist>
      <varlistentry>
        <term><filename class="directory">src/c++98</filename></term>
        <listitem>
          Source files compiled using <option>-std=gnu++98</option>.
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">src/c++11</filename></term>
        <listitem>
          Source files compiled using <option>-std=gnu++11</option>.
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">src/filesystem</filename></term>
        <listitem>
          Source files for the Filesystem TS.
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><filename class="directory">src/shared</filename></term>
        <listitem>
          Source code included by other files under both
          <filename class="directory">src/c++98</filename> and
          <filename class="directory">src/c++11</filename>.
        </listitem>
      </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">testsuite/[backward, demangle, ext, performance, thread, 17_* to 30_*]</filename></term>
    <listitem>
      Test programs are here, and may be used to begin to exercise the
      library. Support for "make check" and "make check-install" is
      complete, and runs through all the subdirectories here when either
      command is issued from the build directory. Please note that
      "make check" requires DejaGnu 1.4 or later to be installed,
      or for extra <link linkend="test.run.permutations">permutations</link>
      DejaGnu 1.5.3 or later.
    </listitem>
  </varlistentry>
</variablelist>

<para>
Other subdirectories contain variant versions of certain files
that are meant to be copied or linked by the configure script.
Currently these are:
<literallayout><filename class="directory">config/abi</filename>
<filename class="directory">config/allocator</filename>
<filename class="directory">config/cpu</filename>
<filename class="directory">config/io</filename>
<filename class="directory">config/locale</filename>
<filename class="directory">config/os</filename>
</literallayout>
</para>

<para>
In addition, a subdirectory holds the convenience library libsupc++.
</para>

<variablelist>
<varlistentry>
  <term><filename class="directory">libsupc++</filename></term>
  <listitem>
    Contains the runtime library for C++, including exception
    handling and memory allocation and deallocation, RTTI, terminate
    handlers, etc.
  </listitem>
</varlistentry>
</variablelist>

<para>
Note that glibc also has a <filename class="directory">bits/</filename>
subdirectory. We need to be careful not to collide with names in its
<filename class="directory">bits/</filename> directory.
Another solution would be to rename <filename class="directory">bits</filename>
to (e.g.) <filename class="directory">cppbits</filename>.
</para>

<para>
In files throughout the system, lines marked with an "XXX" indicate
a bug or incompletely-implemented feature. Lines marked "XXX MT"
indicate a place that may require attention for multi-thread safety.
</para>

</section>

<section xml:id="contrib.coding_style" xreflabel="Coding Style"><info><title>Coding Style</title></info>
  <?dbhtml filename="source_code_style.html"?>

  <para>
  </para>
  <section xml:id="coding_style.bad_identifiers"><info><title>Bad Identifiers</title></info>

    <para>
      Identifiers that conflict and should be avoided.
    </para>

    <literallayout class="normal">
      This is the list of names <quote>reserved to the
      implementation</quote> that have been claimed by certain
      compilers and system headers of interest, and should not be used
      in the library. It will grow, of course. We generally are
      interested in names that are not all-caps, except for those like
      "_T"

      For Solaris:
      _B
      _C
      _L
      _N
      _P
      _S
      _U
      _X
      _E1
      ..
      _E24

      Irix adds:
      _A
      _G

      MS adds:
      _T

      BSD adds:
      __used
      __unused
      __inline
      _Complex
      __istype
      __maskrune
      __tolower
      __toupper
      __wchar_t
      __wint_t
      _res
      _res_ext
      __tg_*

      SPU adds:
      __ea

      For GCC:

      [Note that this list is out of date. It applies to the old
      name-mangling; in G++ 3.0 and higher a different name-mangling is
      used. In addition, many of the bugs relating to G++ interpreting
      these names as operators have been fixed.]

      The full set of __* identifiers (combined from gcc/cp/lex.c and
      gcc/cplus-dem.c) that are either old or new, but are definitely
      recognized by the demangler, is:

      __aa
      __aad
      __ad
      __addr
      __adv
      __aer
      __als
      __alshift
      __amd
      __ami
      __aml
      __amu
      __aor
      __apl
      __array
      __ars
      __arshift
      __as
      __bit_and
      __bit_ior
      __bit_not
      __bit_xor
      __call
      __cl
      __cm
      __cn
      __co
      __component
      __compound
      __cond
      __convert
      __delete
      __dl
      __dv
      __eq
      __er
      __ge
      __gt
      __indirect
      __le
      __ls
      __lt
      __max
      __md
      __method_call
      __mi
      __min
      __minus
      __ml
      __mm
      __mn
      __mult
      __mx
      __ne
      __negate
      __new
      __nop
      __nt
      __nw
      __oo
      __op
      __or
      __pl
      __plus
      __postdecrement
      __postincrement
      __pp
      __pt
      __rf
      __rm
      __rs
      __sz
      __trunc_div
      __trunc_mod
      __truth_andif
      __truth_not
      __truth_orif
      __vc
      __vd
      __vn

      SGI badnames:
      __builtin_alloca
      __builtin_fsqrt
      __builtin_sqrt
      __builtin_fabs
      __builtin_dabs
      __builtin_cast_f2i
      __builtin_cast_i2f
      __builtin_cast_d2ll
      __builtin_cast_ll2d
      __builtin_copy_dhi2i
      __builtin_copy_i2dhi
      __builtin_copy_dlo2i
      __builtin_copy_i2dlo
      __add_and_fetch
      __sub_and_fetch
      __or_and_fetch
      __xor_and_fetch
      __and_and_fetch
      __nand_and_fetch
      __mpy_and_fetch
      __min_and_fetch
      __max_and_fetch
      __fetch_and_add
      __fetch_and_sub
      __fetch_and_or
      __fetch_and_xor
      __fetch_and_and
      __fetch_and_nand
      __fetch_and_mpy
      __fetch_and_min
      __fetch_and_max
      __lock_test_and_set
      __lock_release
      __lock_acquire
      __compare_and_swap
      __synchronize
      __high_multiply
      __unix
      __sgi
      __linux__
      __i386__
      __i486__
      __cplusplus
      __embedded_cplusplus
      // long double conversion members mangled as __opr
      // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
      __opr
    </literallayout>
  </section>

  <section xml:id="coding_style.example"><info><title>By Example</title></info>

    <literallayout class="normal">
      This library is written to appropriate C++ coding standards. As such,
      it is intended to precede the recommendations of the GNU Coding
      Standard, which can be referenced in full here:

      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/standards.html#Formatting">http://www.gnu.org/prep/standards/standards.html#Formatting</link>

      The rest of this is also interesting reading, but skip the "Design
      Advice" part.

      The GCC coding conventions are here, and are also useful:
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/codingconventions.html">http://gcc.gnu.org/codingconventions.html</link>

      In addition, because it doesn't seem to be stated explicitly anywhere
      else, there is an 80 column source limit.

      <filename>ChangeLog</filename> entries for member functions should use the
      classname::member function name syntax as follows:

<code>
1999-04-15  Dennis Ritchie  <dr@att.com>

        * src/basic_file.cc (__basic_file::open): Fix thinko in
        _G_HAVE_IO_FILE_OPEN bits.
</code>

      Notable areas of divergence from what may be previous local practice
      (particularly for GNU C) include:

      01. Pointers and references
      <code>
        char* p = "flop";
        char& c = *p;
          -NOT-
        char *p = "flop";  // wrong
        char &c = *p;      // wrong
      </code>

        Reason: In C++, definitions are mixed with executable code. Here,
        <code>p</code> is being initialized, not <code>*p</code>. This is near-universal
        practice among C++ programmers; it is normal for C hackers
        to switch spontaneously as they gain experience.

      02. Operator names and parentheses
      <code>
        operator==(type)
          -NOT-
        operator == (type)  // wrong
      </code>

        Reason: The <code>==</code> is part of the function name. Separating
        it makes the declaration look like an expression.

      03. Function names and parentheses
      <code>
        void mangle()
          -NOT-
        void mangle ()  // wrong
      </code>

        Reason: no space before parentheses (except after a control-flow
        keyword) is near-universal practice for C++. It identifies the
        parentheses as the function-call operator or declarator, as
        opposed to an expression or other overloaded use of parentheses.

      04. Template function indentation
      <code>
        template<typename T>
          void
          template_function(args)
          { }
          -NOT-
        template<class T>
        void template_function(args) {};
      </code>

        Reason: In class definitions, without indentation whitespace is
        needed both above and below the declaration to distinguish
        it visually from other members. (Also, re: "typename"
        rather than "class".) <code>T</code> often could be <code>int</code>, which is
        not a class. ("class", here, is an anachronism.)

      05. Template class indentation
      <code>
        template<typename _CharT, typename _Traits>
          class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
        class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
          class basic_ios : public ios_base
        {
          public:
            // Types:
        };
      </code>

      06. Enumerators
      <code>
        enum
        {
          space = _ISspace,
          print = _ISprint,
          cntrl = _IScntrl
        };
          -NOT-
        enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
      </code>

      07. Member initialization lists
        All one line, separate from class name.

      <code>
        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
          -NOT-
        gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
      </code>

      08. Try/Catch blocks
      <code>
        try
          {
            //
          }
        catch (...)
          {
            //
          }
          -NOT-
        try {
          //
        } catch(...) {
          //
        }
      </code>

      09. Member function declarations and definitions
        Keywords such as extern, static, export, explicit, inline, etc.
        go on the line above the function name. Thus

      <code>
        virtual int
        foo()
          -NOT-
        virtual int foo()
      </code>

        Reason: GNU coding conventions dictate that return types for functions
        are on a separate line from the function name and parameter list
        for definitions. For C++, where we have member functions that can
        be either inline definitions or declarations, keeping to this
        standard allows all member function names for a given class to be
        aligned to the same margin, increasing readability.


      10. Invocation of member functions with "this->"
        For non-uglified names, use <code>this->name</code> to call the function.

      <code>
        this->sync()
          -NOT-
        sync()
      </code>

        Reason: Koenig lookup.

      11. Namespaces
      <code>
        namespace std
        {
          blah blah blah;
        } // namespace std

          -NOT-

        namespace std {
          blah blah blah;
        } // namespace std
      </code>

      12. Spacing under protected and private in class declarations:
        space above, none below
        i.e.

      <code>
        public:
          int foo;

          -NOT-
        public:

          int foo;
      </code>

      13. Spacing WRT return statements.
        no extra spacing before returns, no parentheses
        i.e.

      <code>
        }
        return __ret;

          -NOT-
        }

        return __ret;

          -NOT-

        }
        return (__ret);
      </code>


      14. Location of global variables.
        All global variables of class type, whether in the "user visible"
        space (e.g., <code>cin</code>) or the implementation namespace, must be defined
        as a character array with the appropriate alignment and then later
        re-initialized to the correct value.

        This is due to startup issues on certain platforms, such as AIX.
        For more explanation and examples, see <filename>src/globals.cc</filename>. All such
        variables should be contained in that file, for simplicity.

      15. Exception abstractions
        Use the exception abstractions found in <filename class="headerfile">functexcept.h</filename>, which allow
        C++ programmers to use this library with <literal>-fno-exceptions</literal>. (Even if
        that is rarely advisable, it's a necessary evil for backwards
        compatibility.)

      16. Exception error messages
        All start with the name of the function where the exception is
        thrown, and then (optional) descriptive text is added. Example:

      <code>
        __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
      </code>

        Reason: The verbose terminate handler prints out <code>exception::what()</code>,
        as well as the typeinfo for the thrown exception. As this is the
        default terminate handler, by putting location info into the
        exception string, a very useful error message is printed out for
        uncaught exceptions. So useful, in fact, that non-programmers can
        give useful error messages, and programmers can intelligently
        speculate what went wrong without even using a debugger.

      17. The doxygen style guide to comments is a separate document,
        see index.

      The library currently has a mixture of GNU-C and modern C++ coding
      styles. The GNU C usages will be combed out gradually.

      Name patterns:

      For nonstandard names appearing in Standard headers, we are constrained
      to use names that begin with underscores. This is called "uglification".
      The convention is:

      Local and argument names: <literal>__[a-z].*</literal>

        Examples: <code>__count __ix __s1</code>

      Type names and template formal-argument names: <literal>_[A-Z][^_].*</literal>

        Examples: <code>_Helper _CharT _N</code>

      Member data and function names: <literal>_M_.*</literal>

        Examples: <code>_M_num_elements _M_initialize()</code>

      Static data members, constants, and enumerations: <literal>_S_.*</literal>

        Examples: <code>_S_max_elements _S_default_value</code>

      Don't use names in the same scope that differ only in the prefix,
      e.g. _S_top and _M_top. See <link linkend="coding_style.bad_identifiers">BADNAMES</link> for a list of forbidden names.
      (The most tempting of these seem to be "_T" and "__sz".)

      Names must never have "__" internally; it would confuse name
      unmanglers on some targets. Also, never use "__[0-9]", same reason.

      --------------------------

      [BY EXAMPLE]
      <code>

      #ifndef _HEADER_
      #define _HEADER_ 1

      namespace std
      {
        class gribble
        {
        public:
          gribble() throw();

          gribble(const gribble&);

          explicit
          gribble(int __howmany);

          gribble&
          operator=(const gribble&);

          virtual
          ~gribble() throw();

          // Start with a capital letter, end with a period.
          inline void
          public_member(const char* __arg) const;

          // In-class function definitions should be restricted to one-liners.
          int
          one_line() { return 0; }

          int
          two_lines(const char* arg)
          { return strchr(arg, 'a'); }

          inline int
          three_lines();  // inline, but defined below.

          // Note indentation.
          template<typename _Formal_argument>
            void
            public_template() const throw();

          template<typename _Iterator>
            void
            other_template();

        private:
          class _Helper;

          int _M_private_data;
          int _M_more_stuff;
          _Helper* _M_helper;
          int _M_private_function();

          enum _Enum
            {
              _S_one,
              _S_two
            };

          static void
          _S_initialize_library();
        };

        // More-or-less-standard language features described by lack, not presence.
      # ifndef _G_NO_LONGLONG
        extern long long _G_global_with_a_good_long_name;  // avoid globals!
      # endif

        // Avoid in-class inline definitions, define separately;
        // likewise for member class definitions:
        inline int
        gribble::public_member() const
        { int __local = 0; return __local; }

        class gribble::_Helper
        {
          int _M_stuff;

          friend class gribble;
        };
      }

      // Names beginning with "__": only for arguments and
      //   local variables; never use "__" in a type name, or
      //   within any name; never use "__[0-9]".

      #endif /* _HEADER_ */


      namespace std
      {
        template<typename T>  // notice: "typename", not "class", no space
          long_return_value_type<with_many, args>
          function_name(char* pointer,               // "char *pointer" is wrong.
                        char* argument,
                        const Reference& ref)
          {
            // int a_local;  /* wrong; see below. */
            if (test)
            {
              nested code
            }

            int a_local = 0;  // declare variable at first use.

            // char a, b, *p;  /* wrong */
            char a = 'a';
            char b = a + 1;
            char* c = "abc";  // each variable goes on its own line, always.

            // except maybe here...
            for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1) {
              // ...
            }
          }

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

        int
        gribble::three_lines()
        {
          // doesn't fit in one line.
        }
      } // namespace std
      </code>
    </literallayout>
  </section>
</section>

<section xml:id="contrib.design_notes" xreflabel="Design Notes"><info><title>Design Notes</title></info>
  <?dbhtml filename="source_design_notes.html"?>

  <para>
  </para>

  <literallayout class="normal">

    The Library
    -----------

    This paper covers two major areas:

      - Features and policies not mentioned in the standard that
      the quality of the library implementation depends on, including
      extensions and "implementation-defined" features;

      - Plans for required but unimplemented library features and
      optimizations to them.

    Overhead
    --------

    The standard defines a large library, much larger than the standard
    C library. A naive implementation would suffer substantial overhead
    in compile time, executable size, and speed, rendering it unusable
    in many (particularly embedded) applications. The alternative demands
    care in construction, and some compiler support, but there is no
    need for library subsets.

    What are the sources of this overhead? There are four main causes:

      - The library is specified almost entirely as templates, which
      with current compilers must be included in-line, resulting in
      very slow builds as tens or hundreds of thousands of lines
      of function definitions are read for each user source file.
      Indeed, the entire SGI STL, as well as the dos Reis valarray,
      are provided purely as header files, largely for simplicity in
      porting. Iostream/locale is (or will be) as large again.

      - The library is very flexible, specifying a multitude of hooks
      where users can insert their own code in place of defaults.
      When these hooks are not used, any time and code expended to
      support that flexibility is wasted.

      - Templates are often described as causing "code bloat". In
      practice, this refers (when it refers to anything real) to several
      independent processes. First, when a class template is manually
      instantiated in its entirety, current compilers place the definitions
      for all members in a single object file, so that a program linking
      to one member gets definitions of all. Second, template functions
      which do not actually depend on the template argument are, under
      current compilers, generated anew for each instantiation, rather
      than being shared with other instantiations. Third, some of the
      flexibility mentioned above comes from virtual functions (both in
      regular classes and template classes) which current linkers add
      to the executable file even when they manifestly cannot be called.

      - The library is specified to use a language feature, exceptions,
      which in the current gcc compiler ABI imposes a run time and
      code space cost to handle the possibility of exceptions even when
      they are not used. Under the new ABI (accessed with -fnew-abi),
      there is a space overhead and a small reduction in code efficiency
      resulting from lost optimization opportunities associated with
      non-local branches associated with exceptions.

    What can be done to eliminate this overhead? A variety of coding
    techniques, and compiler, linker and library improvements and
    extensions may be used, as covered below. Most are not difficult,
    and some are already implemented in varying degrees.

    Overhead: Compilation Time
    --------------------------

    Providing "ready-instantiated" template code in object code archives
    allows us to avoid generating and optimizing template instantiations
    in each compilation unit which uses them.
    However, the number of such
    instantiations that are useful to provide is limited, and anyway this
    is not enough, by itself, to minimize compilation time. In particular,
    it does not reduce time spent parsing conforming headers.

    Quicker header parsing will depend on library extensions and compiler
    improvements. One approach is some variation on the techniques
    previously marketed as "pre-compiled headers", now standardized as
    support for the "export" keyword. "Exported" template definitions
    can be placed (once) in a "repository" -- really just a library, but
    of template definitions rather than object code -- to be drawn upon
    at link time when an instantiation is needed, rather than placed in
    header files to be parsed along with every compilation unit.

    Until "export" is implemented we can put some of the lengthy template
    definitions in #if guards or alternative headers so that users can skip
    over the full definitions when they need only the ready-instantiated
    specializations.

    To be precise, this means that certain headers which define
    templates which users normally use only for certain arguments
    can be instrumented to avoid exposing the template definitions
    to the compiler unless a macro is defined. For example, in
    <string>, we might have:

      template <class _CharT, ... > class basic_string {
        ... // member declarations
      };
      ... // operator declarations

      #ifdef _STRICT_ISO_
      # if _G_NO_TEMPLATE_EXPORT
      #   include <bits/std_locale.h>  // headers needed by definitions
      #   ...
      #   include <bits/string.tcc>  // member and global template definitions.
      # endif
      #endif

    Users who compile without specifying a strict-ISO-conforming flag
    would not see many of the template definitions they now see, and rely
    instead on ready-instantiated specializations in the library.
This
technique would be useful for the following substantial components:
string, locale/iostreams, valarray. It would *not* be useful or
usable with the following: containers, algorithms, iterators,
allocator. Since these constitute a large (though decreasing)
fraction of the library, the benefit the technique offers is
limited.

The language specifies the semantics of the "export" keyword, but
the gcc compiler does not yet support it. When it does, problems
with large template inclusions can largely disappear, given some
minor library reorganization, along with the need for the apparatus
described above.

Overhead: Flexibility Cost
--------------------------

The library offers many places where users can specify operations
to be performed by the library in place of defaults. Sometimes
this seems to require that the library use a more-roundabout, and
possibly slower, way to accomplish the default requirements than
would be used otherwise.

The primary protection against this overhead is thorough compiler
optimization, to crush out layers of inline function interfaces.
Kuck & Associates has demonstrated the practicality of this kind
of optimization.

The second line of defense against this overhead is explicit
specialization. By defining helper function templates, and writing
specialized code for the default case, overhead can be eliminated
for that case without sacrificing flexibility. This takes full
advantage of any ability of the optimizer to crush out degenerate
code.

The library specifies many virtual functions which current linkers
load even when they cannot be called. Some minor improvements to the
compiler and to ld would eliminate any such overhead by simply
omitting virtual functions that the complete program does not call.
A prototype of this work has already been done. For targets where
GNU ld is not used, a "pre-linker" could do the same job.

The main areas in the standard interface where user flexibility
can result in overhead are:

- Allocators: Containers are specified to use user-definable
  allocator types and objects, making tuning for the container
  characteristics tricky.

- Locales: the standard specifies locale objects used to implement
  iostream operations, involving many virtual functions which use
  streambuf iterators.

- Algorithms and containers: these may be instantiated on any type,
  frequently duplicating code for identical operations.

- Iostreams and strings: users are permitted to use these on their
  own types, and specify the operations the stream must use on these
  types.

Note that these sources of overhead are _avoidable_. The techniques
to avoid them are covered below.

Code Bloat
----------

In the SGI STL, and in some other headers, many of the templates
are defined "inline" -- either explicitly or by their placement
in class definitions -- which should not be inline. This is a
source of code bloat. Matt had remarked that he was relying on
the compiler to recognize what was too big to benefit from inlining,
and generate it out-of-line automatically. However, this also can
result in code bloat except where the linker can eliminate the extra
copies.

Fixing these cases will require an audit of all inline functions
defined in the library to determine which merit inlining, and moving
the rest out of line. This is an issue mainly in clauses 23, 25, and
27. Of course it can be done incrementally, and we should generally
accept patches that move large functions out of line and into ".tcc"
files, which can later be pulled into a repository.
Compiler/linker
improvements to recognize very large inline functions and move them
out-of-line, but shared among compilation units, could make this
work unnecessary.

Pre-instantiating template specializations currently produces large
amounts of dead code which bloats statically linked programs. The
current state of the static library, libstdc++.a, is intolerable on
this account, and will fuel further confused speculation about a need
for a library "subset". A compiler improvement that treats each
instantiated function as a separate object file, for linking purposes,
would be one solution to this problem. An alternative would be to
split up the manual instantiation files into dozens upon dozens of
little files, each compiled separately, but an abortive attempt at
this was done for <string> and, though it is far from complete, it
is already a nuisance. A better interim solution (just until we have
"export") is badly needed.

When building a shared library, the current compiler/linker cannot
automatically generate the instantiations needed. This creates a
miserable situation; it means any time something is changed in the
library, before a shared library can be built someone must manually
copy the declarations of all templates that are needed by other parts
of the library to an "instantiation" file, and add it to the build
system to be compiled and linked to the library. This process is
readily automated, and should be automated as soon as possible.
Users building their own shared libraries experience identical
frustrations.

Sharing common aspects of template definitions among instantiations
can radically reduce code bloat.
The compiler could help a great
deal here by recognizing when a function depends on nothing about
a template parameter, or only on its size, and giving the resulting
function a link-name "equate" that allows it to be shared with other
instantiations. Implementation code could take advantage of the
capability by factoring out code that does not depend on the template
argument into separate functions to be merged by the compiler.

Until such a compiler optimization is implemented, much can be done
manually (if tediously) in this direction. One such optimization is
to derive class templates from non-template classes, and move as much
implementation as possible into the base class. Another is to
partial-specialize certain common instantiations, such as vector<T*>,
to share code for instantiations on all types T. While these
techniques work, they are far from the complete solution that a
compiler improvement would afford.

Overhead: Expensive Language Features
-------------------------------------

The main "expensive" language feature used in the standard library
is exception support, which requires compiling in cleanup code with
static table data to locate it, and linking in library code to use
the table. For small embedded programs the amount of such library
code and table data is assumed by some to be excessive. Under the
"new" ABI this perception is generally exaggerated, although in some
cases it may actually be excessive.

To implement a library which does not use exceptions directly is
not difficult given minor compiler support (to "turn off" exceptions
and ignore exception constructs), and results in no great library
maintenance difficulties. To be precise, given "-fno-exceptions",
the compiler should treat "try" blocks as ordinary blocks, and
"catch" blocks as dead code to ignore or eliminate.
Compiler
support is not strictly necessary, except in the case of "function
try blocks"; otherwise the following macros almost suffice:

  #define throw(X)
  #define try if (true)
  #define catch(X) else if (false)

However, there may be a need to use function try blocks in the
library implementation, and use of macros in this way can make
correct diagnostics impossible. Furthermore, use of this scheme
would require the library to call a function to re-throw exceptions
from a try block. Implementing the above semantics in the compiler
is preferable.

Given the support above (however implemented) it only remains to
replace code that "throws" with a call to a well-documented "handler"
function in a separate compilation unit which may be replaced by
the user. The main source of exceptions that would be difficult
for users to avoid is memory allocation failures, but users can
define their own memory allocation primitives that never throw.
Otherwise, the complete list of such handlers, and which library
functions may call them, would be needed for users to be able to
implement the necessary substitutes. (Fortunately, they have the
source code.)

Opportunities
-------------

The template capabilities of C++ offer enormous opportunities for
optimizing common library operations, well beyond what would be
considered "eliminating overhead". In particular, many operations
done in Glibc with macros that depend on proprietary language
extensions can be implemented in pristine Standard C++. For example,
the chapter 25 algorithms, and even C library functions such as strchr,
can be specialized for the case of static arrays of known (small) size.

Detailed optimization opportunities are identified below where
the component where they would appear is discussed.
Of course new
opportunities will be identified during implementation.

Unimplemented Required Library Features
---------------------------------------

The standard specifies hundreds of components, grouped broadly by
chapter. These are listed in excruciating detail in the CHECKLIST
file.

  17 general
  18 support
  19 diagnostics
  20 utilities
  21 string
  22 locale
  23 containers
  24 iterators
  25 algorithms
  26 numerics
  27 iostreams
  Annex D backward compatibility

Anyone participating in implementation of the library should obtain
a copy of the standard, ISO 14882. People in the U.S. can obtain an
electronic copy for US$18 from ANSI's web site. Those from other
countries should visit http://www.iso.org/ to find out the location
of their country's representation in ISO, in order to know who can
sell them a copy.

The emphasis in the following sections is on unimplemented features
and optimization opportunities.

Chapter 17  General
-------------------

Chapter 17 concerns overall library requirements.

The standard doesn't mention threads. A multi-thread (MT) extension
primarily affects operators new and delete (18), allocator (20),
string (21), locale (22), and iostreams (27). The common underlying
support needed for this is discussed under chapter 20.

The standard requirements on names from the C headers create a
lot of work, mostly done. Names in the C headers must be visible
in the std:: and sometimes the global namespace; the names in the
two scopes must refer to the same object. More stringent is that
Koenig lookup implies that any types specified as defined in std::
really are defined in std::. Names optionally implemented as
macros in C cannot be macros in C++. (An overview may be read at
<http://www.cantrip.org/cheaders.html>).
The scripts "inclosure"
and "mkcshadow", and the directories shadow/ and cshadow/, are the
beginning of an effort to conform in this area.

A correct conforming definition of C header names based on underlying
C library headers, and practical linking of conforming namespaced
customer code with third-party C libraries depends ultimately on
an ABI change, allowing namespaced C type names to be mangled into
type names as if they were global, somewhat as C function names in a
namespace, or C++ global variable names, are left unmangled. Perhaps
another "extern" mode, such as 'extern "C-global"' would be an
appropriate place for such type definitions. Such a type would
affect mangling as follows:

  namespace A {
    struct X {};
    extern "C-global" { // or maybe just 'extern "C"'
      struct Y {};
    };
  }
  void f(A::X*); // mangles to f__FPQ21A1X
  void f(A::Y*); // mangles to f__FP1Y

(It may be that this is really the appropriate semantics for regular
'extern "C"', and 'extern "C-global"', as an extension, would not be
necessary.) This would allow functions declared in non-standard C headers
(and thus fixable by neither us nor users) to link properly with functions
declared using C types defined in properly-namespaced headers. The
problem this solves is that C headers (which C++ programmers do persist
in using) frequently forward-declare C struct tags without including
the header where the type is defined, as in

  struct tm;
  void munge(tm*);

Without some compiler accommodation, munge cannot be called by correct
C++ code using a pointer to a correctly-scoped tm* value.

The current C headers use the preprocessor extension "#include_next",
which the compiler complains about when run "-pedantic".
(Incidentally, it appears that "-fpedantic" is currently ignored,
probably a bug.)
The solution in the C compiler is to use
"-isystem" rather than "-I", but unfortunately in g++ this seems
also to wrap the whole header in an 'extern "C"' block, so it's
unusable for C++ headers. The correct solution appears to be to
allow the various special include-directory options, if not given
an argument, to affect subsequent include-directory options additively,
so that if one said

  -pedantic -iprefix $(prefix) \
  -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
  -iwithprefix -I g++-v3/ext

the compiler would search $(prefix)/g++-v3 and not report
pedantic warnings for files found there, but treat files in
$(prefix)/g++-v3/ext pedantically. (The undocumented semantics
of "-isystem" in g++ stink. Can they be rescinded? If not it
must be replaced with something more rationally behaved.)

All the C headers need the treatment above; in the standard these
headers are mentioned in various clauses. Below, I have only
mentioned those that present interesting implementation issues.

The components identified as "mostly complete", below, have not been
audited for conformance. In many cases where the library passes
conformance tests we have non-conforming extensions that must be
wrapped in #if guards for "pedantic" use, and in some cases renamed
in a conforming way for continued use in the implementation regardless
of conformance flags.

The STL portion of the library still depends on a header
stl/bits/stl_config.h full of #ifdef clauses. This apparatus
should be replaced with autoconf/automake machinery.

The SGI STL defines a type_traits<> template, specialized for
many types in their code including the built-in numeric and
pointer types and some library types, to direct optimizations of
standard functions.
The SGI compiler has been extended to generate
specializations of this template automatically for user types,
so that use of STL templates on user types can take advantage of
these optimizations. Specializations for other, non-STL, types
would make more optimizations possible, but extending the gcc
compiler in the same way would be much better. The next round of
standardization will probably ratify this, though likely with
changes, so it should probably be renamed to place it in the
implementation namespace.

The SGI STL also defines a large number of extensions visible in
standard headers. (Other extensions that appear in separate headers
have been sequestered in subdirectories ext/ and backward/.) All
these extensions should be moved to other headers where possible,
and in any case wrapped in a namespace (not std!), and (where kept
in a standard header) girded about with macro guards. Some cannot be
moved out of standard headers because they are used to implement
standard features. The canonical method for accommodating these
is to use a protected name, aliased in macro guards to a user-space
name. Unfortunately C++ offers no satisfactory template typedef
mechanism, so very ad-hoc and unsatisfactory aliasing must be used
instead.

Implementation of a template typedef mechanism should have the highest
priority among possible extensions, on the same level as implementation
of the template "export" feature.

Chapter 18  Language support
----------------------------

Headers: <limits> <new> <typeinfo> <exception>
C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
           <ctime> <csignal> <cstdlib> (also 21, 25, 26)

This defines the built-in exceptions, rtti, numeric_limits<>,
operator new and delete. Much of this is provided by the
compiler in its static runtime library.

Work to do includes defining numeric_limits<> specializations in
separate files for all target architectures. Values for integer types
except for bool and wchar_t are readily obtained from the C header
<limits.h>, but values for the remaining numeric types (bool, wchar_t,
float, double, long double) must be entered manually. This is
largely dog work except for those members whose values are not
easily deduced from available documentation. Also, this involves
some work in target configuration to identify the correct choice of
file to build against and to install.

The definitions of the various operators new and delete must be
made thread-safe, which depends on a portable exclusion mechanism,
discussed under chapter 20. Of course there is always plenty of
room for improvements to the speed of operators new and delete.

<cstdarg>, in Glibc, defines some macros that gcc does not allow to
be wrapped into an inline function. Probably this header will demand
attention whenever a new target is chosen. The functions atexit(),
exit(), and abort() in cstdlib have different semantics in C++, so
must be re-implemented for C++.

Chapter 19  Diagnostics
-----------------------

Headers: <stdexcept>
C headers: <cassert> <cerrno>

This defines the standard exception objects, which are "mostly complete".
Cygnus has a version, and now SGI provides a slightly different one.
It makes little difference which we use.

The C global name "errno", which C allows to be a variable or a macro,
is required in C++ to be a macro. For MT it must typically result in
a function call.

Chapter 20  Utilities
---------------------
Headers: <utility> <functional> <memory>
C header: <ctime> (also in 18)

SGI STL provides "mostly complete" versions of all the components
defined in this chapter.
However, the auto_ptr<> implementation
is known to be wrong. Furthermore, the standard definition of it
is known to be unimplementable as written. A minor change to the
standard would fix it, and auto_ptr<> should be adjusted to match.

Multi-threading affects the allocator implementation, and there must
be configuration/installation choices for different users' MT
requirements. Anyway, users will want to tune allocator options
to support different target conditions, MT or no.

The primitives used for MT implementation should be exposed, as an
extension, for users' own work. We need cross-CPU "mutex" support,
multi-processor shared-memory atomic integer operations, and
single-processor uninterruptible integer operations, and all three
configurable to be stubbed out for non-MT use, or to use an
appropriately-loaded dynamic library for the actual runtime
environment, or statically compiled in for cases where the target
architecture is known.

Chapter 21  String
------------------
Headers: <string>
C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
           <cstdlib> (also in 18, 25, 26)

We have "mostly-complete" char_traits<> implementations. Many of the
char_traits<char> operations might be optimized further using existing
proprietary language extensions.

We have a "mostly-complete" basic_string<> implementation. The work
to manually instantiate char and wchar_t specializations in object
files to improve link-time behavior is extremely unsatisfactory,
literally tripling library-build time with no commensurate improvement
in static program link sizes. It must be redone. (Similar work is
needed for some components in clauses 22 and 27.)

Other work needed for strings is MT-safety, as discussed under the
chapter 20 heading.

The standard C type mbstate_t from <cwchar> and used in char_traits<>
must be different in C++ than in C, because in C++ the default constructor
value mbstate_t() must be the "base" or "ground" sequence state.
(According to the likely resolution of a recently raised Core issue,
this may become unnecessary. However, there are other reasons to
use a state type not as limited as whatever the C library provides.)
If we might want to provide conversions from (e.g.)
internally-represented EUC-wide to externally-represented Unicode, or
vice-versa, the mbstate_t we choose will need to be more accommodating
than what might be provided by an underlying C library.

There remain some basic_string template-member functions which do
not overload properly with their non-template brethren. The infamous
hack akin to what was done in vector<> is needed, to conform to
23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
or incomplete, are so marked for this reason.

Replacing the string iterators, which currently are simple character
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked. The current use of
raw pointers as string iterators is evil. vector<> iterators need the
same treatment. Note that the current implementation freely mixes
pointers and iterators, and that must be fixed before safer iterators
can be introduced.

Some of the functions in <cstring> are different from the C version,
generally overloaded on const and non-const argument pointers. For
example, in <cstring> strchr is overloaded. The functions isupper
etc. in <cctype>, typically implemented as macros in C, are functions
in C++, because they are overloaded with others of the same name
defined in <locale>.
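
The const/non-const overloading described above for strchr can be
sketched as follows. This is an illustration only, not the library's
code: the namespace "sketch" and the name find_char are hypothetical,
chosen so as not to collide with the real <cstring> declarations.

```cpp
#include <cassert>

namespace sketch {

// Overload for read-only strings: the result is read-only too.
// Like strchr, searching for '\0' finds the terminator.
inline const char* find_char(const char* s, int c) {
    const char target = static_cast<char>(c);
    for (;; ++s) {
        if (*s == target) return s;
        if (*s == '\0') return nullptr;
    }
}

// Overload for writable strings: same search, writable result.
// The const_cast is safe because the caller passed a non-const pointer.
inline char* find_char(char* s, int c) {
    return const_cast<char*>(find_char(static_cast<const char*>(s), c));
}

}  // namespace sketch
```

With the pair of overloads, a search in a const string cannot hand
back a writable pointer, which the single C prototype allows.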

Many of the functions required in <cwctype> and <cwchar> cannot be
implemented using underlying C facilities on intended targets because
such facilities only partly exist.

Chapter 22  Locale
------------------
Headers: <locale>
C headers: <clocale>

We have a "mostly complete" class locale, with the exception of
code for constructing, and handling the names of, named locales.
The ways that locales are named (particularly when categories
(e.g. LC_TIME, LC_COLLATE) are different) vary among all target
environments. This code must be written in various versions and
chosen by configuration parameters.

Members of many of the facets defined in <locale> are stubs. Generally,
there are two sets of facets: the base class facets (which are supposed
to implement the "C" locale) and the "byname" facets, which are supposed
to read files to determine their behavior. The base ctype<>, collate<>,
and numpunct<> facets are "mostly complete", except that the table of
bitmask values used for "is" operations, and corresponding mask values,
are still defined in libio and just included/linked. (We will need to
implement these tables independently, soon, but should take advantage
of libio where possible.) The num_put<>::put members for integer types
are "mostly complete".

A complete list of what has and has not been implemented may be
found in CHECKLIST. However, note that the current definition of
codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
out the raw bytes representing the wide characters, rather than
trying to convert each to a corresponding single "char" value.

Some of the facets are more important than others.
Specifically,
the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
are used by other library facilities defined in <string>, <istream>,
and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
in <fstream>, so a conforming iostream implementation depends on
these.

The "long long" type eventually must be supported, but code mentioning
it should be wrapped in #if guards to allow pedantic-mode compiling.

Performance of num_put<> and num_get<> depends critically on
caching computed values in ios_base objects, and on extensions
to the interface with streambufs.

Specifically: retrieving a copy of the locale object, extracting
the needed facets, and gathering data from them, for each call to
(e.g.) operator<< would be prohibitively slow. To cache format
data for use by num_put<> and num_get<> we have a _Format_cache<>
object stored in the ios_base::pword() array. This is constructed
and initialized lazily, and is organized purely for utility. It
is discarded when a new locale with different facets is imbued.

Using only the public interfaces of the iterator arguments to the
facet functions would limit performance by forbidding "vector-style"
character operations. The streambuf iterator optimizations are
described under chapter 24, but facets can also bypass the streambuf
iterators via explicit specializations and operate directly on the
streambufs, and use extended interfaces to get direct access to the
streambuf internal buffer arrays. These extensions are mentioned
under chapter 27. These optimizations are particularly important
for input parsing.

Unused virtual members of locale facets can be omitted, as mentioned
above, by a smart linker.
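
The lazily built format cache described earlier in this chapter can be
sketched using only standard ios_base facilities (xalloc, pword, iword,
register_callback). This is not the library's _Format_cache<>; the
names format_cache, cache_index, cache_callback and get_cache are
hypothetical, and as a simplifying assumption the cache is dropped on
every imbue rather than only when the facets differ.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical cache of the numpunct<char> data a formatter needs.
struct format_cache {
    char decimal_point;
    char thousands_sep;
    std::string grouping;
};

// One pword()/iword() slot index shared by all streams.
inline int cache_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Stream callback: drop the cache whenever its data may be stale.
inline void cache_callback(std::ios_base::event ev, std::ios_base& ios,
                           int index) {
    if (ev == std::ios_base::copyfmt_event) {
        // copyfmt() copied the source stream's pointer; do not share it.
        ios.pword(index) = nullptr;
    } else {
        // imbue_event or erase_event: free the cache this stream owns.
        void*& slot = ios.pword(index);
        delete static_cast<format_cache*>(slot);
        slot = nullptr;
    }
}

// Return the stream's cache, building it on first use.
inline const format_cache& get_cache(std::ios_base& ios) {
    const int index = cache_index();
    if (ios.pword(index) == nullptr) {
        if (ios.iword(index) == 0) {  // register the callback only once
            ios.register_callback(&cache_callback, index);
            ios.iword(index) = 1;
        }
        const std::numpunct<char>& np =
            std::use_facet<std::numpunct<char> >(ios.getloc());
        format_cache* cache = new format_cache;
        cache->decimal_point = np.decimal_point();
        cache->thousands_sep = np.thousands_sep();
        cache->grouping = np.grouping();
        ios.pword(index) = cache;
    }
    return *static_cast<const format_cache*>(ios.pword(index));
}

}  // namespace sketch
```

After the first call, an inserter pays for one pword() lookup per
operation instead of a locale copy and several use_facet calls.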

Chapter 23  Containers
----------------------
Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>

All the components in chapter 23 are implemented in the SGI STL.
They are "mostly complete"; they include a large number of
nonconforming extensions which must be wrapped. Some of these
are used internally and must be renamed or duplicated.

The SGI components are optimized for large-memory environments. For
embedded targets, different criteria might be more appropriate. Users
will want to be able to tune this behavior. We should provide
ways for users to compile the library with different memory usage
characteristics.

A lot more work is needed on factoring out common code from different
specializations to reduce code size here and in chapter 25. The
easiest fix for this would be a compiler/ABI improvement that allows
the compiler to recognize when a specialization depends only on the
size (or other gross quality) of a template argument, and allow the
linker to share the code with similar specializations. In its
absence, many of the algorithms and containers can be
partial-specialized, at least for the case of pointers, but this only
solves a small part of the problem. Use of a type_traits-style template
allows a few more optimization opportunities, more if the compiler
can generate the specializations automatically.

As an optimization, containers can specialize on the default allocator
and bypass it, or take advantage of details of its implementation
after it has been improved upon.

Replacing the vector iterators, which currently are simple element
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked.
The current use of
pointers for iterators is evil.

As mentioned for chapter 24, the deque iterator is a good example of
an opportunity to implement a "staged" iterator that would benefit
from specializations of some algorithms.

Chapter 24  Iterators
---------------------
Headers: <iterator>

Standard iterators are "mostly complete", with the exception of
the stream iterators, which are not yet templatized on the
stream type. Also, the base class template iterator<> appears
to be wrong, so everything derived from it must also be wrong,
currently.

The streambuf iterators (currently located in stl/bits/std_iterator.h,
but should be under bits/) can be rewritten to take advantage of
friendship with the streambuf implementation.

Matt Austern has identified opportunities where certain iterator
types, particularly including streambuf iterators and deque
iterators, have a "two-stage" quality, such that an intermediate
limit can be checked much more quickly than the true limit on
range operations. If identified with a member of iterator_traits,
algorithms may be specialized for this case. Of course the
iterators that have this quality can be identified by specializing
a traits class.

Many of the algorithms must be specialized for the streambuf
iterators, to take advantage of block-mode operations, in order
to allow iostream/locale operations' performance not to suffer.
It may be that they could be treated as staged iterators and
take advantage of those optimizations.

Chapter 25  Algorithms
----------------------
Headers: <algorithm>
C headers: <cstdlib> (also in 18, 21, 26)

The algorithms are "mostly complete". As mentioned above, they
are optimized for speed at the expense of code and data size.

Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.

The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.

Chapter 26  Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also 18, 21, 25)

Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate. It
is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.

The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.
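
As a rough illustration of how small such a re-implementation is, here
is a minimal sketch. The namespace "sketch" and the types div_result
and ldiv_result are hypothetical stand-ins for the namespaced div_t and
ldiv_t. It assumes C++11 or later, where integer division is defined to
truncate toward zero, as C requires of div().

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-ins for the namespaced div_t and ldiv_t types.
struct div_result  { int  quot; int  rem; };
struct ldiv_result { long quot; long rem; };

// Quotient and remainder computed directly; no C library call needed.
inline div_result div(int numer, int denom) {
    div_result r;
    r.quot = numer / denom;
    r.rem  = numer % denom;
    return r;
}

// Overload for long, mirroring the div(long, long) overload that
// C++ adds in <cstdlib>.
inline ldiv_result div(long numer, long denom) {
    ldiv_result r;
    r.quot = numer / denom;
    r.rem  = numer % denom;
    return r;
}

// ldiv() simply forwards to the long overload.
inline ldiv_result ldiv(long numer, long denom) {
    return div(numer, denom);
}

}  // namespace sketch
```

As with the C functions, a zero denominator is undefined behavior here.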

    Chapter 27  Iostreams
    ---------------------
    Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
    <iomanip> <sstream> <fstream>
    C headers: <cstdio> <cwchar> (also in 21)

    Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
    ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
    basic_ostream<> are well along, but basic_istream<> has had little work
    done. The standard stream objects, <sstream>, and <fstream> have been
    started; basic_filebuf<> "write" functions have been implemented just
    enough to do "hello, world".

    Most of the istream and ostream operators << and >> (with the exception
    of the op<<(integer) ones) have not been changed to use locale primitives,
    sentry objects, or char_traits members.

    All these templates should be manually instantiated for char and
    wchar_t in a way that links only used members into user programs.

    Streambuf is fertile ground for optimization extensions. An extended
    interface giving iterator access to its internal buffer would be very
    useful for other library components.

    Iostream operations (primarily operators << and >>) can take advantage
    of the case where user code has not specified a locale, and bypass locale
    operations entirely. The current implementation of op<</num_put<>::put,
    for the integer types, demonstrates how they can cache encoding details
    from the locale on each operation. There is lots more room for
    optimization in this area.

    The definition of the relationship between the standard streams
    cout et al. and stdout et al. requires something like a "stdiobuf".
    The SGI solution of using double-indirection to actually use a
    stdio FILE object for buffering is unsatisfactory, because it
    interferes with peephole loop optimizations.

    The <sstream> header work has begun.
    stringbuf can benefit from
    friendship with basic_string<> and basic_string<>::_Rep to use
    those objects directly as buffers, and avoid allocating and making
    copies.

    The basic_filebuf<> template is a complex beast. It is specified to
    use the locale facet codecvt<> to translate characters between native
    files and the locale character encoding. In general this involves
    two buffers, one of "char" representing the file and another of
    "char_type", for the stream, with codecvt<> translating. The process
    is complicated by the variable-length nature of the translation, and
    the need to seek to corresponding places in the two representations.
    For the case of basic_filebuf<char>, when no translation is needed,
    a single buffer suffices. A specialized filebuf can be used to reduce
    code space overhead when no locale has been imbued. Matt Austern's
    work at SGI will be useful, perhaps directly as a source of code, or
    at least as an example to draw on.

    Filebuf, almost uniquely (cf. operator new), depends heavily on
    underlying environmental facilities. In current releases iostream
    depends fairly heavily on libio constant definitions, but it should
    be made independent. It also depends on operating system primitives
    for file operations. There is immense room for optimizations using
    (e.g.) mmap for reading. The shadow/ directory wraps, besides the
    standard C headers, the libio.h and unistd.h headers, for use mainly
    by filebuf. These wrappings have not been completed, though there
    is scaffolding in place.

    The encapsulation of certain C header <cstdio> names presents an
    interesting problem. It is possible to define an inline std::fprintf()
    implemented in terms of the 'extern "C"' vfprintf(), but there is no
    standard vfscanf() to use to implement std::fscanf().
    It appears that
    vfscanf must be re-implemented in C++ for targets where no vfscanf
    extension has been defined. This is interesting in that it seems
    to be the only significant case in the C library where this kind of
    rewriting is necessary. (Of course Glibc provides the vfscanf()
    extension.) (The functions related to exit() must be rewritten
    for other reasons.)


    Annex D
    -------
    Headers: <strstream>

    Annex D defines many non-library features, and many minor
    modifications to various headers, and a complete header.
    It is "mostly done", except that the libstdc++-2 <strstream>
    header has not been adopted into the library, or checked to
    verify that it matches the draft in those details that were
    clarified by the committee. Certainly it must at least be
    moved into the std namespace.

    We still need to wrap all the deprecated features in #if guards
    so that pedantic compile modes can detect their use.

    Nonstandard Extensions
    ----------------------
    Headers: <iostream.h> <strstream.h> <hash> <rbtree>
    <pthread_alloc> <stdiobuf> (etc.)

    User code has come to depend on a variety of nonstandard components
    that we must not omit. Much of this code can be adopted from
    libstdc++-v2 or from the SGI STL. This particularly includes
    <iostream.h>, <strstream.h>, and various SGI extensions such
    as <hash_map.h>. Many of these are already placed in the
    subdirectories ext/ and backward/. (Note that it is better to
    include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
    to search the subdirectory itself via a "-I" directive.)
  </literallayout>
</section>

</appendix>