<appendix xmlns="http://docbook.org/ns/docbook" version="5.0"
          xml:id="appendix.contrib" xreflabel="Contributing">
<?dbhtml filename="appendix_contributing.html"?>

<info><title>
  Contributing
  <indexterm>
    <primary>Appendix</primary>
    <secondary>Contributing</secondary>
  </indexterm>
</title>
  <keywordset>
    <keyword>ISO C++</keyword>
    <keyword>library</keyword>
  </keywordset>
</info>



<para>
  The GNU C++ Library is part of GCC and follows the same development model,
  so the general rules for
  <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html">contributing
  to GCC</link> apply. Active
  contributors are assigned maintainership responsibility, and given
  write access to the source repository. First-time contributors
  should follow this procedure:
</para>

<section xml:id="contrib.list" xreflabel="Contributor Checklist"><info><title>Contributor Checklist</title></info>


  <section xml:id="list.reading"><info><title>Reading</title></info>


    <itemizedlist>
      <listitem>
        <para>
          Get and read the relevant sections of the C++ language
          specification. Copies of the full ISO 14882 standard are
          available online via the ISO mirror site for committee
          members. Non-members, or those who have not paid for the
          privilege of sitting on the committee and sustained their
          two-meeting commitment for voting rights, may get a copy of
          the standard from their respective national standards
          organization. In the USA, this national standards
          organization is
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.ansi.org">ANSI</link>.
          (And if you've already registered with them you can
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://webstore.ansi.org/RecordDetail.aspx?sku=INCITS%2fISO%2fIEC+14882-2012">buy the standard online</link>.)
        </para>
      </listitem>

      <listitem>
        <para>
          The library working group bugs, and known defects, can
          be obtained here:
          <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</link>
        </para>
      </listitem>

      <listitem>
        <para>
          Peruse
          the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/">GNU
          Coding Standards</link>, and chuckle when you hit the part
          about <quote>Using Languages Other Than C</quote>.
        </para>
      </listitem>

      <listitem>
        <para>
          Be familiar with the extensions that preceded these
          general GNU rules. These style issues for libstdc++ can be
          found in <link linkend="contrib.coding_style">Coding Style</link>.
        </para>
      </listitem>

      <listitem>
        <para>
          And last but certainly not least, read the
          library-specific information found in
          <link linkend="appendix.porting">Porting and Maintenance</link>.
        </para>
      </listitem>
    </itemizedlist>

  </section>
  <section xml:id="list.copyright"><info><title>Assignment</title></info>

    <para>
      See the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html#legal">legal prerequisites</link> for all GCC contributions.
    </para>

    <para>
      Historically, the libstdc++ assignment form added the following
      question:
    </para>

    <para>
      <quote>
        Which Belgian comic book character is better, Tintin or Asterix, and
        why?
      </quote>
    </para>

    <para>
      While not strictly necessary, humoring the maintainers and answering
      this question would be appreciated.
    </para>

    <para>
      Please contact
      Paolo Carlini at <email>paolo.carlini@oracle.com</email>
      or
      Jonathan Wakely at <email>jwakely+assign@redhat.com</email>
      if you are confused about the assignment or have general licensing
      questions.
      When requesting an assignment form from
      <email>assign@gnu.org</email>, please CC the libstdc++
      maintainers above so that progress can be monitored.
    </para>
  </section>

  <section xml:id="list.getting"><info><title>Getting Sources</title></info>

    <para>
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/svnwrite.html">Getting write access
        (look for "Write after approval")</link>
    </para>
  </section>

  <section xml:id="list.patches"><info><title>Submitting Patches</title></info>


    <para>
      Every patch must have several pieces of information before it can be
      properly evaluated. Ideally (and to ensure the fastest possible
      response from the maintainers) it would have all of these pieces:
    </para>

    <itemizedlist>
      <listitem>
        <para>
          A description of the bug and how your patch fixes this
          bug. For new features a description of the feature and your
          implementation.
        </para>
      </listitem>

      <listitem>
        <para>
          A ChangeLog entry as plain text; see the various
          ChangeLog files for format and content. If you are
          using Emacs as your editor, simply position the insertion
          point at the beginning of your change and hit C-x 4 a to bring
          up the appropriate ChangeLog entry. See--magic! Similar
          functionality also exists for vi.
        </para>
      </listitem>

      <listitem>
        <para>
          A testsuite submission or sample program that will
          easily and simply show the existing error or test new
          functionality.
        </para>
      </listitem>

      <listitem>
        <para>
          The patch itself. If you are accessing the SVN
          repository use <command>svn update; svn diff NEW</command>;
          else, use <command>diff -cp OLD NEW</command> ... If your
          version of diff does not support these options, then get the
          latest version of GNU diff.
          The <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/wiki/SvnTricks">SVN
          Tricks</link> wiki page has information on customising the
          output of <code>svn diff</code>.
        </para>
      </listitem>

      <listitem>
        <para>
          When you have all these pieces, bundle them up in a
          mail message and send it to libstdc++@gcc.gnu.org. All
          patches and related discussion should be sent to the
          libstdc++ mailing list. In common with the rest of GCC,
          patches should also be sent to the gcc-patches mailing list.
        </para>
      </listitem>
    </itemizedlist>

  </section>

</section>

<section xml:id="contrib.organization" xreflabel="Source Organization"><info><title>Directory Layout and Source Conventions</title></info>
  <?dbhtml filename="source_organization.html"?>


  <para>
    The <filename class="directory">libstdc++-v3</filename> directory in the
    GCC sources contains the files needed to create the GNU C++ Library.
  </para>

<para>
It has subdirectories:
</para>

<variablelist>
  <varlistentry>
    <term><filename class="directory">doc</filename></term>
    <listitem>
      Files in HTML and text format that document usage, quirks of the
      implementation, and contributor checklists.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">include</filename></term>
    <listitem>
      All header files for the C++ library are within this directory,
      modulo specific runtime-related files that are in the libsupc++
      directory.

      <variablelist>
        <varlistentry>
          <term><filename class="directory">include/std</filename></term>
          <listitem>
            Files meant to be found by <code>#include <name></code> directives
            in standard-conforming user programs.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c</filename></term>
          <listitem>
            Headers intended to directly include standard C headers.
            [NB: this can be enabled via <option>--enable-cheaders=c</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c_global</filename></term>
          <listitem>
            Headers intended to include standard C headers in
            the global namespace, and put select names into the <code>std::</code>
            namespace. [NB: this is the default, and is the same as
            <option>--enable-cheaders=c_global</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/c_std</filename></term>
          <listitem>
            Headers intended to include standard C headers
            already in namespace std, and put select names into the <code>std::</code>
            namespace. [NB: this is the same as
            <option>--enable-cheaders=c_std</option>]
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/bits</filename></term>
          <listitem>
            Files included by standard headers and by other files in
            the bits directory.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/backward</filename></term>
          <listitem>
            Headers provided for backward compatibility, such as
            <filename class="headerfile"><backward/hash_map></filename>.
            They are not used in this library.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">include/ext</filename></term>
          <listitem>
            Headers that define extensions to the standard library. No
            standard header refers to any of them, in theory (there are some
            exceptions).
          </listitem>
        </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">scripts</filename></term>
    <listitem>
      Scripts that are used during the configure, build, make, or test
      process.
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">src</filename></term>
    <listitem>
      Files that are used in constructing the library, but are not
      installed.

      <variablelist>
        <varlistentry>
          <term><filename class="directory">src/c++98</filename></term>
          <listitem>
            Source files compiled using <option>-std=gnu++98</option>.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/c++11</filename></term>
          <listitem>
            Source files compiled using <option>-std=gnu++11</option>.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/filesystem</filename></term>
          <listitem>
            Source files for the Filesystem TS.
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><filename class="directory">src/shared</filename></term>
          <listitem>
            Source code included by other files under both
            <filename class="directory">src/c++98</filename> and
            <filename class="directory">src/c++11</filename>.
          </listitem>
        </varlistentry>
      </variablelist>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term><filename class="directory">testsuite/[backward, demangle, ext, performance, thread, 17_* to 30_*]</filename></term>
    <listitem>
      Test programs are here, and may be used to begin to exercise the
      library. Support for "make check" and "make check-install" is
      complete, and runs through all the subdirectories here when this
      command is issued from the build directory.
      Please note that
      "make check" requires DejaGnu 1.4 or later to be installed,
      or for extra <link linkend="test.run.permutations">permutations</link>
      DejaGnu 1.5.3 or later.
    </listitem>
  </varlistentry>
</variablelist>

<para>
Other subdirectories contain variant versions of certain files
that are meant to be copied or linked by the configure script.
Currently these are:
<literallayout><filename class="directory">config/abi</filename>
<filename class="directory">config/allocator</filename>
<filename class="directory">config/cpu</filename>
<filename class="directory">config/io</filename>
<filename class="directory">config/locale</filename>
<filename class="directory">config/os</filename>
</literallayout>
</para>

<para>
In addition, a subdirectory holds the convenience library libsupc++.
</para>

<variablelist>
<varlistentry>
  <term><filename class="directory">libsupc++</filename></term>
  <listitem>
    Contains the runtime library for C++, including exception
    handling and memory allocation and deallocation, RTTI, terminate
    handlers, etc.
  </listitem>
</varlistentry>
</variablelist>

<para>
Note that glibc also has a <filename class="directory">bits/</filename>
subdirectory. We need to be careful not to collide with names in its
<filename class="directory">bits/</filename> directory. For example
<filename class="headerfile"><bits/std_mutex.h></filename> has to be
renamed from <filename class="headerfile"><bits/mutex.h></filename>.
Another solution would be to rename <filename class="directory">bits</filename>
to (e.g.) <filename class="directory">cppbits</filename>.
</para>

<para>
In files throughout the system, lines marked with an "XXX" indicate
a bug or incompletely-implemented feature. Lines marked "XXX MT"
indicate a place that may require attention for multi-thread safety.
</para>

</section>

<section xml:id="contrib.coding_style" xreflabel="Coding Style"><info><title>Coding Style</title></info>
  <?dbhtml filename="source_code_style.html"?>

  <para>
  </para>
  <section xml:id="coding_style.bad_identifiers"><info><title>Bad Identifiers</title></info>

    <para>
      Identifiers that conflict and should be avoided.
    </para>

    <literallayout class="normal">
      This is the list of names <quote>reserved to the
      implementation</quote> that have been claimed by certain
      compilers and system headers of interest, and should not be used
      in the library. It will grow, of course. We generally are
      interested in names that are not all-caps, except for those like
      "_T"

      For Solaris:
      _B
      _C
      _L
      _N
      _P
      _S
      _U
      _X
      _E1
      ..
      _E24

      Irix adds:
      _A
      _G

      MS adds:
      _T

      BSD adds:
      __used
      __unused
      __inline
      _Complex
      __istype
      __maskrune
      __tolower
      __toupper
      __wchar_t
      __wint_t
      _res
      _res_ext
      __tg_*

      SPU adds:
      __ea

      For GCC:

      [Note that this list is out of date. It applies to the old
      name-mangling; in G++ 3.0 and higher a different name-mangling is
      used. In addition, many of the bugs relating to G++ interpreting
      these names as operators have been fixed.]

      The full set of __* identifiers (combined from gcc/cp/lex.c and
      gcc/cplus-dem.c) that are either old or new, but are definitely
      recognized by the demangler, is:

      __aa
      __aad
      __ad
      __addr
      __adv
      __aer
      __als
      __alshift
      __amd
      __ami
      __aml
      __amu
      __aor
      __apl
      __array
      __ars
      __arshift
      __as
      __bit_and
      __bit_ior
      __bit_not
      __bit_xor
      __call
      __cl
      __cm
      __cn
      __co
      __component
      __compound
      __cond
      __convert
      __delete
      __dl
      __dv
      __eq
      __er
      __ge
      __gt
      __indirect
      __le
      __ls
      __lt
      __max
      __md
      __method_call
      __mi
      __min
      __minus
      __ml
      __mm
      __mn
      __mult
      __mx
      __ne
      __negate
      __new
      __nop
      __nt
      __nw
      __oo
      __op
      __or
      __pl
      __plus
      __postdecrement
      __postincrement
      __pp
      __pt
      __rf
      __rm
      __rs
      __sz
      __trunc_div
      __trunc_mod
      __truth_andif
      __truth_not
      __truth_orif
      __vc
      __vd
      __vn

      SGI badnames:
      __builtin_alloca
      __builtin_fsqrt
      __builtin_sqrt
      __builtin_fabs
      __builtin_dabs
      __builtin_cast_f2i
      __builtin_cast_i2f
      __builtin_cast_d2ll
      __builtin_cast_ll2d
      __builtin_copy_dhi2i
      __builtin_copy_i2dhi
      __builtin_copy_dlo2i
      __builtin_copy_i2dlo
      __add_and_fetch
      __sub_and_fetch
      __or_and_fetch
      __xor_and_fetch
      __and_and_fetch
      __nand_and_fetch
      __mpy_and_fetch
      __min_and_fetch
      __max_and_fetch
      __fetch_and_add
      __fetch_and_sub
      __fetch_and_or
      __fetch_and_xor
      __fetch_and_and
      __fetch_and_nand
      __fetch_and_mpy
      __fetch_and_min
      __fetch_and_max
      __lock_test_and_set
      __lock_release
      __lock_acquire
      __compare_and_swap
      __synchronize
      __high_multiply
      __unix
      __sgi
      __linux__
      __i386__
      __i486__
      __cplusplus
      __embedded_cplusplus
      // long double conversion members
      mangled as __opr
      // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
      __opr
    </literallayout>
  </section>

  <section xml:id="coding_style.example"><info><title>By Example</title></info>

    <literallayout class="normal">
      This library is written to appropriate C++ coding standards. As such,
      it is intended to precede the recommendations of the GNU Coding
      Standard, which can be referenced in full here:

      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/standards.html#Formatting">http://www.gnu.org/prep/standards/standards.html#Formatting</link>

      The rest of this is also interesting reading, but skip the "Design
      Advice" part.

      The GCC coding conventions are here, and are also useful:
      <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/codingconventions.html">http://gcc.gnu.org/codingconventions.html</link>

      In addition, because it doesn't seem to be stated explicitly anywhere
      else, there is an 80 column source limit.

      <filename>ChangeLog</filename> entries for member functions should use the
      classname::member function name syntax as follows:

<code>
1999-04-15  Dennis Ritchie  <dr@att.com>

        * src/basic_file.cc (__basic_file::open): Fix thinko in
        _G_HAVE_IO_FILE_OPEN bits.
</code>

      Notable areas of divergence from what may be previous local practice
      (particularly for GNU C) include:

      01. Pointers and references
      <code>
        char* p = "flop";
        char& c = *p;
          -NOT-
        char *p = "flop";  // wrong
        char &c = *p;      // wrong
      </code>

      Reason: In C++, definitions are mixed with executable code. Here,
      <code>p</code> is being initialized, not <code>*p</code>. This is near-universal
      practice among C++ programmers; it is normal for C hackers
      to switch spontaneously as they gain experience.

      02.
      Operator names and parentheses
      <code>
        operator==(type)
          -NOT-
        operator == (type)  // wrong
      </code>

      Reason: The <code>==</code> is part of the function name. Separating
      it makes the declaration look like an expression.

      03. Function names and parentheses
      <code>
        void mangle()
          -NOT-
        void mangle ()  // wrong
      </code>

      Reason: no space before parentheses (except after a control-flow
      keyword) is near-universal practice for C++. It identifies the
      parentheses as the function-call operator or declarator, as
      opposed to an expression or other overloaded use of parentheses.

      04. Template function indentation
      <code>
        template<typename T>
          void
          template_function(args)
          { }
          -NOT-
        template<class T>
        void template_function(args) {};
      </code>

      Reason: In class definitions, without indentation whitespace is
      needed both above and below the declaration to distinguish
      it visually from other members. (Also, re: "typename"
      rather than "class".) <code>T</code> often could be <code>int</code>, which is
      not a class. ("class", here, is an anachronism.)

      05. Template class indentation
      <code>
        template<typename _CharT, typename _Traits>
          class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
        class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
          class basic_ios : public ios_base
        {
          public:
            // Types:
        };
      </code>

      06. Enumerators
      <code>
        enum
        {
          space = _ISspace,
          print = _ISprint,
          cntrl = _IScntrl
        };
          -NOT-
        enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
      </code>

      07. Member initialization lists
      All one line, separate from class name.

      <code>
        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
          -NOT-
        gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
      </code>

      08. Try/Catch blocks
      <code>
        try
          {
            //
          }
        catch (...)
          {
            //
          }
          -NOT-
        try {
          //
        } catch(...) {
          //
        }
      </code>

      09. Member function declarations and definitions
      Keywords such as extern, static, export, explicit, inline, etc.
      go on the line above the function name. Thus

      <code>
        virtual int
        foo()
          -NOT-
        virtual int foo()
      </code>

      Reason: GNU coding conventions dictate that return types for functions
      are on a separate line from the function name and parameter list
      for definitions. For C++, where we have member functions that can
      be either inline definitions or declarations, keeping to this
      standard allows all member function names for a given class to be
      aligned to the same margin, increasing readability.


      10. Invocation of member functions with "this->"
      For non-uglified names, use <code>this->name</code> to call the function.

      <code>
        this->sync()
          -NOT-
        sync()
      </code>

      Reason: Koenig lookup.

      11. Namespaces
      <code>
        namespace std
        {
          blah blah blah;
        } // namespace std

          -NOT-

        namespace std {
          blah blah blah;
        } // namespace std
      </code>

      12. Spacing under protected and private in class declarations:
      space above, none below
      i.e.

      <code>
        public:
          int foo;

          -NOT-
        public:

          int foo;
      </code>

      13. Spacing WRT return statements.
      no extra spacing before returns, no parentheses
      i.e.

      <code>
          }
        return __ret;

          -NOT-
          }

        return __ret;

          -NOT-

          }
        return (__ret);
      </code>


      14. Location of global variables.
      All global variables of class type, whether in the "user visible"
      space (e.g., <code>cin</code>) or the implementation namespace, must be defined
      as a character array with the appropriate alignment and then later
      re-initialized to the correct value.

      This is due to startup issues on certain platforms, such as AIX.
      For more explanation and examples, see <filename>src/globals.cc</filename>. All such
      variables should be contained in that file, for simplicity.

      15. Exception abstractions
      Use the exception abstractions found in <filename class="headerfile">functexcept.h</filename>, which allow
      C++ programmers to use this library with <literal>-fno-exceptions</literal>. (Even if
      that is rarely advisable, it's a necessary evil for backwards
      compatibility.)

      16. Exception error messages
      All start with the name of the function where the exception is
      thrown, and then (optional) descriptive text is added. Example:

      <code>
        __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
      </code>

      Reason: The verbose terminate handler prints out <code>exception::what()</code>,
      as well as the typeinfo for the thrown exception. As this is the
      default terminate handler, by putting location info into the
      exception string, a very useful error message is printed out for
      uncaught exceptions. So useful, in fact, that non-programmers can
      give useful error messages, and programmers can intelligently
      speculate what went wrong without even using a debugger.

      17. The doxygen style guide to comments is a separate document,
      see index.

      The library currently has a mixture of GNU-C and modern C++ coding
      styles. The GNU C usages will be combed out gradually.

      Name patterns:

      For nonstandard names appearing in Standard headers, we are constrained
      to use names that begin with underscores. This is called "uglification".
      The convention is:

      Local and argument names: <literal>__[a-z].*</literal>

      Examples: <code>__count __ix __s1</code>

      Type names and template formal-argument names: <literal>_[A-Z][^_].*</literal>

      Examples: <code>_Helper _CharT _N</code>

      Member data and function names: <literal>_M_.*</literal>

      Examples: <code>_M_num_elements _M_initialize()</code>

      Static data members, constants, and enumerations: <literal>_S_.*</literal>

      Examples: <code>_S_max_elements _S_default_value</code>

      Don't use names in the same scope that differ only in the prefix,
      e.g. _S_top and _M_top. See <link linkend="coding_style.bad_identifiers">BADNAMES</link> for a list of forbidden names.
      (The most tempting of these seem to be "_T" and "__sz".)

      Names must never have "__" internally; it would confuse name
      unmanglers on some targets. Also, never use "__[0-9]", same reason.

      --------------------------

      [BY EXAMPLE]
      <code>

      #ifndef _HEADER_
      #define _HEADER_ 1

      namespace std
      {
        class gribble
        {
        public:
          gribble() throw();

          gribble(const gribble&);

          explicit
          gribble(int __howmany);

          gribble&
          operator=(const gribble&);

          virtual
          ~gribble() throw ();

          // Start with a capital letter, end with a period.
          inline void
          public_member(const char* __arg) const;

          // In-class function definitions should be restricted to one-liners.
          int
          one_line() { return 0; }

          int
          two_lines(const char* __arg)
          { return strchr(__arg, 'a') != 0; }

          inline int
          three_lines();  // inline, but defined below.

          // Note indentation.
          template<typename _Formal_argument>
            void
            public_template() const throw();

          template<typename _Iterator>
            void
            other_template();

        private:
          class _Helper;

          int _M_private_data;
          int _M_more_stuff;
          _Helper* _M_helper;
          int _M_private_function();

          enum _Enum
            {
              _S_one,
              _S_two
            };

          static void
          _S_initialize_library();
        };

        // More-or-less-standard language features described by lack, not presence.
      # ifndef _G_NO_LONGLONG
        extern long long _G_global_with_a_good_long_name;  // avoid globals!
      # endif

        // Avoid in-class inline definitions, define separately;
        // likewise for member class definitions:
        inline void
        gribble::public_member(const char* __arg) const
        { const char* __local = __arg; (void) __local; }

        class gribble::_Helper
        {
          int _M_stuff;

          friend class gribble;
        };
      }

      // Names beginning with "__": only for arguments and
      //   local variables; never use "__" in a type name, or
      //   within any name; never use "__[0-9]".

      #endif /* _HEADER_ */


      namespace std
      {
        template<typename T>  // notice: "typename", not "class", no space
          long_return_value_type<with_many, args>
          function_name(char* pointer,               // "char *pointer" is wrong.
                        char* argument,
                        const Reference& ref)
          {
            // int a_local;  /* wrong; see below. */
            if (test)
            {
              nested code
            }

            int a_local = 0;  // declare variable at first use.

            //  char a, b, *p;   /* wrong */
            char a = 'a';
            char b = a + 1;
            char* c = "abc";  // each variable goes on its own line, always.

            // except maybe here...
            for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1)
              {
                // ...
              }
          }

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

        int
        gribble::three_lines()
        {
          // doesn't fit in one line.
        }
      } // namespace std
      </code>
    </literallayout>
  </section>
</section>

<section xml:id="contrib.design_notes" xreflabel="Design Notes"><info><title>Design Notes</title></info>
  <?dbhtml filename="source_design_notes.html"?>

  <para>
  </para>

  <literallayout class="normal">

    The Library
    -----------

    This paper covers two major areas:

    - Features and policies not mentioned in the standard that
    the quality of the library implementation depends on, including
    extensions and "implementation-defined" features;

    - Plans for required but unimplemented library features and
    optimizations to them.

    Overhead
    --------

    The standard defines a large library, much larger than the standard
    C library. A naive implementation would suffer substantial overhead
    in compile time, executable size, and speed, rendering it unusable
    in many (particularly embedded) applications. The alternative demands
    care in construction, and some compiler support, but there is no
    need for library subsets.

    What are the sources of this overhead?  There are four main causes:

    - The library is specified almost entirely as templates, which
    with current compilers must be included in-line, resulting in
    very slow builds as tens or hundreds of thousands of lines
    of function definitions are read for each user source file.
    Indeed, the entire SGI STL, as well as the dos Reis valarray,
    are provided purely as header files, largely for simplicity in
    porting. Iostream/locale is (or will be) as large again.

    - The library is very flexible, specifying a multitude of hooks
    where users can insert their own code in place of defaults.
    When these hooks are not used, any time and code expended to
    support that flexibility is wasted.

    - Templates are often described as causing "code bloat". In
    practice, this refers (when it refers to anything real) to several
    independent processes. First, when a class template is manually
    instantiated in its entirety, current compilers place the definitions
    for all members in a single object file, so that a program linking
    to one member gets definitions of all. Second, template functions
    which do not actually depend on the template argument are, under
    current compilers, generated anew for each instantiation, rather
    than being shared with other instantiations. Third, some of the
    flexibility mentioned above comes from virtual functions (both in
    regular classes and template classes) which current linkers add
    to the executable file even when they manifestly cannot be called.

    - The library is specified to use a language feature, exceptions,
    which in the current gcc compiler ABI imposes a run time and
    code space cost to handle the possibility of exceptions even when
    they are not used. Under the new ABI (accessed with -fnew-abi),
    there is a space overhead and a small reduction in code efficiency
    resulting from lost optimization opportunities associated with
    non-local branches associated with exceptions.

    What can be done to eliminate this overhead?  A variety of coding
    techniques, and compiler, linker and library improvements and
    extensions may be used, as covered below. Most are not difficult,
    and some are already implemented in varying degrees.

    Overhead: Compilation Time
    --------------------------

    Providing "ready-instantiated" template code in object code archives
    allows us to avoid generating and optimizing template instantiations
    in each compilation unit which uses them.
However, the number of such 1108 instantiations that are useful to provide is limited, and anyway this 1109 is not enough, by itself, to minimize compilation time. In particular, 1110 it does not reduce time spent parsing conforming headers. 1111 1112 Quicker header parsing will depend on library extensions and compiler 1113 improvements. One approach is some variation on the techniques 1114 previously marketed as "pre-compiled headers", now standardized as 1115 support for the "export" keyword. "Exported" template definitions 1116 can be placed (once) in a "repository" -- really just a library, but 1117 of template definitions rather than object code -- to be drawn upon 1118 at link time when an instantiation is needed, rather than placed in 1119 header files to be parsed along with every compilation unit. 1120 1121 Until "export" is implemented we can put some of the lengthy template 1122 definitions in #if guards or alternative headers so that users can skip 1123 over the full definitions when they need only the ready-instantiated 1124 specializations. 1125 1126 To be precise, this means that certain headers which define 1127 templates which users normally use only for certain arguments 1128 can be instrumented to avoid exposing the template definitions 1129 to the compiler unless a macro is defined. For example, in 1130 <string>, we might have: 1131 1132 template <class _CharT, ... > class basic_string { 1133 ... // member declarations 1134 }; 1135 ... // operator declarations 1136 1137 #ifdef _STRICT_ISO_ 1138 # if _G_NO_TEMPLATE_EXPORT 1139 # include <bits/std_locale.h> // headers needed by definitions 1140 # ... 1141 # include <bits/string.tcc> // member and global template definitions. 1142 # endif 1143 #endif 1144 1145 Users who compile without specifying a strict-ISO-conforming flag 1146 would not see many of the template definitions they now see, and rely 1147 instead on ready-instantiated specializations in the library. 
This 1148 technique would be useful for the following substantial components: 1149 string, locale/iostreams, valarray. It would *not* be useful or 1150 usable with the following: containers, algorithms, iterators, 1151 allocator. Since these constitute a large (though decreasing) 1152 fraction of the library, the benefit the technique offers is 1153 limited. 1154 1155 The language specifies the semantics of the "export" keyword, but 1156 the gcc compiler does not yet support it. When it does, problems 1157 with large template inclusions can largely disappear, given some 1158 minor library reorganization, along with the need for the apparatus 1159 described above. 1160 1161 Overhead: Flexibility Cost 1162 -------------------------- 1163 1164 The library offers many places where users can specify operations 1165 to be performed by the library in place of defaults. Sometimes 1166 this seems to require that the library use a more-roundabout, and 1167 possibly slower, way to accomplish the default requirements than 1168 would be used otherwise. 1169 1170 The primary protection against this overhead is thorough compiler 1171 optimization, to crush out layers of inline function interfaces. 1172 Kuck & Associates has demonstrated the practicality of this kind 1173 of optimization. 1174 1175 The second line of defense against this overhead is explicit 1176 specialization. By defining helper function templates, and writing 1177 specialized code for the default case, overhead can be eliminated 1178 for that case without sacrificing flexibility. This takes full 1179 advantage of any ability of the optimizer to crush out degenerate 1180 code. 1181 1182 The library specifies many virtual functions which current linkers 1183 load even when they cannot be called. Some minor improvements to the 1184 compiler and to ld would eliminate any such overhead by simply 1185 omitting virtual functions that the complete program does not call. 
1186 A prototype of this work has already been done. For targets where 1187 GNU ld is not used, a "pre-linker" could do the same job. 1188 1189 The main areas in the standard interface where user flexibility 1190 can result in overhead are: 1191 1192 - Allocators: Containers are specified to use user-definable 1193 allocator types and objects, making tuning for the container 1194 characteristics tricky. 1195 1196 - Locales: the standard specifies locale objects used to implement 1197 iostream operations, involving many virtual functions which use 1198 streambuf iterators. 1199 1200 - Algorithms and containers: these may be instantiated on any type, 1201 frequently duplicating code for identical operations. 1202 1203 - Iostreams and strings: users are permitted to use these on their 1204 own types, and specify the operations the stream must use on these 1205 types. 1206 1207 Note that these sources of overhead are _avoidable_. The techniques 1208 to avoid them are covered below. 1209 1210 Code Bloat 1211 ---------- 1212 1213 In the SGI STL, and in some other headers, many of the templates 1214 are defined "inline" -- either explicitly or by their placement 1215 in class definitions -- which should not be inline. This is a 1216 source of code bloat. Matt had remarked that he was relying on 1217 the compiler to recognize what was too big to benefit from inlining, 1218 and generate it out-of-line automatically. However, this also can 1219 result in code bloat except where the linker can eliminate the extra 1220 copies. 1221 1222 Fixing these cases will require an audit of all inline functions 1223 defined in the library to determine which merit inlining, and moving 1224 the rest out of line. This is an issue mainly in clauses 23, 25, and 1225 27. Of course it can be done incrementally, and we should generally 1226 accept patches that move large functions out of line and into ".tcc" 1227 files, which can later be pulled into a repository. 
Compiler/linker 1228 improvements to recognize very large inline functions and move them 1229 out-of-line, but shared among compilation units, could make this 1230 work unnecessary. 1231 1232 Pre-instantiating template specializations currently produces large 1233 amounts of dead code which bloats statically linked programs. The 1234 current state of the static library, libstdc++.a, is intolerable on 1235 this account, and will fuel further confused speculation about a need 1236 for a library "subset". A compiler improvement that treats each 1237 instantiated function as a separate object file, for linking purposes, 1238 would be one solution to this problem. An alternative would be to 1239 split up the manual instantiation files into dozens upon dozens of 1240 little files, each compiled separately, but an abortive attempt at 1241 this was done for <string> and, though it is far from complete, it 1242 is already a nuisance. A better interim solution (just until we have 1243 "export") is badly needed. 1244 1245 When building a shared library, the current compiler/linker cannot 1246 automatically generate the instantiations needed. This creates a 1247 miserable situation; it means any time something is changed in the 1248 library, before a shared library can be built someone must manually 1249 copy the declarations of all templates that are needed by other parts 1250 of the library to an "instantiation" file, and add it to the build 1251 system to be compiled and linked to the library. This process is 1252 readily automated, and should be automated as soon as possible. 1253 Users building their own shared libraries experience identical 1254 frustrations. 1255 1256 Sharing common aspects of template definitions among instantiations 1257 can radically reduce code bloat. 
The compiler could help a great 1258 deal here by recognizing when a function depends on nothing about 1259 a template parameter, or only on its size, and giving the resulting 1260 function a link-name "equate" that allows it to be shared with other 1261 instantiations. Implementation code could take advantage of the 1262 capability by factoring out code that does not depend on the template 1263 argument into separate functions to be merged by the compiler. 1264 1265 Until such a compiler optimization is implemented, much can be done 1266 manually (if tediously) in this direction. One such optimization is 1267 to derive class templates from non-template classes, and move as much 1268 implementation as possible into the base class. Another is to partial- 1269 specialize certain common instantiations, such as vector<T*>, to share 1270 code for instantiations on all types T. While these techniques work, 1271 they are far from the complete solution that a compiler improvement 1272 would afford. 1273 1274 Overhead: Expensive Language Features 1275 ------------------------------------- 1276 1277 The main "expensive" language feature used in the standard library 1278 is exception support, which requires compiling in cleanup code with 1279 static table data to locate it, and linking in library code to use 1280 the table. For small embedded programs the amount of such library 1281 code and table data is assumed by some to be excessive. Under the 1282 "new" ABI this perception is generally exaggerated, although in some 1283 cases it may actually be excessive. 1284 1285 To implement a library which does not use exceptions directly is 1286 not difficult given minor compiler support (to "turn off" exceptions 1287 and ignore exception constructs), and results in no great library 1288 maintenance difficulties. To be precise, given "-fno-exceptions", 1289 the compiler should treat "try" blocks as ordinary blocks, and 1290 "catch" blocks as dead code to ignore or eliminate. 
Compiler 1291 support is not strictly necessary, except in the case of "function 1292 try blocks"; otherwise the following macros almost suffice: 1293 1294 #define throw(X) 1295 #define try if (true) 1296 #define catch(X) else if (false) 1297 1298 However, there may be a need to use function try blocks in the 1299 library implementation, and use of macros in this way can make 1300 correct diagnostics impossible. Furthermore, use of this scheme 1301 would require the library to call a function to re-throw exceptions 1302 from a try block. Implementing the above semantics in the compiler 1303 is preferable. 1304 1305 Given the support above (however implemented) it only remains to 1306 replace code that "throws" with a call to a well-documented "handler" 1307 function in a separate compilation unit which may be replaced by 1308 the user. The main source of exceptions that would be difficult 1309 for users to avoid is memory allocation failures, but users can 1310 define their own memory allocation primitives that never throw. 1311 Otherwise, the complete list of such handlers, and which library 1312 functions may call them, would be needed for users to be able to 1313 implement the necessary substitutes. (Fortunately, they have the 1314 source code.) 1315 1316 Opportunities 1317 ------------- 1318 1319 The template capabilities of C++ offer enormous opportunities for 1320 optimizing common library operations, well beyond what would be 1321 considered "eliminating overhead". In particular, many operations 1322 done in Glibc with macros that depend on proprietary language 1323 extensions can be implemented in pristine Standard C++. For example, 1324 the chapter 25 algorithms, and even C library functions such as strchr, 1325 can be specialized for the case of static arrays of known (small) size. 1326 1327 Detailed optimization opportunities are identified below where 1328 the component where they would appear is discussed. 
Of course new 1329 opportunities will be identified during implementation. 1330 1331 Unimplemented Required Library Features 1332 --------------------------------------- 1333 1334 The standard specifies hundreds of components, grouped broadly by 1335 chapter. These are listed in excruciating detail in the CHECKLIST 1336 file. 1337 1338 17 general 1339 18 support 1340 19 diagnostics 1341 20 utilities 1342 21 string 1343 22 locale 1344 23 containers 1345 24 iterators 1346 25 algorithms 1347 26 numerics 1348 27 iostreams 1349 Annex D backward compatibility 1350 1351 Anyone participating in implementation of the library should obtain 1352 a copy of the standard, ISO 14882. People in the U.S. can obtain an 1353 electronic copy for US$18 from ANSI's web site. Those from other 1354 countries should visit http://www.iso.org/ to find out the location 1355 of their country's representation in ISO, in order to know who can 1356 sell them a copy. 1357 1358 The emphasis in the following sections is on unimplemented features 1359 and optimization opportunities. 1360 1361 Chapter 17 General 1362 ------------------- 1363 1364 Chapter 17 concerns overall library requirements. 1365 1366 The standard doesn't mention threads. A multi-thread (MT) extension 1367 primarily affects operators new and delete (18), allocator (20), 1368 string (21), locale (22), and iostreams (27). The common underlying 1369 support needed for this is discussed under chapter 20. 1370 1371 The standard requirements on names from the C headers create a 1372 lot of work, mostly done. Names in the C headers must be visible 1373 in the std:: and sometimes the global namespace; the names in the 1374 two scopes must refer to the same object. More stringent is that 1375 Koenig lookup implies that any types specified as defined in std:: 1376 really are defined in std::. Names optionally implemented as 1377 macros in C cannot be macros in C++. (An overview may be read at 1378 <http://www.cantrip.org/cheaders.html>). 
The scripts "inclosure" 1379 and "mkcshadow", and the directories shadow/ and cshadow/, are the 1380 beginning of an effort to conform in this area. 1381 1382 A correct conforming definition of C header names based on underlying 1383 C library headers, and practical linking of conforming namespaced 1384 customer code with third-party C libraries depends ultimately on 1385 an ABI change, allowing namespaced C type names to be mangled into 1386 type names as if they were global, somewhat as C function names in a 1387 namespace, or C++ global variable names, are left unmangled. Perhaps 1388 another "extern" mode, such as 'extern "C-global"' would be an 1389 appropriate place for such type definitions. Such a type would 1390 affect mangling as follows: 1391 1392 namespace A { 1393 struct X {}; 1394 extern "C-global" { // or maybe just 'extern "C"' 1395 struct Y {}; 1396 }; 1397 } 1398 void f(A::X*); // mangles to f__FPQ21A1X 1399 void f(A::Y*); // mangles to f__FP1Y 1400 1401 (It may be that this is really the appropriate semantics for regular 1402 'extern "C"', and 'extern "C-global"', as an extension, would not be 1403 necessary.) This would allow functions declared in non-standard C headers 1404 (and thus fixable by neither us nor users) to link properly with functions 1405 declared using C types defined in properly-namespaced headers. The 1406 problem this solves is that C headers (which C++ programmers do persist 1407 in using) frequently forward-declare C struct tags without including 1408 the header where the type is defined, as in 1409 1410 struct tm; 1411 void munge(tm*); 1412 1413 Without some compiler accommodation, munge cannot be called by correct 1414 C++ code using a pointer to a correctly-scoped tm* value. 1415 1416 The current C headers use the preprocessor extension "#include_next", 1417 which the compiler complains about when run "-pedantic". 1418 (Incidentally, it appears that "-fpedantic" is currently ignored, 1419 probably a bug.) 
The solution in the C compiler is to use 1420 "-isystem" rather than "-I", but unfortunately in g++ this seems 1421 also to wrap the whole header in an 'extern "C"' block, so it's 1422 unusable for C++ headers. The correct solution appears to be to 1423 allow the various special include-directory options, if not given 1424 an argument, to affect subsequent include-directory options additively, 1425 so that if one said 1426 1427 -pedantic -iprefix $(prefix) \ 1428 -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \ 1429 -iwithprefix -I g++-v3/ext 1430 1431 the compiler would search $(prefix)/g++-v3 and not report 1432 pedantic warnings for files found there, but treat files in 1433 $(prefix)/g++-v3/ext pedantically. (The undocumented semantics 1434 of "-isystem" in g++ stink. Can they be rescinded? If not it 1435 must be replaced with something more rationally behaved.) 1436 1437 All the C headers need the treatment above; in the standard these 1438 headers are mentioned in various clauses. Below, I have only 1439 mentioned those that present interesting implementation issues. 1440 1441 The components identified as "mostly complete", below, have not been 1442 audited for conformance. In many cases where the library passes 1443 conformance tests we have non-conforming extensions that must be 1444 wrapped in #if guards for "pedantic" use, and in some cases renamed 1445 in a conforming way for continued use in the implementation regardless 1446 of conformance flags. 1447 1448 The STL portion of the library still depends on a header 1449 stl/bits/stl_config.h full of #ifdef clauses. This apparatus 1450 should be replaced with autoconf/automake machinery. 1451 1452 The SGI STL defines a type_traits<> template, specialized for 1453 many types in their code including the built-in numeric and 1454 pointer types and some library types, to direct optimizations of 1455 standard functions. 
The SGI compiler has been extended to generate 1456 specializations of this template automatically for user types, 1457 so that use of STL templates on user types can take advantage of 1458 these optimizations. Specializations for other, non-STL, types 1459 would make more optimizations possible, but extending the gcc 1460 compiler in the same way would be much better. Probably the next 1461 round of standardization will ratify this, but probably with 1462 changes, so it probably should be renamed to place it in the 1463 implementation namespace. 1464 1465 The SGI STL also defines a large number of extensions visible in 1466 standard headers. (Other extensions that appear in separate headers 1467 have been sequestered in subdirectories ext/ and backward/.) All 1468 these extensions should be moved to other headers where possible, 1469 and in any case wrapped in a namespace (not std!), and (where kept 1470 in a standard header) girded about with macro guards. Some cannot be 1471 moved out of standard headers because they are used to implement 1472 standard features. The canonical method for accommodating these 1473 is to use a protected name, aliased in macro guards to a user-space 1474 name. Unfortunately C++ offers no satisfactory template typedef 1475 mechanism, so very ad-hoc and unsatisfactory aliasing must be used 1476 instead. 1477 1478 Implementation of a template typedef mechanism should have the highest 1479 priority among possible extensions, on the same level as implementation 1480 of the template "export" feature. 1481 1482 Chapter 18 Language support 1483 ---------------------------- 1484 1485 Headers: <limits> <new> <typeinfo> <exception> 1486 C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp> 1487 <ctime> <csignal> <cstdlib> (also 21, 25, 26) 1488 1489 This defines the built-in exceptions, rtti, numeric_limits<>, 1490 operator new and delete. Much of this is provided by the 1491 compiler in its static runtime library. 
1492 1493 Work to do includes defining numeric_limits<> specializations in 1494 separate files for all target architectures. Values for integer types 1495 except for bool and wchar_t are readily obtained from the C header 1496 <limits.h>, but values for the remaining numeric types (bool, wchar_t, 1497 float, double, long double) must be entered manually. This is 1498 largely dog work except for those members whose values are not 1499 easily deduced from available documentation. Also, this involves 1500 some work in target configuration to identify the correct choice of 1501 file to build against and to install. 1502 1503 The definitions of the various operators new and delete must be 1504 made thread-safe, which depends on a portable exclusion mechanism, 1505 discussed under chapter 20. Of course there is always plenty of 1506 room for improvements to the speed of operators new and delete. 1507 1508 <cstdarg>, in Glibc, defines some macros that gcc does not allow to 1509 be wrapped into an inline function. Probably this header will demand 1510 attention whenever a new target is chosen. The functions atexit(), 1511 exit(), and abort() in cstdlib have different semantics in C++, so 1512 must be re-implemented for C++. 1513 1514 Chapter 19 Diagnostics 1515 ----------------------- 1516 1517 Headers: <stdexcept> 1518 C headers: <cassert> <cerrno> 1519 1520 This defines the standard exception objects, which are "mostly complete". 1521 Cygnus has a version, and now SGI provides a slightly different one. 1522 It makes little difference which we use. 1523 1524 The C global name "errno", which C allows to be a variable or a macro, 1525 is required in C++ to be a macro. For MT it must typically result in 1526 a function call. 1527 1528 Chapter 20 Utilities 1529 --------------------- 1530 Headers: <utility> <functional> <memory> 1531 C header: <ctime> (also in 18) 1532 1533 SGI STL provides "mostly complete" versions of all the components 1534 defined in this chapter. 
However, the auto_ptr<> implementation 1535 is known to be wrong. Furthermore, the standard definition of it 1536 is known to be unimplementable as written. A minor change to the 1537 standard would fix it, and auto_ptr<> should be adjusted to match. 1538 1539 Multi-threading affects the allocator implementation, and there must 1540 be configuration/installation choices for different users' MT 1541 requirements. Anyway, users will want to tune allocator options 1542 to support different target conditions, MT or no. 1543 1544 The primitives used for MT implementation should be exposed, as an 1545 extension, for users' own work. We need cross-CPU "mutex" support, 1546 multi-processor shared-memory atomic integer operations, and single- 1547 processor uninterruptible integer operations, and all three configurable 1548 to be stubbed out for non-MT use, or to use an appropriately-loaded 1549 dynamic library for the actual runtime environment, or statically 1550 compiled in for cases where the target architecture is known. 1551 1552 Chapter 21 String 1553 ------------------ 1554 Headers: <string> 1555 C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27) 1556 <cstdlib> (also in 18, 25, 26) 1557 1558 We have "mostly-complete" char_traits<> implementations. Many of the 1559 char_traits<char> operations might be optimized further using existing 1560 proprietary language extensions. 1561 1562 We have a "mostly-complete" basic_string<> implementation. The work 1563 to manually instantiate char and wchar_t specializations in object 1564 files to improve link-time behavior is extremely unsatisfactory, 1565 literally tripling library-build time with no commensurate improvement 1566 in static program link sizes. It must be redone. (Similar work is 1567 needed for some components in clauses 22 and 27.) 1568 1569 Other work needed for strings is MT-safety, as discussed under the 1570 chapter 20 heading. 
1571 1572 The standard C type mbstate_t from <cwchar> and used in char_traits<> 1573 must be different in C++ than in C, because in C++ the default constructor 1574 value mbstate_t() must be the "base" or "ground" sequence state. 1575 (According to the likely resolution of a recently raised Core issue, 1576 this may become unnecessary. However, there are other reasons to 1577 use a state type not as limited as whatever the C library provides.) 1578 If we might want to provide conversions from (e.g.) internally- 1579 represented EUC-wide to externally-represented Unicode, or vice- 1580 versa, the mbstate_t we choose will need to be more accommodating 1581 than what might be provided by an underlying C library. 1582 1583 There remain some basic_string template-member functions which do 1584 not overload properly with their non-template brethren. The infamous 1585 hack akin to what was done in vector<> is needed, to conform to 1586 23.1.1 para 10. The CHECKLIST items for basic_string marked 'X', 1587 or incomplete, are so marked for this reason. 1588 1589 Replacing the string iterators, which currently are simple character 1590 pointers, with class objects would greatly increase the safety of the 1591 client interface, and also permit a "debug" mode in which range, 1592 ownership, and validity are rigorously checked. The current use of 1593 raw pointers as string iterators is evil. vector<> iterators need the 1594 same treatment. Note that the current implementation freely mixes 1595 pointers and iterators, and that must be fixed before safer iterators 1596 can be introduced. 1597 1598 Some of the functions in <cstring> differ from the C versions, 1599 generally by being overloaded on const and non-const argument pointers. For 1600 example, in <cstring> strchr is overloaded. The functions isupper 1601 etc. in <cctype>, typically implemented as macros in C, are functions 1602 in C++, because they are overloaded with others of the same name 1603 defined in <locale>.
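
The const/non-const overloading described above for strchr can be sketched as follows. This is only an illustration: the namespace sketch and the function find_char are hypothetical names, not part of libstdc++, and the loop stands in for whatever the underlying C library does.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not part of the library

// Overload for read-only strings: the result cannot be used to modify them.
// Like strchr, searching for '\0' returns a pointer to the terminator.
inline const char* find_char(const char* s, int c) {
    for (;; ++s) {
        if (*s == static_cast<char>(c)) return s;
        if (*s == '\0') return nullptr;
    }
}

// Overload for writable strings: delegates to the const version and
// restores the constness the caller started with.
inline char* find_char(char* s, int c) {
    return const_cast<char*>(find_char(static_cast<const char*>(s), c));
}

// The return type follows the constness of the argument.
static_assert(std::is_same<decltype(find_char(static_cast<const char*>(nullptr), 0)),
                           const char*>::value,
              "const in, const out");
static_assert(std::is_same<decltype(find_char(static_cast<char*>(nullptr), 0)),
                           char*>::value,
              "non-const in, non-const out");

} // namespace sketch
```

With a single C-style signature, a caller holding a const char* would get back a pointer through which the string could be modified; the overload pair closes that hole without changing the calling syntax.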
1604 1605 Many of the functions required in <cwctype> and <cwchar> cannot be 1606 implemented using underlying C facilities on intended targets because 1607 such facilities only partly exist. 1608 1609 Chapter 22 Locale 1610 ------------------ 1611 Headers: <locale> 1612 C headers: <clocale> 1613 1614 We have a "mostly complete" class locale, with the exception of 1615 code for constructing, and handling the names of, named locales. 1616 The ways that locales are named (particularly when categories 1617 (e.g. LC_TIME, LC_COLLATE) are different) vary among target 1618 environments. This code must be written in various versions and 1619 chosen by configuration parameters. 1620 1621 Members of many of the facets defined in <locale> are stubs. Generally, 1622 there are two sets of facets: the base class facets (which are supposed 1623 to implement the "C" locale) and the "byname" facets, which are supposed 1624 to read files to determine their behavior. The base ctype<>, collate<>, 1625 and numpunct<> facets are "mostly complete", except that the table of 1626 bitmask values used for "is" operations, and corresponding mask values, 1627 are still defined in libio and just included/linked. (We will need to 1628 implement these tables independently, soon, but should take advantage 1629 of libio where possible.) The num_put<>::put members for integer types 1630 are "mostly complete". 1631 1632 A complete list of what has and has not been implemented may be 1633 found in CHECKLIST. However, note that the current definition of 1634 codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write 1635 out the raw bytes representing the wide characters, rather than 1636 trying to convert each to a corresponding single "char" value. 1637 1638 Some of the facets are more important than others.
Specifically, 1639 the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets 1640 are used by other library facilities defined in <string>, <istream>, 1641 and <ostream>, and the codecvt<> facet is used by basic_filebuf<> 1642 in <fstream>, so a conforming iostream implementation depends on 1643 these. 1644 1645 The "long long" type eventually must be supported, but code mentioning 1646 it should be wrapped in #if guards to allow pedantic-mode compiling. 1647 1648 Performance of num_put<> and num_get<> depends critically on 1649 caching computed values in ios_base objects, and on extensions 1650 to the interface with streambufs. 1651 1652 Specifically: retrieving a copy of the locale object, extracting 1653 the needed facets, and gathering data from them, for each call to 1654 (e.g.) operator<< would be prohibitively slow. To cache format 1655 data for use by num_put<> and num_get<> we have a _Format_cache<> 1656 object stored in the ios_base::pword() array. This is constructed 1657 and initialized lazily, and is organized purely for utility. It 1658 is discarded when a new locale with different facets is imbued. 1659 1660 Using only the public interfaces of the iterator arguments to the 1661 facet functions would limit performance by forbidding "vector-style" 1662 character operations. The streambuf iterator optimizations are 1663 described under chapter 24, but facets can also bypass the streambuf 1664 iterators via explicit specializations and operate directly on the 1665 streambufs, and use extended interfaces to get direct access to the 1666 streambuf internal buffer arrays. These extensions are mentioned 1667 under chapter 27. These optimizations are particularly important 1668 for input parsing. 1669 1670 Unused virtual members of locale facets can be omitted, as mentioned 1671 above, by a smart linker.
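
The lazily built format cache kept in the ios_base::pword() array, as described earlier in this chapter, might look roughly like the sketch below. It uses only the public ios_base interface (xalloc, iword, pword, register_callback). The names sketch, FormatCache, cache_index, cache_event, format_cache and CommaPoint are hypothetical; this is not the library's _Format_cache<>, whose layout is not shown in these notes.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

namespace sketch {  // hypothetical names throughout

// Data gathered once from the numpunct<char> facet of a stream's locale.
struct FormatCache {
    char decimal_point;
    char thousands_sep;
};

// One iword()/pword() slot index shared by all streams.
inline int cache_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Stream callback: drop the cache when it becomes stale or ownerless.
inline void cache_event(std::ios_base::event ev, std::ios_base& ios, int index) {
    void*& slot = ios.pword(index);
    if (ev == std::ios_base::copyfmt_event) {
        slot = nullptr;  // pointer was copied from another stream; not ours to free
    } else {             // imbue_event or erase_event
        delete static_cast<FormatCache*>(slot);
        slot = nullptr;
    }
}

// Return the cache for this stream, building it lazily on first use.
inline const FormatCache& format_cache(std::ios_base& ios) {
    const int index = cache_index();
    if (ios.iword(index) == 0) {  // register the callback once per stream
        ios.register_callback(cache_event, index);
        ios.iword(index) = 1;
    }
    void*& slot = ios.pword(index);  // taken after the iword() calls above
    if (!slot) {
        const std::numpunct<char>& np =
            std::use_facet<std::numpunct<char> >(ios.getloc());
        slot = new FormatCache{np.decimal_point(), np.thousands_sep()};
    }
    return *static_cast<FormatCache*>(slot);
}

// Hypothetical facet, used only to demonstrate invalidation on imbue().
struct CommaPoint : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

} // namespace sketch
```

Repeated calls to format_cache on the same stream return the same object until imbue() is called, at which point the callback discards it and the next call rebuilds it from the new locale. The sketch discards on every imbue rather than only when the facets differ, which is simpler but slightly more wasteful than the behavior described above.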
1672 1673 Chapter 23 Containers 1674 ---------------------- 1675 Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset> 1676 1677 All the components in chapter 23 are implemented in the SGI STL. 1678 They are "mostly complete"; they include a large number of 1679 nonconforming extensions which must be wrapped. Some of these 1680 are used internally and must be renamed or duplicated. 1681 1682 The SGI components are optimized for large-memory environments. For 1683 embedded targets, different criteria might be more appropriate. Users 1684 will want to be able to tune this behavior. We should provide 1685 ways for users to compile the library with different memory usage 1686 characteristics. 1687 1688 A lot more work is needed on factoring out common code from different 1689 specializations to reduce code size here and in chapter 25. The 1690 easiest fix for this would be a compiler/ABI improvement that allows 1691 the compiler to recognize when a specialization depends only on the 1692 size (or other gross quality) of a template argument, and allow the 1693 linker to share the code with similar specializations. In its 1694 absence, many of the algorithms and containers can be partial- 1695 specialized, at least for the case of pointers, but this only solves 1696 a small part of the problem. Use of a type_traits-style template 1697 allows a few more optimization opportunities, more if the compiler 1698 can generate the specializations automatically. 1699 1700 As an optimization, containers can specialize on the default allocator 1701 and bypass it, or take advantage of details of its implementation 1702 after it has been improved upon. 1703 1704 Replacing the vector iterators, which currently are simple element 1705 pointers, with class objects would greatly increase the safety of the 1706 client interface, and also permit a "debug" mode in which range, 1707 ownership, and validity are rigorously checked. 
The current use of 1708 pointers for iterators is evil. 1709 1710 As mentioned for chapter 24, the deque iterator is a good example of 1711 an opportunity to implement a "staged" iterator that would benefit 1712 from specializations of some algorithms. 1713 1714 Chapter 24 Iterators 1715 --------------------- 1716 Headers: <iterator> 1717 1718 Standard iterators are "mostly complete", with the exception of 1719 the stream iterators, which are not yet templatized on the 1720 stream type. Also, the base class template iterator<> appears 1721 to be wrong, so everything derived from it must also be wrong, 1722 currently. 1723 1724 The streambuf iterators (currently located in stl/bits/std_iterator.h, 1725 but should be under bits/) can be rewritten to take advantage of 1726 friendship with the streambuf implementation. 1727 1728 Matt Austern has identified opportunities where certain iterator 1729 types, particularly including streambuf iterators and deque 1730 iterators, have a "two-stage" quality, such that an intermediate 1731 limit can be checked much more quickly than the true limit on 1732 range operations. If identified with a member of iterator_traits, 1733 algorithms may be specialized for this case. Of course the 1734 iterators that have this quality can be identified by specializing 1735 a traits class. 1736 1737 Many of the algorithms must be specialized for the streambuf 1738 iterators, to take advantage of block-mode operations, so that 1739 the performance of iostream/locale operations does not suffer. 1740 It may be that they could be treated as staged iterators and 1741 take advantage of those optimizations. 1742 1743 Chapter 25 Algorithms 1744 ---------------------- 1745 Headers: <algorithm> 1746 C headers: <cstdlib> (also in 18, 21, 26) 1747 1748 The algorithms are "mostly complete". As mentioned above, they 1749 are optimized for speed at the expense of code and data size.

Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.

The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.

Chapter 26 Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also 18, 21, 25)

Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate. It
is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.

The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.
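
As a rough illustration of how small such a re-implementation can be,
the sketch below defines its own result types and computes quotient and
remainder directly. The namespace "sketch" and everything declared in
it are hypothetical names chosen for this example, not the library's
actual declarations. The sketch also assumes C++11 or later, where
integer division is guaranteed to truncate toward zero, and it does not
guard against a zero denominator or an unrepresentable quotient.

```cpp
// Hypothetical sketch only: names are illustrative assumptions, not
// the declarations actually used by the library.
namespace sketch
{
  // Distinct C++ result types, unrelated to the C library's types.
  struct div_t
  {
    int quot;  // quotient, truncated toward zero (C++11 and later)
    int rem;   // remainder, so that quot * denom + rem == numer
  };

  struct ldiv_t
  {
    long quot;
    long rem;
  };

  // Computes the result directly instead of calling the C function.
  // Assumes denom != 0 and that the quotient is representable.
  inline div_t
  div(int numer, int denom)
  {
    div_t result;
    result.quot = numer / denom;
    result.rem = numer % denom;
    return result;
  }

  inline ldiv_t
  ldiv(long numer, long denom)
  {
    ldiv_t result;
    result.quot = numer / denom;
    result.rem = numer % denom;
    return result;
  }
}
```

Because the result types are ordinary structs declared in the same
namespace as the functions, no conversion from the C type is needed,
which is the inefficiency the paragraph above alludes to.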

Chapter 27 Iostreams
---------------------
Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
<iomanip> <sstream> <fstream>
C headers: <cstdio> <cwchar> (also in 21)

Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream>, and <fstream> have been
started; basic_filebuf<> "write" functions have been implemented just
enough to do "hello, world".

Most of the istream and ostream operators << and >> (with the exception
of the op<<(integer) ones) have not been changed to use locale primitives,
sentry objects, or char_traits members.

All these templates should be manually instantiated for char and
wchar_t in a way that links only used members into user programs.

Streambuf is fertile ground for optimization extensions. An extended
interface giving iterator access to its internal buffer would be very
useful for other library components.

Iostream operations (primarily operators << and >>) can take advantage
of the case where user code has not specified a locale, and bypass locale
operations entirely. The current implementation of op<</num_put<>::put,
for the integer types, demonstrates how they can cache encoding details
from the locale on each operation. There is lots more room for
optimization in this area.

The definition of the relationship between the standard streams
cout et al. and stdout et al. requires something like a "stdiobuf".
The SGI solution of using double-indirection to actually use a
stdio FILE object for buffering is unsatisfactory, because it
interferes with peephole loop optimizations.

The <sstream> header work has begun.
stringbuf can benefit from
friendship with basic_string<> and basic_string<>::_Rep to use
those objects directly as buffers, and avoid allocating and making
copies.

The basic_filebuf<> template is a complex beast. It is specified to
use the locale facet codecvt<> to translate characters between native
files and the locale character encoding. In general this involves
two buffers, one of "char" representing the file and another of
"char_type", for the stream, with codecvt<> translating. The process
is complicated by the variable-length nature of the translation, and
the need to seek to corresponding places in the two representations.
For the case of basic_filebuf<char>, when no translation is needed,
a single buffer suffices. A specialized filebuf can be used to reduce
code space overhead when no locale has been imbued. Matt Austern's
work at SGI will be useful, perhaps directly as a source of code, or
at least as an example to draw on.

Filebuf, almost uniquely (cf. operator new), depends heavily on
underlying environmental facilities. In current releases iostream
depends fairly heavily on libio constant definitions, but it should
be made independent. It also depends on operating system primitives
for file operations. There is immense room for optimizations using
(e.g.) mmap for reading. The shadow/ directory wraps, besides the
standard C headers, the libio.h and unistd.h headers, for use mainly
by filebuf. These wrappings have not been completed, though there
is scaffolding in place.

The encapsulation of certain C header <cstdio> names presents an
interesting problem. It is possible to define an inline std::fprintf()
implemented in terms of the 'extern "C"' vfprintf(), but there is no
standard vfscanf() to use to implement std::fscanf().
It appears that
vfscanf must be re-implemented in C++ for targets where no vfscanf
extension has been defined. This is interesting in that it seems
to be the only significant case in the C library where this kind of
rewriting is necessary. (Of course Glibc provides the vfscanf()
extension.) (The functions related to exit() must be rewritten
for other reasons.)


Annex D
-------
Headers: <strstream>

Annex D defines many non-library features, and many minor
modifications to various headers, and a complete header.
It is "mostly done", except that the libstdc++-2 <strstream>
header has not been adopted into the library, or checked to
verify that it matches the draft in those details that were
clarified by the committee. Certainly it must at least be
moved into the std namespace.

We still need to wrap all the deprecated features in #if guards
so that pedantic compile modes can detect their use.

Nonstandard Extensions
----------------------
Headers: <iostream.h> <strstream.h> <hash> <rbtree>
<pthread_alloc> <stdiobuf> (etc.)

User code has come to depend on a variety of nonstandard components
that we must not omit. Much of this code can be adopted from
libstdc++-v2 or from the SGI STL. This particularly includes
<iostream.h>, <strstream.h>, and various SGI extensions such
as <hash_map.h>. Many of these are already placed in the
subdirectories ext/ and backward/. (Note that it is better to
include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
to search the subdirectory itself via a "-I" directive.)
</literallayout>
</section>

</appendix>