<?xml version='1.0'?>
<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
[ ]>

<appendix id="appendix.contrib" xreflabel="Contributing">
<?dbhtml filename="appendix_contributing.html"?>

<appendixinfo>
  <keywordset>
    <keyword>
      ISO C++
    </keyword>
    <keyword>
      library
    </keyword>
  </keywordset>
</appendixinfo>

<title>
  Contributing
  <indexterm>
    <primary>Appendix</primary>
    <secondary>Contributing</secondary>
  </indexterm>
</title>

<para>
  The GNU C++ Library follows an open development model. Active
  contributors are assigned maintainership responsibility and given
  write access to the source repository. First-time contributors
  should follow this procedure:
</para>

<sect1 id="contrib.list" xreflabel="Contributor Checklist">
  <title>Contributor Checklist</title>

  <sect2 id="list.reading">
    <title>Reading</title>

    <itemizedlist>
      <listitem>
        <para>
          Get and read the relevant sections of the C++ language
          specification. Copies of the full ISO 14882 standard are
          available online via the ISO mirror site for committee
          members. Non-members, or those who have not paid for the
          privilege of sitting on the committee and sustained their
          two-meeting commitment for voting rights, may get a copy of
          the standard from their respective national standards
          organization.
          In the USA, this national standards
          organization is ANSI, whose web site is
          <ulink url="http://www.ansi.org">here</ulink>.
          (If you have already registered with them, this link takes you directly to the place where you can
          <ulink url="http://webstore.ansi.org/RecordDetail.aspx?sku=ISO%2FIEC+14882:2003">buy the standard online</ulink>.)
        </para>
      </listitem>

      <listitem>
        <para>
          The library working group bugs and known defects can
          be obtained here:
          <ulink url="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</ulink>
        </para>
      </listitem>

      <listitem>
        <para>
          The newsgroup dedicated to standardization issues is
          comp.std.c++; the FAQ for this group is quite useful and
          can be
          found <ulink url="http://www.comeaucomputing.com/csc/faq.html">here</ulink>.
        </para>
      </listitem>

      <listitem>
        <para>
          Peruse
          the <ulink url="http://www.gnu.org/prep/standards">GNU
          Coding Standards</ulink>, and chuckle when you hit the part
          about <quote>Using Languages Other Than C</quote>.
        </para>
      </listitem>

      <listitem>
        <para>
          Be familiar with the extensions that take precedence over
          these general GNU rules. These style issues for libstdc++ can be
          found <link linkend="contrib.coding_style">here</link>.
        </para>
      </listitem>

      <listitem>
        <para>
          And last but certainly not least, read the
          library-specific information
          found <link linkend="appendix.porting">here</link>.
        </para>
      </listitem>
    </itemizedlist>

  </sect2>
  <sect2 id="list.copyright">
    <title>Assignment</title>
    <para>
      Small changes can be accepted without a copyright assignment form on
      file. New code and additions to the library need a completed copyright
      assignment form on file at the FSF. Note: your employer may be required
      to fill out appropriate disclaimer forms as well.
    </para>

    <para>
      Historically, the libstdc++ assignment form added the following
      question:
    </para>

    <para>
      <quote>
        Which Belgian comic book character is better, Tintin or Asterix, and
        why?
      </quote>
    </para>

    <para>
      While not strictly necessary, humoring the maintainers and answering
      this question would be appreciated.
    </para>

    <para>
      For more information about getting a copyright assignment, please see
      <ulink url="http://www.gnu.org/prep/maintain/html_node/Legal-Matters.html">Legal
      Matters</ulink>.
    </para>

    <para>
      Please contact Benjamin Kosnik at
      <email>bkoz+assign@redhat.com</email> if you are confused
      about the assignment or have general licensing questions. When
      requesting an assignment form from
      <email>assign@gnu.org</email>, please CC the libstdc++
      maintainer above so that progress can be monitored.
    </para>
  </sect2>

  <sect2 id="list.getting">
    <title>Getting Sources</title>
    <para>
      <ulink url="http://gcc.gnu.org/svnwrite.html">Getting write access
      (look for "Write after approval")</ulink>
    </para>
  </sect2>

  <sect2 id="list.patches">
    <title>Submitting Patches</title>

    <para>
      Every patch must have several pieces of information before it can be
      properly evaluated. Ideally (and to ensure the fastest possible
      response from the maintainers) it would have all of these pieces:
    </para>

    <itemizedlist>
      <listitem>
        <para>
          A description of the bug and how your patch fixes this
          bug. For new features, a description of the feature and your
          implementation.
        </para>
      </listitem>

      <listitem>
        <para>
          A ChangeLog entry as plain text; see the various
          ChangeLog files for format and content.
          If you are
          using emacs as your editor, simply position the insertion
          point at the beginning of your change and hit C-x 4 a to bring
          up the appropriate ChangeLog entry. See--magic! Similar
          functionality also exists for vi.
        </para>
      </listitem>

      <listitem>
        <para>
          A testsuite submission or sample program that will
          easily and simply show the existing error or test new
          functionality.
        </para>
      </listitem>

      <listitem>
        <para>
          The patch itself. If you are accessing the SVN
          repository, use <command>svn update; svn diff NEW</command>;
          otherwise, use <command>diff -cp OLD NEW</command>. If your
          version of diff does not support these options, then get the
          latest version of GNU
          diff. The <ulink url="http://gcc.gnu.org/wiki/SvnTricks">SVN
          Tricks</ulink> wiki page has information on customising the
          output of <code>svn diff</code>.
        </para>
      </listitem>

      <listitem>
        <para>
          When you have all these pieces, bundle them up in a
          mail message and send it to <email>libstdc++@gcc.gnu.org</email>. All
          patches and related discussion should be sent to the
          libstdc++ mailing list.
        </para>
      </listitem>
    </itemizedlist>

  </sect2>

</sect1>

<sect1 id="contrib.organization" xreflabel="Source Organization">
  <?dbhtml filename="source_organization.html"?>
  <title>Directory Layout and Source Conventions</title>

  <para>
    The unpacked source directory of libstdc++ contains the files
    needed to create the GNU C++ Library.
  </para>

  <literallayout>
It has subdirectories:

  doc
    Files in HTML and text format that document usage, quirks of the
    implementation, and contributor checklists.

  include
    All header files for the C++ library are within this directory,
    modulo specific runtime-related files that are in the libsupc++
    directory.

  include/std
    Files meant to be found by #include <name> directives in
    standard-conforming user programs.

  include/c
    Headers intended to directly include standard C headers.
    [NB: this can be enabled via --enable-cheaders=c]

  include/c_global
    Headers intended to include standard C headers in
    the global namespace, and put select names into the std::
    namespace. [NB: this is the default, and is the same as
    --enable-cheaders=c_global]

  include/c_std
    Headers intended to include standard C headers
    already in namespace std, and put select names into the std::
    namespace. [NB: this is the same as --enable-cheaders=c_std]

  include/bits
    Files included by standard headers and by other files in
    the bits directory.

  include/backward
    Headers provided for backward compatibility, such as <iostream.h>.
    They are not used in this library.

  include/ext
    Headers that define extensions to the standard library. No
    standard header refers to any of them.

  scripts
    Scripts that are used during the configure, build, make, or test
    process.

  src
    Files that are used in constructing the library, but are not
    installed.

  testsuite/[backward, demangle, ext, performance, thread, 17_* to 27_*]
    Test programs are here, and may be used to begin to exercise the
    library. Support for "make check" and "make check-install" is
    complete, and runs through all the subdirectories here when either
    command is issued from the build directory. Note that
    "make check" requires DejaGNU 1.4 or later to be installed. Also
    note that "make check-script" calls the script mkcheck, which
    requires bash, and which may need the path to bash adjusted to
    work properly, as /bin/bash is assumed.

Other subdirectories contain variant versions of certain files
that are meant to be copied or linked by the configure script.
Currently these are:

  config/abi
  config/cpu
  config/io
  config/locale
  config/os

In addition, a subdirectory holds the convenience library libsupc++.

  libsupc++
    Contains the runtime library for C++, including exception
    handling and memory allocation and deallocation, RTTI, terminate
    handlers, etc.

Note that glibc also has a bits/ subdirectory. We will need either
to be careful not to collide with names in its bits/
directory, or to rename bits to (e.g.) cppbits/.

In files throughout the system, lines marked with an "XXX" indicate
a bug or incompletely-implemented feature. Lines marked "XXX MT"
indicate a place that may require attention for multi-thread safety.
  </literallayout>

</sect1>

<sect1 id="contrib.coding_style" xreflabel="Coding Style">
  <?dbhtml filename="source_code_style.html"?>
  <title>Coding Style</title>
  <para>
  </para>
  <sect2 id="coding_style.bad_identifiers">
    <title>Bad Identifiers</title>
    <para>
      Identifiers that conflict and should be avoided.
    </para>

    <literallayout>
      This is the list of names <quote>reserved to the
      implementation</quote> that have been claimed by certain
      compilers and system headers of interest, and should not be used
      in the library. It will grow, of course. We generally are
      interested in names that are not all-caps, except for those like
      "_T"

      For Solaris:
      _B
      _C
      _L
      _N
      _P
      _S
      _U
      _X
      _E1
      ..
      _E24

      Irix adds:
      _A
      _G

      MS adds:
      _T

      BSD adds:
      __used
      __unused
      __inline
      _Complex
      __istype
      __maskrune
      __tolower
      __toupper
      __wchar_t
      __wint_t
      _res
      _res_ext
      __tg_*

      SPU adds:
      __ea

      For GCC:

      [Note that this list is out of date. It applies to the old
      name-mangling; in G++ 3.0 and higher a different name-mangling is
      used.
      In addition, many of the bugs relating to G++ interpreting
      these names as operators have been fixed.]

      The full set of __* identifiers (combined from gcc/cp/lex.c and
      gcc/cplus-dem.c) that are either old or new, but are definitely
      recognized by the demangler, is:

      __aa
      __aad
      __ad
      __addr
      __adv
      __aer
      __als
      __alshift
      __amd
      __ami
      __aml
      __amu
      __aor
      __apl
      __array
      __ars
      __arshift
      __as
      __bit_and
      __bit_ior
      __bit_not
      __bit_xor
      __call
      __cl
      __cm
      __cn
      __co
      __component
      __compound
      __cond
      __convert
      __delete
      __dl
      __dv
      __eq
      __er
      __ge
      __gt
      __indirect
      __le
      __ls
      __lt
      __max
      __md
      __method_call
      __mi
      __min
      __minus
      __ml
      __mm
      __mn
      __mult
      __mx
      __ne
      __negate
      __new
      __nop
      __nt
      __nw
      __oo
      __op
      __or
      __pl
      __plus
      __postdecrement
      __postincrement
      __pp
      __pt
      __rf
      __rm
      __rs
      __sz
      __trunc_div
      __trunc_mod
      __truth_andif
      __truth_not
      __truth_orif
      __vc
      __vd
      __vn

      SGI badnames:
      __builtin_alloca
      __builtin_fsqrt
      __builtin_sqrt
      __builtin_fabs
      __builtin_dabs
      __builtin_cast_f2i
      __builtin_cast_i2f
      __builtin_cast_d2ll
      __builtin_cast_ll2d
      __builtin_copy_dhi2i
      __builtin_copy_i2dhi
      __builtin_copy_dlo2i
      __builtin_copy_i2dlo
      __add_and_fetch
      __sub_and_fetch
      __or_and_fetch
      __xor_and_fetch
      __and_and_fetch
      __nand_and_fetch
      __mpy_and_fetch
      __min_and_fetch
      __max_and_fetch
      __fetch_and_add
      __fetch_and_sub
      __fetch_and_or
      __fetch_and_xor
      __fetch_and_and
      __fetch_and_nand
      __fetch_and_mpy
      __fetch_and_min
      __fetch_and_max
      __lock_test_and_set
      __lock_release
      __lock_acquire
      __compare_and_swap
      __synchronize
      __high_multiply
      __unix
      __sgi
      __linux__
      __i386__
      __i486__
      __cplusplus
      __embedded_cplusplus
      // long double conversion members mangled as __opr
      // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
      _opr
    </literallayout>
  </sect2>

  <sect2 id="coding_style.example">
    <title>By Example</title>
    <literallayout>
      This library is written to appropriate C++ coding standards. As such,
      it is intended to take precedence over the recommendations of the GNU
      Coding Standard, which can be referenced in full here:

      http://www.gnu.org/prep/standards/standards.html#Formatting

      The rest of this is also interesting reading, but skip the "Design
      Advice" part.

      The GCC coding conventions are here, and are also useful:
      http://gcc.gnu.org/codingconventions.html

      In addition, because it doesn't seem to be stated explicitly anywhere
      else, there is an 80 column source limit.

      ChangeLog entries for member functions should use the
      classname::member function name syntax as follows:

      1999-04-15  Dennis Ritchie  <dr@att.com>

      * src/basic_file.cc (__basic_file::open): Fix thinko in
      _G_HAVE_IO_FILE_OPEN bits.

      Notable areas of divergence from what may be previous local practice
      (particularly for GNU C) include:

      01. Pointers and references
        char* p = "flop";
        char& c = *p;
          -NOT-
        char *p = "flop";  // wrong
        char &c = *p;      // wrong

        Reason: In C++, definitions are mixed with executable code. Here,
        p is being initialized, not *p. This is near-universal
        practice among C++ programmers; it is normal for C hackers
        to switch spontaneously as they gain experience.

      02. Operator names and parentheses
        operator==(type)
          -NOT-
        operator == (type)  // wrong

        Reason: The == is part of the function name. Separating
        it makes the declaration look like an expression.

      03.
      Function names and parentheses
        void mangle()
          -NOT-
        void mangle ()  // wrong

        Reason: no space before parentheses (except after a control-flow
        keyword) is near-universal practice for C++. It identifies the
        parentheses as the function-call operator or declarator, as
        opposed to an expression or other overloaded use of parentheses.

      04. Template function indentation
        template<typename T>
          void
          template_function(args)
          { }
          -NOT-
        template<class T>
        void template_function(args) {};

        Reason: In class definitions, without indentation, whitespace is
        needed both above and below the declaration to distinguish
        it visually from other members. (Also, re: "typename"
        rather than "class".) T often could be int, which is
        not a class. ("class", here, is an anachronism.)

      05. Template class indentation
        template<typename _CharT, typename _Traits>
          class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
        class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
          class basic_ios : public ios_base
        {
          public:
            // Types:
        };

      06. Enumerators
        enum
        {
          space = _ISspace,
          print = _ISprint,
          cntrl = _IScntrl
        };
          -NOT-
        enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };

      07. Member initialization lists
        All one line, separate from class name.

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
          -NOT-
        gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

      08. Try/Catch blocks
        try
          {
            //
          }
        catch (...)
          {
            //
          }
          -NOT-
        try {
          //
        } catch(...) {
          //
        }

      09.
      Member function declarations and definitions
        Keywords such as extern, static, export, explicit, inline, etc.
        go on the line above the function name. Thus

        virtual int
        foo()
          -NOT-
        virtual int foo()

        Reason: GNU coding conventions dictate that return types for
        functions are on a separate line from the function name and
        parameter list for definitions. For C++, where we have member
        functions that can be either inline definitions or declarations,
        keeping to this standard allows all member function names for a
        given class to be aligned to the same margin, increasing readability.


      10. Invocation of member functions with "this->"
        For non-uglified names, use this->name to call the function.

        this->sync()
          -NOT-
        sync()

        Reason: Koenig lookup.

      11. Namespaces
        namespace std
        {
          blah blah blah;
        } // namespace std

          -NOT-

        namespace std {
          blah blah blah;
        } // namespace std

      12. Spacing under protected and private in class declarations:
        space above, none below
        i.e.

        public:
          int foo;

          -NOT-
        public:

          int foo;

      13. Spacing WRT return statements.
        no extra spacing before returns, no parentheses
        i.e.

        }
        return __ret;

          -NOT-
        }

        return __ret;

          -NOT-

        }
        return (__ret);


      14. Location of global variables.
        All global variables of class type, whether in the "user visible"
        space (e.g., cin) or the implementation namespace, must be defined
        as a character array with the appropriate alignment and then later
        re-initialized to the correct value.

        This is due to startup issues on certain platforms, such as AIX.
        For more explanation and examples, see src/globals.cc. All such
        variables should be contained in that file, for simplicity.

      15.
      Exception abstractions
        Use the exception abstractions found in functexcept.h, which allow
        C++ programmers to use this library with -fno-exceptions. (Even if
        that is rarely advisable, it's a necessary evil for backwards
        compatibility.)

      16. Exception error messages
        All start with the name of the function where the exception is
        thrown, and then (optional) descriptive text is added. Example:

        __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));

        Reason: The verbose terminate handler prints out exception::what(),
        as well as the typeinfo for the thrown exception. As this is the
        default terminate handler, by putting location info into the
        exception string, a very useful error message is printed out for
        uncaught exceptions. So useful, in fact, that non-programmers can
        give useful error messages, and programmers can intelligently
        speculate what went wrong without even using a debugger.

      17. The doxygen style guide to comments is a separate document,
        see index.

      The library currently has a mixture of GNU-C and modern C++ coding
      styles. The GNU C usages will be combed out gradually.

      Name patterns:

      For nonstandard names appearing in Standard headers, we are constrained
      to use names that begin with underscores. This is called "uglification".
      The convention is:

        Local and argument names:  __[a-z].*

          Examples:  __count  __ix  __s1

        Type names and template formal-argument names: _[A-Z][^_].*

          Examples:  _Helper  _CharT  _N

        Member data and function names: _M_.*

          Examples:  _M_num_elements  _M_initialize ()

        Static data members, constants, and enumerations: _S_.*

          Examples: _S_max_elements  _S_default_value

      Don't use names in the same scope that differ only in the prefix,
      e.g. _S_top and _M_top. See BADNAMES for a list of forbidden names.
      (The most tempting of these seem to be "_T" and "__sz".)

      Names must never have "__" internally; it would confuse name
      unmanglers on some targets. Also, never use "__[0-9]", same reason.

      --------------------------

      [BY EXAMPLE]

      #ifndef  _HEADER_
      #define  _HEADER_ 1

      namespace std
      {
        class gribble
        {
        public:
          gribble() throw();

          gribble(const gribble&);

          explicit
          gribble(int __howmany);

          gribble&
          operator=(const gribble&);

          virtual
          ~gribble() throw();

          // Start with a capital letter, end with a period.
          inline void
          public_member(const char* __arg) const;

          // In-class function definitions should be restricted to one-liners.
          int
          one_line() { return 0; }

          int
          two_lines(const char* __arg)
          { return strchr(__arg, 'a') != 0; }

          inline int
          three_lines();  // inline, but defined below.

          // Note indentation.
          template<typename _Formal_argument>
            void
            public_template() const throw();

          template<typename _Iterator>
            void
            other_template();

        private:
          class _Helper;

          int _M_private_data;
          int _M_more_stuff;
          _Helper* _M_helper;
          int _M_private_function();

          enum _Enum
            {
              _S_one,
              _S_two
            };

          static void
          _S_initialize_library();
        };

      // More-or-less-standard language features described by lack, not presence.
      # ifndef _G_NO_LONGLONG
        extern long long _G_global_with_a_good_long_name;  // avoid globals!
      # endif

        // Avoid in-class inline definitions, define separately;
        // likewise for member class definitions:
        inline void
        gribble::public_member(const char* __arg) const
        { const char* __local = __arg; (void) __local; }

        class gribble::_Helper
        {
          int _M_stuff;

          friend class gribble;
        };
      }

      // Names beginning with "__": only for arguments and
      //   local variables; never use "__" in a type name, or
      //   within any name; never use "__[0-9]".

      #endif /* _HEADER_ */


      namespace std
      {
        template<typename T>  // notice: "typename", not "class", no space
          long_return_value_type<with_many, args>
          function_name(char* pointer,               // "char *pointer" is wrong.
                        char* argument,
                        const Reference& ref)
          {
            // int a_local;  /* wrong; see below. */
            if (test)
            {
              nested code
            }

            int a_local = 0;  // declare variable at first use.

            //  char a, b, *p;   /* wrong */
            char a = 'a';
            char b = a + 1;
            char* c = "abc";  // each variable goes on its own line, always.

            // except maybe here...
            for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1)
              {
                // ...
              }
          }

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

        inline int
        gribble::three_lines()
        {
          // doesn't fit in one line.
        }
      } // namespace std
    </literallayout>
  </sect2>
</sect1>

<sect1 id="contrib.doc_style" xreflabel="Documentation Style">
  <?dbhtml filename="documentation_style.html"?>
  <title>Documentation Style</title>
  <sect2 id="doc_style.doxygen">
    <title>Doxygen</title>
    <sect3 id="doxygen.prereq">
      <title>Prerequisites</title>
      <para>
        Prerequisite tools are Bash 2.x,
        <ulink url="http://www.doxygen.org/">Doxygen</ulink>, and
        the <ulink url="http://www.gnu.org/software/coreutils/">GNU
        coreutils</ulink>.
        (GNU versions of find, xargs, and possibly
        sed and grep are used, just because the GNU versions make
        things very easy.)
      </para>

      <para>
        To generate the pretty pictures and hierarchy
        graphs, the
        <ulink url="http://www.graphviz.org">Graphviz</ulink> package
        will need to be installed. For PDF
        output, <ulink url="http://www.tug.org/applications/pdftex/">
        pdflatex</ulink> is required.
      </para>
    </sect3>

    <sect3 id="doxygen.rules">
      <title>Generating the Doxygen Files</title>
      <para>
        The following Makefile rules run Doxygen to generate HTML
        docs, XML docs, XML docs as a single file, PDF docs, and the man pages.
      </para>

      <para>
        Generated files are output into separate subdirectories of
        <filename class="directory">doc/doxygen/</filename> in the
        build directory, based on the output format. For instance, the
        HTML docs will be in <filename
        class="directory">doc/doxygen/html</filename>.
      </para>

      <para>
        <screen><userinput>make doc-html-doxygen</userinput></screen>
      </para>

      <para>
        <screen><userinput>make doc-xml-doxygen</userinput></screen>
      </para>

      <para>
        <screen><userinput>make doc-xml-single-doxygen</userinput></screen>
      </para>

      <para>
        <screen><userinput>make doc-pdf-doxygen</userinput></screen>
      </para>

      <para>
        <screen><userinput>make doc-man-doxygen</userinput></screen>
      </para>

      <para>
        Careful observers will see that the Makefile rules simply call
        a script from the source tree, <filename>run_doxygen</filename>, which
        does the actual work of running Doxygen and then (most
        importantly) massaging the output files. If for some reason
        you prefer not to go through the Makefile, you can call this
        script directly. (Start by passing <literal>--help</literal>.)
      </para>

      <para>
        If you wish to tweak the Doxygen settings, do so by editing
        <filename>doc/doxygen/user.cfg.in</filename>. Notes to fellow
        library hackers are written in triple-# comments.
      </para>

    </sect3>

    <sect3 id="doxygen.markup">
      <title>Markup</title>

      <para>
        In general, libstdc++ files should be formatted according to
        the rules found in the
        <link linkend="contrib.coding_style">Coding Standard</link>. Before
        any doxygen-specific formatting tweaks are made, please try to
        make sure that the initial formatting is sound.
      </para>

      <para>
        Adding Doxygen markup to a file (informally called
        <quote>doxygenating</quote>) is very simple. The Doxygen manual can be
        found
        <ulink url="http://www.stack.nl/~dimitri/doxygen/download.html#latestman">here</ulink>.
        We try to use a very recent version of Doxygen.
      </para>

      <para>
        For classes, use
        <classname>deque</classname>/<classname>vector</classname>/<classname>list</classname>
        and <classname>std::pair</classname> as examples. For
        functions, see their member functions, and the free functions
        in <filename>stl_algobase.h</filename>. Member functions of
        other container-like types should read similarly to these
        member functions.
      </para>

      <para>
        Some commentary to accompany
        the first list in the <ulink url="http://www.stack.nl/~dimitri/doxygen/docblocks.html">Special
        Documentation Blocks</ulink> section of
        the Doxygen manual:
      </para>

      <orderedlist>
        <listitem>
          <para>For longer comments, use the Javadoc style...</para>
        </listitem>

        <listitem>
          <para>
            ...not the Qt style. The intermediate *'s are preferred.
          </para>
        </listitem>

        <listitem>
          <para>
            Use the triple-slash style only for one-line comments (the
            <quote>brief</quote> mode).
          </para>
        </listitem>

        <listitem>
          <para>
            This is disgusting. Don't do this.
          </para>
        </listitem>
      </orderedlist>

      <para>
        Some specific guidelines:
      </para>

      <para>
        Use the @-style of commands, not the !-style. Please be
        careful about whitespace in your markup comments. Most of the
        time it doesn't matter; doxygen absorbs most whitespace, and
        both HTML and *roff are agnostic about whitespace. However,
        in <pre> blocks and @code/@endcode sections, spacing can
        have <quote>interesting</quote> effects.
      </para>

      <para>
        Use either kind of grouping, as
        appropriate. <filename>doxygroups.cc</filename> exists for this
        purpose. See <filename>stl_iterator.h</filename> for a good example
        of the <quote>other</quote> kind of grouping.
      </para>

      <para>
        Please use markup tags like @p and @a when referring to things
        such as the names of function parameters. Use @e for emphasis
        when necessary. Use @c to refer to other standard names.
        (Examples of all these abound in the present code.)
      </para>

      <para>
        Complicated math functions should use the multi-line
        format. An example from <filename>random.h</filename>:
      </para>

      <para>
<literallayout>
  /**
   * @brief A model of a linear congruential random number generator.
   *
   * @f[
   *     x_{i+1}\leftarrow(ax_{i} + c) \bmod m
   * @f]
   */
</literallayout>
      </para>

      <para>
        Be careful about using certain special characters when
        writing Doxygen comments. Single and double quotes, and
        separators in filenames, are two common trouble spots. When in
        doubt, consult the following table.
      </para>

<table frame='all'>
<title>HTML to Doxygen Markup Comparison</title>
<tgroup cols='2' align='left' colsep='1' rowsep='1'>
<colspec colname='c1'></colspec>
<colspec colname='c2'></colspec>

  <thead>
    <row>
      <entry>HTML</entry>
      <entry>Doxygen</entry>
    </row>
  </thead>

  <tbody>
    <row>
      <entry>\</entry>
      <entry>\\</entry>
    </row>

    <row>
      <entry>"</entry>
      <entry>\"</entry>
    </row>

    <row>
      <entry>'</entry>
      <entry>\'</entry>
    </row>

    <row>
      <entry><i></entry>
      <entry>@a word</entry>
    </row>

    <row>
      <entry><b></entry>
      <entry>@b word</entry>
    </row>

    <row>
      <entry><code></entry>
      <entry>@c word</entry>
    </row>

    <row>
      <entry><em></entry>
      <entry>@e word</entry>
    </row>

    <row>
      <entry><em></entry>
      <entry><em>two words or more</em></entry>
    </row>
  </tbody>

</tgroup>
</table>


    </sect3>

  </sect2>

  <sect2 id="doc_style.docbook">
    <title>DocBook</title>

    <sect3 id="docbook.prereq">
      <title>Prerequisites</title>
      <para>
        Editing the DocBook sources requires an XML editor. Many
        exist: some notable options
        include <command>emacs</command>, <application>Kate</application>,
        and <application>Conglomerate</application>.
      </para>

      <para>
        Some editors support special <quote>XML Validation</quote>
        modes that can validate the file as it is
        produced. Recommended is the <command>nXML Mode</command>
        for <command>emacs</command>.
      </para>

      <para>
        Besides an editor, additional DocBook files and XML tools are
        also required.
      </para>

      <para>
        Access to the DocBook stylesheets and DTD is required.
        The
        stylesheets are usually packaged by the vendor, in something
        like <filename>docbook-style-xsl</filename>. To exactly match
        generated output, please use a version of the stylesheets
        equivalent
        to <filename>docbook-style-xsl-1.74.0-5</filename>. The
        installation directory for this package corresponds to
        the <literal>XSL_STYLE_DIR</literal>
        in <filename>doc/Makefile.am</filename> and defaults
        to <filename class="directory">/usr/share/sgml/docbook/xsl-stylesheets</filename>.
      </para>

      <para>
        For processing XML, an XML processor and some style
        sheets are necessary. The default processor is
        <command>xsltproc</command>, provided by <filename>libxslt</filename>.
      </para>

      <para>
        For validating the XML document, you'll need
        something like <command>xmllint</command> and access to the
        DocBook DTD. These are provided
        by a vendor package like <filename>libxml2</filename>.
      </para>

      <para>
        For PDF output, something that transforms valid DocBook XML to PDF is
        required. Possible solutions include <ulink
        url="http://dblatex.sourceforge.net">dblatex</ulink>,
        <command>xmlto</command>, or <command>prince</command>. Of
        these, <command>dblatex</command> is the default. Other
        options are listed on the DocBook web <ulink
        url="http://wiki.docbook.org/topic/DocBookPublishingTools">pages</ulink>. Please
        consult the <email>libstdc++@gcc.gnu.org</email> list when
        preparing printed manuals for current best practice and
        suggestions.
      </para>

      <para>
        For Texinfo output, something that transforms valid DocBook
        XML to Texinfo is required. The default choice is <ulink
        url="http://docbook2x.sourceforge.net/">docbook2X</ulink>.
      </para>

      <para>
        Please make sure that the XML documentation and markup is valid for
        any change.
    This can be done easily, with the validation rule
    detailed below, which is equivalent to doing:
    </para>

    <screen>
      <userinput>
xmllint --noout --valid <filename>xml/index.xml</filename>
      </userinput>
    </screen>
    </sect3>

    <sect3 id="docbook.rules">
      <title>Generating the DocBook Files</title>

    <para>
    The following Makefile rules generate (in order): an HTML
    version of all the DocBook documentation, a PDF version of the same, a
    single XML document, and the result of validating the entire XML
    document.
    </para>

    <para>
    Generated files are output into separate subdirectories of
    <filename class="directory">doc/docbook/</filename> in the
    build directory, based on the output format. For instance, the
    HTML docs will be in <filename
    class="directory">doc/docbook/html</filename>.
    </para>

    <para>
      <screen><userinput>make doc-html-docbook</userinput></screen>
    </para>

    <para>
      <screen><userinput>make doc-pdf-docbook</userinput></screen>
    </para>

    <para>
      <screen><userinput>make doc-xml-single-docbook</userinput></screen>
    </para>

    <para>
      <screen><userinput>make doc-xml-validate-docbook</userinput></screen>
    </para>

    </sect3>

    <sect3 id="docbook.examples">
      <title>File Organization and Basics</title>

    <literallayout>
      <emphasis>Which files are important</emphasis>

      All Docbook files are in the directory
      libstdc++-v3/doc/xml

      Inside this directory, the files of importance:
      spine.xml         - index to documentation set
      manual/spine.xml  - index to manual
      manual/*.xml      - individual chapters and sections of the manual
      faq.xml           - index to FAQ
      api.xml           - index to source level / API

      All *.txml files are template xml files, i.e., otherwise empty files with
      the correct structure, suitable for filling in with
new information. 1304 1305 <emphasis>Canonical Writing Style</emphasis> 1306 1307 class template 1308 function template 1309 member function template 1310 (via C++ Templates, Vandevoorde) 1311 1312 class in namespace std: allocator, not std::allocator 1313 1314 header file: iostream, not <iostream> 1315 1316 1317 <emphasis>General structure</emphasis> 1318 1319 <set> 1320 <book> 1321 </book> 1322 1323 <book> 1324 <chapter> 1325 </chapter> 1326 </book> 1327 1328 <book> 1329 <part> 1330 <chapter> 1331 <section> 1332 </section> 1333 1334 <sect1> 1335 </sect1> 1336 1337 <sect1> 1338 <sect2> 1339 </sect2> 1340 </sect1> 1341 </chapter> 1342 1343 <chapter> 1344 </chapter> 1345 </part> 1346 </book> 1347 1348 </set> 1349 </literallayout> 1350 </sect3> 1351 1352 <sect3 id="docbook.markup"> 1353 <title>Markup By Example</title> 1354 1355 <para> 1356 Complete details on Docbook markup can be found in the DocBook 1357 Element Reference, 1358 <ulink url="http://www.docbook.org/tdg/en/html/part2.html">online</ulink>. 1359 An incomplete reference for HTML to Docbook conversion is 1360 detailed in the table below. 
    </para>

<table frame='all'>
<title>HTML to Docbook XML Markup Comparison</title>
<tgroup cols='2' align='left' colsep='1' rowsep='1'>
<colspec colname='c1'></colspec>
<colspec colname='c2'></colspec>

  <thead>
    <row>
      <entry>HTML</entry>
      <entry>Docbook</entry>
    </row>
  </thead>

  <tbody>
    <row>
      <entry>&lt;p&gt;</entry>
      <entry>&lt;para&gt;</entry>
    </row>
    <row>
      <entry>&lt;pre&gt;</entry>
      <entry>&lt;computeroutput&gt;, &lt;programlisting&gt;,
        &lt;literallayout&gt;</entry>
    </row>
    <row>
      <entry>&lt;ul&gt;</entry>
      <entry>&lt;itemizedlist&gt;</entry>
    </row>
    <row>
      <entry>&lt;ol&gt;</entry>
      <entry>&lt;orderedlist&gt;</entry>
    </row>
    <row>
      <entry>&lt;li&gt;</entry>
      <entry>&lt;listitem&gt;</entry>
    </row>
    <row>
      <entry>&lt;dl&gt;</entry>
      <entry>&lt;variablelist&gt;</entry>
    </row>
    <row>
      <entry>&lt;dt&gt;</entry>
      <entry>&lt;term&gt;</entry>
    </row>
    <row>
      <entry>&lt;dd&gt;</entry>
      <entry>&lt;listitem&gt;</entry>
    </row>

    <row>
      <entry>&lt;a href=""&gt;</entry>
      <entry>&lt;ulink url=""&gt;</entry>
    </row>
    <row>
      <entry>&lt;code&gt;</entry>
      <entry>&lt;literal&gt;, &lt;programlisting&gt;</entry>
    </row>
    <row>
      <entry>&lt;strong&gt;</entry>
      <entry>&lt;emphasis&gt;</entry>
    </row>
    <row>
      <entry>&lt;em&gt;</entry>
      <entry>&lt;emphasis&gt;</entry>
    </row>
    <row>
      <entry>"</entry>
      <entry>&lt;quote&gt;</entry>
    </row>
  </tbody>
</tgroup>
</table>

<para>
  Examples of detailed markup for which there are no real HTML
  equivalents are listed in the table below.
1438</para> 1439 1440<table frame='all'> 1441<title>Docbook XML Element Use</title> 1442<tgroup cols='2' align='left' colsep='1' rowsep='1'> 1443<colspec colname='c1'></colspec> 1444<colspec colname='c2'></colspec> 1445 1446 <thead> 1447 <row> 1448 <entry>Element</entry> 1449 <entry>Use</entry> 1450 </row> 1451 </thead> 1452 1453 <tbody> 1454 <row> 1455 <entry><structname></entry> 1456 <entry><structname>char_traits</structname></entry> 1457 </row> 1458 <row> 1459 <entry><classname></entry> 1460 <entry><classname>string</classname></entry> 1461 </row> 1462 <row> 1463 <entry><function></entry> 1464 <entry> 1465 <para><function>clear()</function></para> 1466 <para><function>fs.clear()</function></para> 1467 </entry> 1468 </row> 1469 <row> 1470 <entry><type></entry> 1471 <entry><type>long long</type></entry> 1472 </row> 1473 <row> 1474 <entry><varname></entry> 1475 <entry><varname>fs</varname></entry> 1476 </row> 1477 <row> 1478 <entry><literal></entry> 1479 <entry> 1480 <para><literal>-Weffc++</literal></para> 1481 <para><literal>rel_ops</literal></para> 1482 </entry> 1483 </row> 1484 <row> 1485 <entry><constant></entry> 1486 <entry> 1487 <para><constant>_GNU_SOURCE</constant></para> 1488 <para><constant>3.0</constant></para> 1489 </entry> 1490 </row> 1491 <row> 1492 <entry><command></entry> 1493 <entry><command>g++</command></entry> 1494 </row> 1495 <row> 1496 <entry><errortext></entry> 1497 <entry><errortext>In instantiation of</errortext></entry> 1498 </row> 1499 <row> 1500 <entry><filename></entry> 1501 <entry> 1502 <para><filename class="headerfile">ctype.h</filename></para> 1503 <para><filename class="directory">/home/gcc/build</filename></para> 1504 <para><filename class="libraryfile">libstdc++.so</filename></para> 1505 </entry> 1506 </row> 1507 </tbody> 1508</tgroup> 1509</table> 1510 1511 </sect3> 1512 </sect2> 1513 1514 <sect2 id="doc_style.combines"> 1515 <title>Combines</title> 1516 1517 <sect3 id="combines.rules"> 1518 <title>Generating Combines and 
Assemblages</title> 1519 1520 <para> 1521 The following Makefile rules are defaults, and are usually 1522 aliased to more detailed rules. They are shortcuts for 1523 generating HTML, PDF, Texinfo, XML, or man files and then collecting 1524 the generated files into the build directory's doc directory. 1525 </para> 1526 1527<variablelist> 1528 1529<varlistentry><term> 1530 <emphasis>make doc-html</emphasis> 1531 </term> 1532<listitem> 1533 <para> 1534 Generates multi-page HTML documentation in the following directories: 1535 </para> 1536 <para> 1537 <filename class="directory">doc/libstdc++-api.html</filename> 1538 </para> 1539 <para> 1540 <filename class="directory">doc/libstdc++-manual.html</filename> 1541 </para> 1542</listitem> 1543</varlistentry> 1544 1545<varlistentry><term> 1546 <emphasis>make doc-man</emphasis> 1547 </term> 1548<listitem> 1549 <para> 1550 Generates man pages in the following directory: 1551 </para> 1552 <para> 1553 <filename class="directory">doc/libstdc++-api.man</filename> 1554 </para> 1555</listitem> 1556</varlistentry> 1557 1558<varlistentry><term> 1559 <emphasis>make doc-pdf</emphasis> 1560 </term> 1561<listitem> 1562 <para> 1563 Generates indexed PDF documentation in the following files: 1564 </para> 1565 <para> 1566 <filename>doc/libstdc++-api.pdf</filename> 1567 </para> 1568 <para> 1569 <filename>doc/libstdc++-manual.pdf</filename> 1570 </para> 1571</listitem> 1572</varlistentry> 1573 1574<varlistentry><term> 1575 <emphasis>make doc-texinfo</emphasis> 1576 </term> 1577<listitem> 1578 <para> 1579 Generates Texinfo documentation in the following files: 1580 </para> 1581 <para> 1582 <filename>doc/libstdc++-manual.texinfo</filename> 1583 </para> 1584</listitem> 1585</varlistentry> 1586 1587<varlistentry><term> 1588 <emphasis>make doc-xml</emphasis> 1589 </term> 1590<listitem> 1591 <para> 1592 Generates single-file XML documentation in the following files: 1593 </para> 1594 <para> 1595 <filename>doc/libstdc++-api.xml</filename> 1596 
    </para>
    <para>
      <filename>doc/libstdc++-manual.xml</filename>
    </para>
</listitem>
</varlistentry>

</variablelist>


    </sect3>
  </sect2>
</sect1>

<sect1 id="contrib.design_notes" xreflabel="Design Notes">
  <?dbhtml filename="source_design_notes.html"?>
  <title>Design Notes</title>
  <para>
  </para>

  <literallayout>

    The Library
    -----------

    This paper covers two major areas:

    - Features and policies not mentioned in the standard that
      the quality of the library implementation depends on, including
      extensions and "implementation-defined" features;

    - Plans for required but unimplemented library features and
      optimizations to them.

    Overhead
    --------

    The standard defines a large library, much larger than the standard
    C library. A naive implementation would suffer substantial overhead
    in compile time, executable size, and speed, rendering it unusable
    in many (particularly embedded) applications. The alternative demands
    care in construction, and some compiler support, but there is no
    need for library subsets.

    What are the sources of this overhead?  There are four main causes:

    - The library is specified almost entirely as templates, which
      with current compilers must be included in-line, resulting in
      very slow builds as tens or hundreds of thousands of lines
      of function definitions are read for each user source file.
      Indeed, the entire SGI STL, as well as the dos Reis valarray,
      are provided purely as header files, largely for simplicity in
      porting. Iostream/locale is (or will be) as large again.

    - The library is very flexible, specifying a multitude of hooks
      where users can insert their own code in place of defaults.
      When these hooks are not used, any time and code expended to
      support that flexibility is wasted.

    - Templates are often described as causing "code bloat". In
      practice, this refers (when it refers to anything real) to several
      independent processes. First, when a class template is manually
      instantiated in its entirety, current compilers place the definitions
      for all members in a single object file, so that a program linking
      to one member gets definitions of all. Second, template functions
      which do not actually depend on the template argument are, under
      current compilers, generated anew for each instantiation, rather
      than being shared with other instantiations. Third, some of the
      flexibility mentioned above comes from virtual functions (both in
      regular classes and template classes) which current linkers add
      to the executable file even when they manifestly cannot be called.

    - The library is specified to use a language feature, exceptions,
      which in the current gcc compiler ABI imposes a run time and
      code space cost to handle the possibility of exceptions even when
      they are not used. Under the new ABI (accessed with -fnew-abi),
      there is a space overhead and a small reduction in code efficiency
      resulting from lost optimization opportunities associated with
      non-local branches associated with exceptions.

    What can be done to eliminate this overhead?  A variety of coding
    techniques, and compiler, linker and library improvements and
    extensions may be used, as covered below. Most are not difficult,
    and some are already implemented in varying degrees.

    Overhead: Compilation Time
    --------------------------

    Providing "ready-instantiated" template code in object code archives
    allows us to avoid generating and optimizing template instantiations
    in each compilation unit which uses them.
However, the number of such 1687 instantiations that are useful to provide is limited, and anyway this 1688 is not enough, by itself, to minimize compilation time. In particular, 1689 it does not reduce time spent parsing conforming headers. 1690 1691 Quicker header parsing will depend on library extensions and compiler 1692 improvements. One approach is some variation on the techniques 1693 previously marketed as "pre-compiled headers", now standardized as 1694 support for the "export" keyword. "Exported" template definitions 1695 can be placed (once) in a "repository" -- really just a library, but 1696 of template definitions rather than object code -- to be drawn upon 1697 at link time when an instantiation is needed, rather than placed in 1698 header files to be parsed along with every compilation unit. 1699 1700 Until "export" is implemented we can put some of the lengthy template 1701 definitions in #if guards or alternative headers so that users can skip 1702 over the full definitions when they need only the ready-instantiated 1703 specializations. 1704 1705 To be precise, this means that certain headers which define 1706 templates which users normally use only for certain arguments 1707 can be instrumented to avoid exposing the template definitions 1708 to the compiler unless a macro is defined. For example, in 1709 <string>, we might have: 1710 1711 template <class _CharT, ... > class basic_string { 1712 ... // member declarations 1713 }; 1714 ... // operator declarations 1715 1716 #ifdef _STRICT_ISO_ 1717 # if _G_NO_TEMPLATE_EXPORT 1718 # include <bits/std_locale.h> // headers needed by definitions 1719 # ... 1720 # include <bits/string.tcc> // member and global template definitions. 1721 # endif 1722 #endif 1723 1724 Users who compile without specifying a strict-ISO-conforming flag 1725 would not see many of the template definitions they now see, and rely 1726 instead on ready-instantiated specializations in the library. 
This 1727 technique would be useful for the following substantial components: 1728 string, locale/iostreams, valarray. It would *not* be useful or 1729 usable with the following: containers, algorithms, iterators, 1730 allocator. Since these constitute a large (though decreasing) 1731 fraction of the library, the benefit the technique offers is 1732 limited. 1733 1734 The language specifies the semantics of the "export" keyword, but 1735 the gcc compiler does not yet support it. When it does, problems 1736 with large template inclusions can largely disappear, given some 1737 minor library reorganization, along with the need for the apparatus 1738 described above. 1739 1740 Overhead: Flexibility Cost 1741 -------------------------- 1742 1743 The library offers many places where users can specify operations 1744 to be performed by the library in place of defaults. Sometimes 1745 this seems to require that the library use a more-roundabout, and 1746 possibly slower, way to accomplish the default requirements than 1747 would be used otherwise. 1748 1749 The primary protection against this overhead is thorough compiler 1750 optimization, to crush out layers of inline function interfaces. 1751 Kuck & Associates has demonstrated the practicality of this kind 1752 of optimization. 1753 1754 The second line of defense against this overhead is explicit 1755 specialization. By defining helper function templates, and writing 1756 specialized code for the default case, overhead can be eliminated 1757 for that case without sacrificing flexibility. This takes full 1758 advantage of any ability of the optimizer to crush out degenerate 1759 code. 1760 1761 The library specifies many virtual functions which current linkers 1762 load even when they cannot be called. Some minor improvements to the 1763 compiler and to ld would eliminate any such overhead by simply 1764 omitting virtual functions that the complete program does not call. 
1765 A prototype of this work has already been done. For targets where 1766 GNU ld is not used, a "pre-linker" could do the same job. 1767 1768 The main areas in the standard interface where user flexibility 1769 can result in overhead are: 1770 1771 - Allocators: Containers are specified to use user-definable 1772 allocator types and objects, making tuning for the container 1773 characteristics tricky. 1774 1775 - Locales: the standard specifies locale objects used to implement 1776 iostream operations, involving many virtual functions which use 1777 streambuf iterators. 1778 1779 - Algorithms and containers: these may be instantiated on any type, 1780 frequently duplicating code for identical operations. 1781 1782 - Iostreams and strings: users are permitted to use these on their 1783 own types, and specify the operations the stream must use on these 1784 types. 1785 1786 Note that these sources of overhead are _avoidable_. The techniques 1787 to avoid them are covered below. 1788 1789 Code Bloat 1790 ---------- 1791 1792 In the SGI STL, and in some other headers, many of the templates 1793 are defined "inline" -- either explicitly or by their placement 1794 in class definitions -- which should not be inline. This is a 1795 source of code bloat. Matt had remarked that he was relying on 1796 the compiler to recognize what was too big to benefit from inlining, 1797 and generate it out-of-line automatically. However, this also can 1798 result in code bloat except where the linker can eliminate the extra 1799 copies. 1800 1801 Fixing these cases will require an audit of all inline functions 1802 defined in the library to determine which merit inlining, and moving 1803 the rest out of line. This is an issue mainly in chapters 23, 25, and 1804 27. Of course it can be done incrementally, and we should generally 1805 accept patches that move large functions out of line and into ".tcc" 1806 files, which can later be pulled into a repository. 
Compiler/linker 1807 improvements to recognize very large inline functions and move them 1808 out-of-line, but shared among compilation units, could make this 1809 work unnecessary. 1810 1811 Pre-instantiating template specializations currently produces large 1812 amounts of dead code which bloats statically linked programs. The 1813 current state of the static library, libstdc++.a, is intolerable on 1814 this account, and will fuel further confused speculation about a need 1815 for a library "subset". A compiler improvement that treats each 1816 instantiated function as a separate object file, for linking purposes, 1817 would be one solution to this problem. An alternative would be to 1818 split up the manual instantiation files into dozens upon dozens of 1819 little files, each compiled separately, but an abortive attempt at 1820 this was done for <string> and, though it is far from complete, it 1821 is already a nuisance. A better interim solution (just until we have 1822 "export") is badly needed. 1823 1824 When building a shared library, the current compiler/linker cannot 1825 automatically generate the instantiations needed. This creates a 1826 miserable situation; it means any time something is changed in the 1827 library, before a shared library can be built someone must manually 1828 copy the declarations of all templates that are needed by other parts 1829 of the library to an "instantiation" file, and add it to the build 1830 system to be compiled and linked to the library. This process is 1831 readily automated, and should be automated as soon as possible. 1832 Users building their own shared libraries experience identical 1833 frustrations. 1834 1835 Sharing common aspects of template definitions among instantiations 1836 can radically reduce code bloat. 
The compiler could help a great 1837 deal here by recognizing when a function depends on nothing about 1838 a template parameter, or only on its size, and giving the resulting 1839 function a link-name "equate" that allows it to be shared with other 1840 instantiations. Implementation code could take advantage of the 1841 capability by factoring out code that does not depend on the template 1842 argument into separate functions to be merged by the compiler. 1843 1844 Until such a compiler optimization is implemented, much can be done 1845 manually (if tediously) in this direction. One such optimization is 1846 to derive class templates from non-template classes, and move as much 1847 implementation as possible into the base class. Another is to partial- 1848 specialize certain common instantiations, such as vector<T*>, to share 1849 code for instantiations on all types T. While these techniques work, 1850 they are far from the complete solution that a compiler improvement 1851 would afford. 1852 1853 Overhead: Expensive Language Features 1854 ------------------------------------- 1855 1856 The main "expensive" language feature used in the standard library 1857 is exception support, which requires compiling in cleanup code with 1858 static table data to locate it, and linking in library code to use 1859 the table. For small embedded programs the amount of such library 1860 code and table data is assumed by some to be excessive. Under the 1861 "new" ABI this perception is generally exaggerated, although in some 1862 cases it may actually be excessive. 1863 1864 To implement a library which does not use exceptions directly is 1865 not difficult given minor compiler support (to "turn off" exceptions 1866 and ignore exception constructs), and results in no great library 1867 maintenance difficulties. To be precise, given "-fno-exceptions", 1868 the compiler should treat "try" blocks as ordinary blocks, and 1869 "catch" blocks as dead code to ignore or eliminate. 
Compiler 1870 support is not strictly necessary, except in the case of "function 1871 try blocks"; otherwise the following macros almost suffice: 1872 1873 #define throw(X) 1874 #define try if (true) 1875 #define catch(X) else if (false) 1876 1877 However, there may be a need to use function try blocks in the 1878 library implementation, and use of macros in this way can make 1879 correct diagnostics impossible. Furthermore, use of this scheme 1880 would require the library to call a function to re-throw exceptions 1881 from a try block. Implementing the above semantics in the compiler 1882 is preferable. 1883 1884 Given the support above (however implemented) it only remains to 1885 replace code that "throws" with a call to a well-documented "handler" 1886 function in a separate compilation unit which may be replaced by 1887 the user. The main source of exceptions that would be difficult 1888 for users to avoid is memory allocation failures, but users can 1889 define their own memory allocation primitives that never throw. 1890 Otherwise, the complete list of such handlers, and which library 1891 functions may call them, would be needed for users to be able to 1892 implement the necessary substitutes. (Fortunately, they have the 1893 source code.) 1894 1895 Opportunities 1896 ------------- 1897 1898 The template capabilities of C++ offer enormous opportunities for 1899 optimizing common library operations, well beyond what would be 1900 considered "eliminating overhead". In particular, many operations 1901 done in Glibc with macros that depend on proprietary language 1902 extensions can be implemented in pristine Standard C++. For example, 1903 the chapter 25 algorithms, and even C library functions such as strchr, 1904 can be specialized for the case of static arrays of known (small) size. 1905 1906 Detailed optimization opportunities are identified below where 1907 the component where they would appear is discussed. 
Of course new 1908 opportunities will be identified during implementation. 1909 1910 Unimplemented Required Library Features 1911 --------------------------------------- 1912 1913 The standard specifies hundreds of components, grouped broadly by 1914 chapter. These are listed in excruciating detail in the CHECKLIST 1915 file. 1916 1917 17 general 1918 18 support 1919 19 diagnostics 1920 20 utilities 1921 21 string 1922 22 locale 1923 23 containers 1924 24 iterators 1925 25 algorithms 1926 26 numerics 1927 27 iostreams 1928 Annex D backward compatibility 1929 1930 Anyone participating in implementation of the library should obtain 1931 a copy of the standard, ISO 14882. People in the U.S. can obtain an 1932 electronic copy for US$18 from ANSI's web site. Those from other 1933 countries should visit http://www.iso.org/ to find out the location 1934 of their country's representation in ISO, in order to know who can 1935 sell them a copy. 1936 1937 The emphasis in the following sections is on unimplemented features 1938 and optimization opportunities. 1939 1940 Chapter 17 General 1941 ------------------- 1942 1943 Chapter 17 concerns overall library requirements. 1944 1945 The standard doesn't mention threads. A multi-thread (MT) extension 1946 primarily affects operators new and delete (18), allocator (20), 1947 string (21), locale (22), and iostreams (27). The common underlying 1948 support needed for this is discussed under chapter 20. 1949 1950 The standard requirements on names from the C headers create a 1951 lot of work, mostly done. Names in the C headers must be visible 1952 in the std:: and sometimes the global namespace; the names in the 1953 two scopes must refer to the same object. More stringent is that 1954 Koenig lookup implies that any types specified as defined in std:: 1955 really are defined in std::. Names optionally implemented as 1956 macros in C cannot be macros in C++. (An overview may be read at 1957 <http://www.cantrip.org/cheaders.html>). 
The scripts "inclosure" 1958 and "mkcshadow", and the directories shadow/ and cshadow/, are the 1959 beginning of an effort to conform in this area. 1960 1961 A correct conforming definition of C header names based on underlying 1962 C library headers, and practical linking of conforming namespaced 1963 customer code with third-party C libraries depends ultimately on 1964 an ABI change, allowing namespaced C type names to be mangled into 1965 type names as if they were global, somewhat as C function names in a 1966 namespace, or C++ global variable names, are left unmangled. Perhaps 1967 another "extern" mode, such as 'extern "C-global"' would be an 1968 appropriate place for such type definitions. Such a type would 1969 affect mangling as follows: 1970 1971 namespace A { 1972 struct X {}; 1973 extern "C-global" { // or maybe just 'extern "C"' 1974 struct Y {}; 1975 }; 1976 } 1977 void f(A::X*); // mangles to f__FPQ21A1X 1978 void f(A::Y*); // mangles to f__FP1Y 1979 1980 (It may be that this is really the appropriate semantics for regular 1981 'extern "C"', and 'extern "C-global"', as an extension, would not be 1982 necessary.) This would allow functions declared in non-standard C headers 1983 (and thus fixable by neither us nor users) to link properly with functions 1984 declared using C types defined in properly-namespaced headers. The 1985 problem this solves is that C headers (which C++ programmers do persist 1986 in using) frequently forward-declare C struct tags without including 1987 the header where the type is defined, as in 1988 1989 struct tm; 1990 void munge(tm*); 1991 1992 Without some compiler accommodation, munge cannot be called by correct 1993 C++ code using a pointer to a correctly-scoped tm* value. 1994 1995 The current C headers use the preprocessor extension "#include_next", 1996 which the compiler complains about when run "-pedantic". 1997 (Incidentally, it appears that "-fpedantic" is currently ignored, 1998 probably a bug.) 
The solution in the C compiler is to use 1999 "-isystem" rather than "-I", but unfortunately in g++ this seems 2000 also to wrap the whole header in an 'extern "C"' block, so it's 2001 unusable for C++ headers. The correct solution appears to be to 2002 allow the various special include-directory options, if not given 2003 an argument, to affect subsequent include-directory options additively, 2004 so that if one said 2005 2006 -pedantic -iprefix $(prefix) \ 2007 -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \ 2008 -iwithprefix -I g++-v3/ext 2009 2010 the compiler would search $(prefix)/g++-v3 and not report 2011 pedantic warnings for files found there, but treat files in 2012 $(prefix)/g++-v3/ext pedantically. (The undocumented semantics 2013 of "-isystem" in g++ stink. Can they be rescinded? If not it 2014 must be replaced with something more rationally behaved.) 2015 2016 All the C headers need the treatment above; in the standard these 2017 headers are mentioned in various chapters. Below, I have only 2018 mentioned those that present interesting implementation issues. 2019 2020 The components identified as "mostly complete", below, have not been 2021 audited for conformance. In many cases where the library passes 2022 conformance tests we have non-conforming extensions that must be 2023 wrapped in #if guards for "pedantic" use, and in some cases renamed 2024 in a conforming way for continued use in the implementation regardless 2025 of conformance flags. 2026 2027 The STL portion of the library still depends on a header 2028 stl/bits/stl_config.h full of #ifdef clauses. This apparatus 2029 should be replaced with autoconf/automake machinery. 2030 2031 The SGI STL defines a type_traits<> template, specialized for 2032 many types in their code including the built-in numeric and 2033 pointer types and some library types, to direct optimizations of 2034 standard functions. 
The SGI compiler has been extended to generate 2035 specializations of this template automatically for user types, 2036 so that use of STL templates on user types can take advantage of 2037 these optimizations. Specializations for other, non-STL, types 2038 would make more optimizations possible, but extending the gcc 2039 compiler in the same way would be much better. Probably the next 2040 round of standardization will ratify this, but probably with 2041 changes, so it probably should be renamed to place it in the 2042 implementation namespace. 2043 2044 The SGI STL also defines a large number of extensions visible in 2045 standard headers. (Other extensions that appear in separate headers 2046 have been sequestered in subdirectories ext/ and backward/.) All 2047 these extensions should be moved to other headers where possible, 2048 and in any case wrapped in a namespace (not std!), and (where kept 2049 in a standard header) girded about with macro guards. Some cannot be 2050 moved out of standard headers because they are used to implement 2051 standard features. The canonical method for accommodating these 2052 is to use a protected name, aliased in macro guards to a user-space 2053 name. Unfortunately C++ offers no satisfactory template typedef 2054 mechanism, so very ad-hoc and unsatisfactory aliasing must be used 2055 instead. 2056 2057 Implementation of a template typedef mechanism should have the highest 2058 priority among possible extensions, on the same level as implementation 2059 of the template "export" feature. 2060 2061 Chapter 18 Language support 2062 ---------------------------- 2063 2064 Headers: <limits> <new> <typeinfo> <exception> 2065 C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp> 2066 <ctime> <csignal> <cstdlib> (also 21, 25, 26) 2067 2068 This defines the built-in exceptions, rtti, numeric_limits<>, 2069 operator new and delete. Much of this is provided by the 2070 compiler in its static runtime library. 
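
    As a small illustration of the numeric_limits<> component named
    above, the following sketch checks that the specializations for int
    and bool report the values one would expect, the former against the
    macros in the C header. The two helper functions are hypothetical
    names invented for this example; they are not part of the library.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper, for illustration only: true when the
// numeric_limits<int> specialization agrees with the <climits> macros.
bool int_limits_match_climits()
{
  return std::numeric_limits<int>::is_specialized
      && std::numeric_limits<int>::min() == INT_MIN
      && std::numeric_limits<int>::max() == INT_MAX;
}

// Hypothetical helper, for illustration only: bool has no min/max
// macros in <climits>, so its limits are checked against the literals.
bool bool_limits_are_sane()
{
  return std::numeric_limits<bool>::is_specialized
      && std::numeric_limits<bool>::min() == false
      && std::numeric_limits<bool>::max() == true;
}
```

    A check of this kind could serve as a quick sanity test when a
    specialization is written by hand for a new target.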

Work to do includes defining numeric_limits<> specializations in
separate files for all target architectures. Values for integer types
except for bool and wchar_t are readily obtained from the C header
<limits.h>, but values for the remaining numeric types (bool, wchar_t,
float, double, long double) must be entered manually. This is
largely dog work except for those members whose values are not
easily deduced from available documentation. Also, this involves
some work in target configuration to identify the correct choice of
file to build against and to install.

The definitions of the various operators new and delete must be
made thread-safe, which depends on a portable exclusion mechanism,
discussed under chapter 20. Of course there is always plenty of
room for improvements to the speed of operators new and delete.

<cstdarg>, in Glibc, defines some macros that gcc does not allow to
be wrapped into an inline function. Probably this header will demand
attention whenever a new target is chosen. The functions atexit(),
exit(), and abort() in <cstdlib> have different semantics in C++, so
must be re-implemented for C++.

Chapter 19 Diagnostics
-----------------------

Headers: <stdexcept>
C headers: <cassert> <cerrno>

This defines the standard exception objects, which are "mostly complete".
Cygnus has a version, and now SGI provides a slightly different one.
It makes little difference which we use.

The C global name "errno", which C allows to be a variable or a macro,
is required in C++ to be a macro. For MT it must typically result in
a function call.

Chapter 20 Utilities
---------------------
Headers: <utility> <functional> <memory>
C header: <ctime> (also in 18)

SGI STL provides "mostly complete" versions of all the components
defined in this chapter.
However, the auto_ptr<> implementation
is known to be wrong. Furthermore, the standard definition of it
is known to be unimplementable as written. A minor change to the
standard would fix it, and auto_ptr<> should be adjusted to match.

Multi-threading affects the allocator implementation, and there must
be configuration/installation choices for different users' MT
requirements. Anyway, users will want to tune allocator options
to support different target conditions, MT or no.

The primitives used for MT implementation should be exposed, as an
extension, for users' own work. We need cross-CPU "mutex" support,
multi-processor shared-memory atomic integer operations, and
single-processor uninterruptible integer operations, and all three
configurable to be stubbed out for non-MT use, or to use an
appropriately-loaded dynamic library for the actual runtime
environment, or statically compiled in for cases where the target
architecture is known.

Chapter 21 String
------------------
Headers: <string>
C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
           <cstdlib> (also in 18, 25, 26)

We have "mostly-complete" char_traits<> implementations. Many of the
char_traits<char> operations might be optimized further using existing
proprietary language extensions.

We have a "mostly-complete" basic_string<> implementation. The work
to manually instantiate char and wchar_t specializations in object
files to improve link-time behavior is extremely unsatisfactory,
literally tripling library-build time with no commensurate improvement
in static program link sizes. It must be redone. (Similar work is
needed for some components in chapters 22 and 27.)

Other work needed for strings is MT-safety, as discussed under the
chapter 20 heading.
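
As a rough illustration of the manual-instantiation technique mentioned
above for basic_string<>, the sketch below explicitly instantiates a
small class template for char and wchar_t. The template tiny_buf is
hypothetical and merely stands in for a real library template; the
"extern template" declaration is the form later standardized in C++11,
assumed here for convenience. This shows only the mechanism, not a
remedy for the build-time cost noted above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a library template such as basic_string<>.
template <typename CharT>
class tiny_buf {
public:
  explicit tiny_buf(const CharT* s) : data_(s), len_(0) {
    assert(s != 0);  // precondition: a null-terminated sequence
    while (data_[len_] != CharT()) ++len_;
  }
  std::size_t size() const { return len_; }
  CharT front() const { return data_[0]; }
private:
  const CharT* data_;
  std::size_t len_;
};

// Would live in the header: asks client code not to instantiate
// these specializations implicitly.
extern template class tiny_buf<char>;
extern template class tiny_buf<wchar_t>;

// Would live in exactly one library source file: emits the
// instantiations once, so they end up in the library's object file.
template class tiny_buf<char>;
template class tiny_buf<wchar_t>;
```

In a real library the two groups of declarations would sit in
different files; they are shown together here only so that the sketch
compiles as a single translation unit.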

The standard C type mbstate_t from <cwchar> and used in char_traits<>
must be different in C++ than in C, because in C++ the default constructor
value mbstate_t() must be the "base" or "ground" sequence state.
(According to the likely resolution of a recently raised Core issue,
this may become unnecessary. However, there are other reasons to
use a state type not as limited as whatever the C library provides.)
If we might want to provide conversions from (e.g.)
internally-represented EUC-wide to externally-represented Unicode, or
vice versa, the mbstate_t we choose will need to be more accommodating
than what might be provided by an underlying C library.

There remain some basic_string template-member functions which do
not overload properly with their non-template brethren. The infamous
hack akin to what was done in vector<> is needed, to conform to
23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
or incomplete, are so marked for this reason.

Replacing the string iterators, which currently are simple character
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked. The current use of
raw pointers as string iterators is evil. vector<> iterators need the
same treatment. Note that the current implementation freely mixes
pointers and iterators, and that must be fixed before safer iterators
can be introduced.

Some of the functions in <cstring> differ from the C versions,
generally being overloaded on const and non-const argument pointers.
For example, in <cstring> strchr is overloaded. The functions isupper
etc. in <cctype>, typically implemented as macros in C, are functions
in C++, because they are overloaded with others of the same name
defined in <locale>.
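
The const/non-const overloading described above for strchr can be
sketched as follows. The namespace "sketch" is hypothetical and is used
only to avoid colliding with the real <cstring> declarations; the point
is that C++ splits the single C signature into two overloads so that
the constness of the argument carries over to the result.

```cpp
#include <cassert>
#include <string.h>

namespace sketch {  // hypothetical namespace, not part of the library

// A const argument yields a const result ...
inline const char* strchr(const char* s, int c) {
  assert(s != 0);  // precondition, as for the C function
  return ::strchr(s, c);
}

// ... and a non-const argument yields a result that may be written to.
inline char* strchr(char* s, int c) {
  assert(s != 0);
  return ::strchr(s, c);
}

}  // namespace sketch
```

With only the C signature, a pointer into a const string could be
obtained as a plain char* and then written through; the overload pair
closes that hole without penalizing callers that own a mutable buffer.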

Many of the functions required in <cwctype> and <cwchar> cannot be
implemented using underlying C facilities on intended targets because
such facilities only partly exist.

Chapter 22 Locale
------------------
Headers: <locale>
C headers: <clocale>

We have a "mostly complete" class locale, with the exception of
code for constructing, and handling the names of, named locales.
The ways that locales are named (particularly when categories
such as LC_TIME and LC_COLLATE differ) vary among all target
environments. This code must be written in various versions and
chosen by configuration parameters.

Members of many of the facets defined in <locale> are stubs. Generally,
there are two sets of facets: the base class facets (which are supposed
to implement the "C" locale) and the "byname" facets, which are supposed
to read files to determine their behavior. The base ctype<>, collate<>,
and numpunct<> facets are "mostly complete", except that the table of
bitmask values used for "is" operations, and corresponding mask values,
are still defined in libio and just included/linked. (We will need to
implement these tables independently, soon, but should take advantage
of libio where possible.) The num_put<>::put members for integer types
are "mostly complete".

A complete list of what has and has not been implemented may be
found in CHECKLIST. However, note that the current definition of
codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
out the raw bytes representing the wide characters, rather than
trying to convert each to a corresponding single "char" value.

Some of the facets are more important than others.
Specifically,
the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
are used by other library facilities defined in <string>, <istream>,
and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
in <fstream>, so a conforming iostream implementation depends on
these.

The "long long" type eventually must be supported, but code mentioning
it should be wrapped in #if guards to allow pedantic-mode compiling.

Performance of num_put<> and num_get<> depends critically on
caching computed values in ios_base objects, and on extensions
to the interface with streambufs.

Specifically: retrieving a copy of the locale object, extracting
the needed facets, and gathering data from them, for each call to
(e.g.) operator<< would be prohibitively slow. To cache format
data for use by num_put<> and num_get<> we have a _Format_cache<>
object stored in the ios_base::pword() array. This is constructed
and initialized lazily, and is organized purely for utility. It
is discarded when a new locale with different facets is imbued.

Using only the public interfaces of the iterator arguments to the
facet functions would limit performance by forbidding "vector-style"
character operations. The streambuf iterator optimizations are
described under chapter 24, but facets can also bypass the streambuf
iterators via explicit specializations and operate directly on the
streambufs, and use extended interfaces to get direct access to the
streambuf internal buffer arrays. These extensions are mentioned
under chapter 27. These optimizations are particularly important
for input parsing.

Unused virtual members of locale facets can be omitted, as mentioned
above, by a smart linker.
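
The caching scheme described above (a lazily built cache kept in the
ios_base::pword() array and dropped when a new locale is imbued) might
look roughly like the following sketch. The names format_cache,
cache_slot, cache_event, get_format_cache, and comma_punct are all
hypothetical and do not correspond to the library's internal
_Format_cache<>; only standard ios_base facilities (xalloc, pword,
iword, register_callback) are assumed. For simplicity the sketch drops
the cache on every imbue rather than checking whether the facets
actually changed.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical cache of data gathered from the numpunct<char> facet.
struct format_cache {
  char decimal_point;
  char thousands_sep;
};

// One iword/pword slot shared by all streams, allocated on first use.
inline int cache_slot() {
  static const int slot = std::ios_base::xalloc();
  return slot;
}

// Discard the cache when the locale changes or the stream goes away.
inline void cache_event(std::ios_base::event ev, std::ios_base& ios,
                        int slot) {
  if (ev == std::ios_base::copyfmt_event) {
    ios.pword(slot) = 0;  // copied pointer still belongs to the source
  } else {                // imbue_event or erase_event
    delete static_cast<format_cache*>(ios.pword(slot));
    ios.pword(slot) = 0;
  }
}

// Build the cache lazily, so facet lookups happen once per locale.
inline const format_cache& get_format_cache(std::ios_base& ios) {
  const int slot = cache_slot();
  if (ios.iword(slot) == 0) {  // callback not yet registered here
    ios.register_callback(&cache_event, slot);
    ios.iword(slot) = 1;
  }
  if (ios.pword(slot) == 0) {
    const std::numpunct<char>& np =
        std::use_facet<std::numpunct<char> >(ios.getloc());
    format_cache* c = new format_cache;
    c->decimal_point = np.decimal_point();
    c->thousands_sep = np.thousands_sep();
    ios.pword(slot) = c;
  }
  format_cache* cached = static_cast<format_cache*>(ios.pword(slot));
  assert(cached != 0);
  return *cached;
}

// Hypothetical facet, used only to show the cache being rebuilt.
struct comma_punct : std::numpunct<char> {
protected:
  char do_decimal_point() const { return ','; }
};
```

Repeated calls to get_format_cache() on the same stream reuse the
stored object; after imbue() the next call consults the new locale's
facets again.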

Chapter 23 Containers
----------------------
Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>

All the components in chapter 23 are implemented in the SGI STL.
They are "mostly complete"; they include a large number of
nonconforming extensions which must be wrapped. Some of these
are used internally and must be renamed or duplicated.

The SGI components are optimized for large-memory environments. For
embedded targets, different criteria might be more appropriate. Users
will want to be able to tune this behavior. We should provide
ways for users to compile the library with different memory usage
characteristics.

A lot more work is needed on factoring out common code from different
specializations to reduce code size here and in chapter 25. The
easiest fix for this would be a compiler/ABI improvement that allows
the compiler to recognize when a specialization depends only on the
size (or other gross quality) of a template argument, and allow the
linker to share the code with similar specializations. In its
absence, many of the algorithms and containers can be
partial-specialized, at least for the case of pointers, but this only
solves a small part of the problem. Use of a type_traits-style template
allows a few more optimization opportunities, more if the compiler
can generate the specializations automatically.

As an optimization, containers can specialize on the default allocator
and bypass it, or take advantage of details of its implementation
after it has been improved upon.

Replacing the vector iterators, which currently are simple element
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked.
The current use of
pointers for iterators is evil.

As mentioned for chapter 24, the deque iterator is a good example of
an opportunity to implement a "staged" iterator that would benefit
from specializations of some algorithms.

Chapter 24 Iterators
---------------------
Headers: <iterator>

Standard iterators are "mostly complete", with the exception of
the stream iterators, which are not yet templatized on the
stream type. Also, the base class template iterator<> appears
to be wrong, so everything derived from it must also be wrong,
currently.

The streambuf iterators (currently located in stl/bits/std_iterator.h,
but should be under bits/) can be rewritten to take advantage of
friendship with the streambuf implementation.

Matt Austern has identified opportunities where certain iterator
types, particularly including streambuf iterators and deque
iterators, have a "two-stage" quality, such that an intermediate
limit can be checked much more quickly than the true limit on
range operations. If identified with a member of iterator_traits,
algorithms may be specialized for this case. Of course the
iterators that have this quality can be identified by specializing
a traits class.

Many of the algorithms must be specialized for the streambuf
iterators, to take advantage of block-mode operations, in order
to allow iostream/locale operations' performance not to suffer.
It may be that they could be treated as staged iterators and
take advantage of those optimizations.

Chapter 25 Algorithms
----------------------
Headers: <algorithm>
C headers: <cstdlib> (also in 18, 21, 26)

The algorithms are "mostly complete". As mentioned above, they
are optimized for speed at the expense of code and data size.

Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.

The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.

Chapter 26 Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also 18, 21, 25)

Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate. It
is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.

The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.
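
To show how little is involved in re-implementing div() and ldiv(),
here is a minimal sketch. The namespace "sketch" is hypothetical (it
is not the library's actual layout), and the code assumes the
truncating integer division guaranteed by C99 and C++11; under older
rules the rounding direction for negative operands was
implementation-defined, so a real implementation of that era would
need extra care.

```cpp
#include <cassert>

namespace sketch {  // hypothetical namespace, for illustration only

// Distinct C++ types, unrelated to the C library's div_t and ldiv_t.
struct div_t  { int  quot; int  rem; };
struct ldiv_t { long quot; long rem; };

inline div_t div(int numer, int denom) {
  assert(denom != 0);      // division by zero is undefined, as in C
  div_t r;
  r.quot = numer / denom;  // truncates toward zero (C99, C++11)
  r.rem  = numer % denom;  // so quot * denom + rem == numer
  return r;
}

inline ldiv_t ldiv(long numer, long denom) {
  assert(denom != 0);
  ldiv_t r;
  r.quot = numer / denom;
  r.rem  = numer % denom;
  return r;
}

// C++ additionally overloads div() for long arguments.
inline ldiv_t div(long numer, long denom) { return ldiv(numer, denom); }

}  // namespace sketch
```

Because the result types are declared afresh, nothing here depends on
the layout of the C library's own structures.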

Chapter 27 Iostreams
---------------------
Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
         <iomanip> <sstream> <fstream>
C headers: <cstdio> <cwchar> (also in 21)

Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream> and <fstream> have been
started; basic_filebuf<> "write" functions have been implemented just
enough to do "hello, world".

Most of the istream and ostream operators << and >> (with the exception
of the op<<(integer) ones) have not been changed to use locale primitives,
sentry objects, or char_traits members.

All these templates should be manually instantiated for char and
wchar_t in a way that links only used members into user programs.

Streambuf is fertile ground for optimization extensions. An extended
interface giving iterator access to its internal buffer would be very
useful for other library components.

Iostream operations (primarily operators << and >>) can take advantage
of the case where user code has not specified a locale, and bypass locale
operations entirely. The current implementation of op<</num_put<>::put,
for the integer types, demonstrates how they can cache encoding details
from the locale on each operation. There is lots more room for
optimization in this area.

The definition of the relationship between the standard streams
cout et al. and stdout et al. requires something like a "stdiobuf".
The SGI solution of using double-indirection to actually use a
stdio FILE object for buffering is unsatisfactory, because it
interferes with peephole loop optimizations.

The <sstream> header work has begun.
stringbuf can benefit from
friendship with basic_string<> and basic_string<>::_Rep to use
those objects directly as buffers, and avoid allocating and making
copies.

The basic_filebuf<> template is a complex beast. It is specified to
use the locale facet codecvt<> to translate characters between native
files and the locale character encoding. In general this involves
two buffers, one of "char" representing the file and another of
"char_type", for the stream, with codecvt<> translating. The process
is complicated by the variable-length nature of the translation, and
the need to seek to corresponding places in the two representations.
For the case of basic_filebuf<char>, when no translation is needed,
a single buffer suffices. A specialized filebuf can be used to reduce
code space overhead when no locale has been imbued. Matt Austern's
work at SGI will be useful, perhaps directly as a source of code, or
at least as an example to draw on.

Filebuf, almost uniquely (cf. operator new), depends heavily on
underlying environmental facilities. In current releases iostream
depends fairly heavily on libio constant definitions, but it should
be made independent. It also depends on operating system primitives
for file operations. There is immense room for optimizations using
(e.g.) mmap for reading. The shadow/ directory wraps, besides the
standard C headers, the libio.h and unistd.h headers, for use mainly
by filebuf. These wrappings have not been completed, though there
is scaffolding in place.

The encapsulation of certain C header <cstdio> names presents an
interesting problem. It is possible to define an inline std::fprintf()
implemented in terms of the 'extern "C"' vfprintf(), but there is no
standard vfscanf() to use to implement std::fscanf().
It appears that
vfscanf must be re-implemented in C++ for targets where no vfscanf
extension has been defined. This is interesting in that it seems
to be the only significant case in the C library where this kind of
rewriting is necessary. (Of course Glibc provides the vfscanf()
extension.) (The functions related to exit() must be rewritten
for other reasons.)


Annex D
-------
Headers: <strstream>

Annex D defines many non-library features, and many minor
modifications to various headers, and a complete header.
It is "mostly done", except that the libstdc++-2 <strstream>
header has not been adopted into the library, or checked to
verify that it matches the draft in those details that were
clarified by the committee. Certainly it must at least be
moved into the std namespace.

We still need to wrap all the deprecated features in #if guards
so that pedantic compile modes can detect their use.

Nonstandard Extensions
----------------------
Headers: <iostream.h> <strstream.h> <hash> <rbtree>
         <pthread_alloc> <stdiobuf> (etc.)

User code has come to depend on a variety of nonstandard components
that we must not omit. Much of this code can be adopted from
libstdc++-v2 or from the SGI STL. This particularly includes
<iostream.h>, <strstream.h>, and various SGI extensions such
as <hash_map.h>. Many of these are already placed in the
subdirectories ext/ and backward/. (Note that it is better to
include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
to search the subdirectory itself via a "-I" directive.)
  </literallayout>
</sect1>

</appendix>