1 2Standard C++ Library Design Document 3------------------------------------ 4 5This is an overview of libstdc++-v3, with particular attention 6to projects to be done and how they fit into the whole. 7 8The Library 9----------- 10 11This paper is covers two major areas: 12 13 - Features and policies not mentioned in the standard that 14 the quality of the library implementation depends on, including 15 extensions and "implementation-defined" features; 16 17 - Plans for required but unimplemented library features and 18 optimizations to them. 19 20Overhead 21-------- 22 23The standard defines a large library, much larger than the standard 24C library. A naive implementation would suffer substantial overhead 25in compile time, executable size, and speed, rendering it unusable 26in many (particularly embedded) applications. The alternative demands 27care in construction, and some compiler support, but there is no 28need for library subsets. 29 30What are the sources of this overhead? There are four main causes: 31 32 - The library is specified almost entirely as templates, which 33 with current compilers must be included in-line, resulting in 34 very slow builds as tens or hundreds of thousands of lines 35 of function definitions are read for each user source file. 36 Indeed, the entire SGI STL, as well as the dos Reis valarray, 37 are provided purely as header files, largely for simplicity in 38 porting. Iostream/locale is (or will be) as large again. 39 40 - The library is very flexible, specifying a multitude of hooks 41 where users can insert their own code in place of defaults. 42 When these hooks are not used, any time and code expended to 43 support that flexibility is wasted. 44 45 - Templates are often described as causing to "code bloat". In 46 practice, this refers (when it refers to anything real) to several 47 independent processes. First, when a class template is manually 48 instantiated in its entirely, current compilers place the definitions 49 for all members in a single object file, so that a program linking 50 to one member gets definitions of all. Second, template functions 51 which do not actually depend on the template argument are, under 52 current compilers, generated anew for each instantiation, rather 53 than being shared with other instantiations. Third, some of the 54 flexibility mentioned above comes from virtual functions (both in 55 regular classes and template classes) which current linkers add 56 to the executable file even when they manifestly cannot be called. 57 58 - The library is specified to use a language feature, exceptions, 59 which in the current gcc compiler ABI imposes a run time and 60 code space cost to handle the possibility of exceptions even when 61 they are not used. Under the new ABI (accessed with -fnew-abi), 62 there is a space overhead and a small reduction in code efficiency 63 resulting from lost optimization opportunities associated with 64 non-local branches associated with exceptions. 65 66What can be done to eliminate this overhead? A variety of coding 67techniques, and compiler, linker and library improvements and 68extensions may be used, as covered below. Most are not difficult, 69and some are already implemented in varying degrees. 70 71Overhead: Compilation Time 72-------------------------- 73 74Providing "ready-instantiated" template code in object code archives 75allows us to avoid generating and optimizing template instantiations 76in each compilation unit which uses them. However, the number of such 77instantiations that are useful to provide is limited, and anyway this 78is not enough, by itself, to minimize compilation time. In particular, 79it does not reduce time spent parsing conforming headers. 80 81Quicker header parsing will depend on library extensions and compiler 82improvements. One approach is some variation on the techniques 83previously marketed as "pre-compiled headers", now standardized as 84support for the "export" keyword. "Exported" template definitions 85can be placed (once) in a "repository" -- really just a library, but 86of template definitions rather than object code -- to be drawn upon 87at link time when an instantiation is needed, rather than placed in 88header files to be parsed along with every compilation unit. 89 90Until "export" is implemented we can put some of the lengthy template 91definitions in #if guards or alternative headers so that users can skip 92over the the full definitions when they need only the ready-instantiated 93specializations. 94 95To be precise, this means that certain headers which define 96templates which users normally use only for certain arguments 97can be instrumented to avoid exposing the template definitions 98to the compiler unless a macro is defined. For example, in 99<string>, we might have: 100 101 template <class _CharT, ... > class basic_string { 102 ... // member declarations 103 }; 104 ... // operator declarations 105 106 #ifdef _STRICT_ISO_ 107 # if _G_NO_TEMPLATE_EXPORT 108 # include <bits/std_locale.h> // headers needed by definitions 109 # ... 110 # include <bits/string.tcc> // member and global template definitions. 111 # endif 112 #endif 113 114Users who compile without specifying a strict-ISO-conforming flag 115would not see many of the template definitions they now see, and rely 116instead on ready-instantiated specializations in the library. This 117technique would be useful for the following substantial components: 118string, locale/iostreams, valarray. It would *not* be useful or 119usable with the following: containers, algorithms, iterators, 120allocator. Since these constitute a large (though decreasing) 121fraction of the library, the benefit the technique offers is 122limited. 123 124The language specifies the semantics of the "export" keyword, but 125the gcc compiler does not yet support it. When it does, problems 126with large template inclusions can largely disappear, given some 127minor library reorganization, along with the need for the apparatus 128described above. 129 130Overhead: Flexibility Cost 131-------------------------- 132 133The library offers many places where users can specify operations 134to be performed by the library in place of defaults. Sometimes 135this seems to require that the library use a more-roundabout, and 136possibly slower, way to accomplish the default requirements than 137would be used otherwise. 138 139The primary protection against this overhead is thorough compiler 140optimization, to crush out layers of inline function interfaces. 141Kuck & Associates has demonstrated the practicality of this kind 142of optimization. 143 144The second line of defense against this overhead is explicit 145specialization. By defining helper function templates, and writing 146specialized code for the default case, overhead can be eliminated 147for that case without sacrificing flexibility. This takes full 148advantage of any ability of the optimizer to crush out degenerate 149code. 150 151The library specifies many virtual functions which current linkers 152load even when they cannot be called. Some minor improvements to the 153compiler and to ld would eliminate any such overhead by simply 154omitting virtual functions that the complete program does not call. 155A prototype of this work has already been done. For targets where 156GNU ld is not used, a "pre-linker" could do the same job. 157 158The main areas in the standard interface where user flexibility 159can result in overhead are: 160 161 - Allocators: Containers are specified to use user-definable 162 allocator types and objects, making tuning for the container 163 characteristics tricky. 164 165 - Locales: the standard specifies locale objects used to implement 166 iostream operations, involving many virtual functions which use 167 streambuf iterators. 168 169 - Algorithms and containers: these may be instantiated on any type, 170 frequently duplicating code for identical operations. 171 172 - Iostreams and strings: users are permitted to use these on their 173 own types, and specify the operations the stream must use on these 174 types. 175 176Note that these sources of overhead are _avoidable_. The techniques 177to avoid them are covered below. 178 179Code Bloat 180---------- 181 182In the SGI STL, and in some other headers, many of the templates 183are defined "inline" -- either explicitly or by their placement 184in class definitions -- which should not be inline. This is a 185source of code bloat. Matt had remarked that he was relying on 186the compiler to recognize what was too big to benefit from inlining, 187and generate it out-of-line automatically. However, this also can 188result in code bloat except where the linker can eliminate the extra 189copies. 190 191Fixing these cases will require an audit of all inline functions 192defined in the library to determine which merit inlining, and moving 193the rest out of line. This is an issue mainly in chapters 23, 25, and 19427. Of course it can be done incrementally, and we should generally 195accept patches that move large functions out of line and into ".tcc" 196files, which can later be pulled into a repository. Compiler/linker 197improvements to recognize very large inline functions and move them 198out-of-line, but shared among compilation units, could make this 199work unnecessary. 200 201Pre-instantiating template specializations currently produces large 202amounts of dead code which bloats statically linked programs. The 203current state of the static library, libstdc++.a, is intolerable on 204this account, and will fuel further confused speculation about a need 205for a library "subset". A compiler improvement that treats each 206instantiated function as a separate object file, for linking purposes, 207would be one solution to this problem. An alternative would be to 208split up the manual instantiation files into dozens upon dozens of 209little files, each compiled separately, but an abortive attempt at 210this was done for <string> and, though it is far from complete, it 211is already a nuisance. A better interim solution (just until we have 212"export") is badly needed. 213 214When building a shared library, the current compiler/linker cannot 215automatically generate the instantiatiations needed. This creates a 216miserable situation; it means any time something is changed in the 217library, before a shared library can be built someone must manually 218copy the declarations of all templates that are needed by other parts 219of the library to an "instantiation" file, and add it to the build 220system to be compiled and linked to the library. This process is 221readily automated, and should be automated as soon as possible. 222Users building their own shared libraries experience identical 223frustrations. 224 225Sharing common aspects of template definitions among instantiations 226can radically reduce code bloat. The compiler could help a great 227deal here by recognizing when a function depends on nothing about 228a template parameter, or only on its size, and giving the resulting 229function a link-name "equate" that allows it to be shared with other 230instantiations. Implementation code could take advantage of the 231capability by factoring out code that does not depend on the template 232argument into separate functions to be merged by the compiler. 233 234Until such a compiler optimization is implemented, much can be done 235manually (if tediously) in this direction. One such optimization is 236to derive class templates from non-template classes, and move as much 237implementation as possible into the base class. Another is to partial- 238specialize certain common instantiations, such as vector<T*>, to share 239code for instantiations on all types T. While these techniques work, 240they are far from the complete solution that a compiler improvement 241would afford. 242 243Overhead: Expensive Language Features 244------------------------------------- 245 246The main "expensive" language feature used in the standard library 247is exception support, which requires compiling in cleanup code with 248static table data to locate it, and linking in library code to use 249the table. For small embedded programs the amount of such library 250code and table data is assumed by some to be excessive. Under the 251"new" ABI this perception is generally exaggerated, although in some 252cases it may actually be excessive. 253 254To implement a library which does not use exceptions directly is 255not difficult given minor compiler support (to "turn off" exceptions 256and ignore exception constructs), and results in no great library 257maintenance difficulties. To be precise, given "-fno-exceptions", 258the compiler should treat "try" blocks as ordinary blocks, and 259"catch" blocks as dead code to ignore or eliminate. Compiler 260support is not strictly necessary, except in the case of "function 261try blocks"; otherwise the following macros almost suffice: 262 263 #define throw(X) 264 #define try if (true) 265 #define catch(X) else if (false) 266 267However, there may be a need to use function try blocks in the 268library implementation, and use of macros in this way can make 269correct diagnostics impossible. Furthermore, use of this scheme 270would require the library to call a function to re-throw exceptions 271from a try block. Implementing the above semantics in the compiler 272is preferable. 273 274Given the support above (however implemented) it only remains to 275replace code that "throws" with a call to a well-documented "handler" 276function in a separate compilation unit which may be replaced by 277the user. The main source of exceptions that would be difficult 278for users to avoid is memory allocation failures, but users can 279define their own memory allocation primitives that never throw. 280Otherwise, the complete list of such handlers, and which library 281functions may call them, would be needed for users to be able to 282implement the necessary substitutes. (Fortunately, they have the 283source code.) 284 285Opportunities 286------------- 287 288The template capabilities of C++ offer enormous opportunities for 289optimizing common library operations, well beyond what would be 290considered "eliminating overhead". In particular, many operations 291done in Glibc with macros that depend on proprietary language 292extensions can be implemented in pristine Standard C++. For example, 293the chapter 25 algorithms, and even C library functions such as strchr, 294can be specialized for the case of static arrays of known (small) size. 295 296Detailed optimization opportunities are identified below where 297the component where they would appear is discussed. Of course new 298opportunities will be identified during implementation. 299 300Unimplemented Required Library Features 301--------------------------------------- 302 303The standard specifies hundreds of components, grouped broadly by 304chapter. These are listed in excruciating detail in the CHECKLIST 305file. 306 307 17 general 308 18 support 309 19 diagnostics 310 20 utilities 311 21 string 312 22 locale 313 23 containers 314 24 iterators 315 25 algorithms 316 26 numerics 317 27 iostreams 318 Annex D backward compatibility 319 320Anyone participating in implementation of the library should obtain 321a copy of the standard, ISO 14882. People in the U.S. can obtain an 322electronic copy for US$18 from ANSI's web site. Those from other 323countries should visit http://www.iso.ch/ to find out the location 324of their country's representation in ISO, in order to know who can 325sell them a copy. 326 327The emphasis in the following sections is on unimplemented features 328and optimization opportunities. 329 330Chapter 17 General 331------------------- 332 333Chapter 17 concerns overall library requirements. 334 335The standard doesn't mention threads. A multi-thread (MT) extension 336primarily affects operators new and delete (18), allocator (20), 337string (21), locale (22), and iostreams (27). The common underlying 338support needed for this is discussed under chapter 20. 339 340The standard requirements on names from the C headers create a 341lot of work, mostly done. Names in the C headers must be visible 342in the std:: and sometimes the global namespace; the names in the 343two scopes must refer to the same object. More stringent is that 344Koenig lookup implies that any types specified as defined in std:: 345really are defined in std::. Names optionally implemented as 346macros in C cannot be macros in C++. (An overview may be read at 347<http://www.cantrip.org/cheaders.html>). The scripts "inclosure" 348and "mkcshadow", and the directories shadow/ and cshadow/, are the 349beginning of an effort to conform in this area. 350 351A correct conforming definition of C header names based on underlying 352C library headers, and practical linking of conforming namespaced 353customer code with third-party C libraries depends ultimately on 354an ABI change, allowing namespaced C type names to be mangled into 355type names as if they were global, somewhat as C function names in a 356namespace, or C++ global variable names, are left unmangled. Perhaps 357another "extern" mode, such as 'extern "C-global"' would be an 358appropriate place for such type definitions. Such a type would 359affect mangling as follows: 360 361 namespace A { 362 struct X {}; 363 extern "C-global" { // or maybe just 'extern "C"' 364 struct Y {}; 365 }; 366 } 367 void f(A::X*); // mangles to f__FPQ21A1X 368 void f(A::Y*); // mangles to f__FP1Y 369 370(It may be that this is really the appropriate semantics for regular 371'extern "C"', and 'extern "C-global"', as an extension, would not be 372necessary.) This would allow functions declared in non-standard C headers 373(and thus fixable by neither us nor users) to link properly with functions 374declared using C types defined in properly-namespaced headers. The 375problem this solves is that C headers (which C++ programmers do persist 376in using) frequently forward-declare C struct tags without including 377the header where the type is defined, as in 378 379 struct tm; 380 void munge(tm*); 381 382Without some compiler accommodation, munge cannot be called by correct 383C++ code using a pointer to a correctly-scoped tm* value. 384 385The current C headers use the preprocessor extension "#include_next", 386which the compiler complains about when run "-pedantic". 387(Incidentally, it appears that "-fpedantic" is currently ignored, 388probably a bug.) The solution in the C compiler is to use 389"-isystem" rather than "-I", but unfortunately in g++ this seems 390also to wrap the whole header in an 'extern "C"' block, so it's 391unusable for C++ headers. The correct solution appears to be to 392allow the various special include-directory options, if not given 393an argument, to affect subsequent include-directory options additively, 394so that if one said 395 396 -pedantic -iprefix $(prefix) \ 397 -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \ 398 -iwithprefix -I g++-v3/ext 399 400the compiler would search $(prefix)/g++-v3 and not report 401pedantic warnings for files found there, but treat files in 402$(prefix)/g++-v3/ext pedantically. (The undocumented semantics 403of "-isystem" in g++ stink. Can they be rescinded? If not it 404must be replaced with something more rationally behaved.) 405 406All the C headers need the treatment above; in the standard these 407headers are mentioned in various chapters. Below, I have only 408mentioned those that present interesting implementation issues. 409 410The components identified as "mostly complete", below, have not been 411audited for conformance. In many cases where the library passes 412conformance tests we have non-conforming extensions that must be 413wrapped in #if guards for "pedantic" use, and in some cases renamed 414in a conforming way for continued use in the implementation regardless 415of conformance flags. 416 417The STL portion of the library still depends on a header 418stl/bits/stl_config.h full of #ifdef clauses. This apparatus 419should be replaced with autoconf/automake machinery. 420 421The SGI STL defines a type_traits<> template, specialized for 422many types in their code including the built-in numeric and 423pointer types and some library types, to direct optimizations of 424standard functions. The SGI compiler has been extended to generate 425specializations of this template automatically for user types, 426so that use of STL templates on user types can take advantage of 427these optimizations. Specializations for other, non-STL, types 428would make more optimizations possible, but extending the gcc 429compiler in the same way would be much better. Probably the next 430round of standardization will ratify this, but probably with 431changes, so it probably should be renamed to place it in the 432implementation namespace. 433 434The SGI STL also defines a large number of extensions visible in 435standard headers. (Other extensions that appear in separate headers 436have been sequestered in subdirectories ext/ and backward/.) All 437these extensions should be moved to other headers where possible, 438and in any case wrapped in a namespace (not std!), and (where kept 439in a standard header) girded about with macro guards. Some cannot be 440moved out of standard headers because they are used to implement 441standard features. The canonical method for accommodating these 442is to use a protected name, aliased in macro guards to a user-space 443name. Unfortunately C++ offers no satisfactory template typedef 444mechanism, so very ad-hoc and unsatisfactory aliasing must be used 445instead. 446 447Implementation of a template typedef mechanism should have the highest 448priority among possible extensions, on the same level as implementation 449of the template "export" feature. 450 451Chapter 18 Language support 452---------------------------- 453 454Headers: <limits> <new> <typeinfo> <exception> 455C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp> 456 <ctime> <csignal> <cstdlib> (also 21, 25, 26) 457 458This defines the built-in exceptions, rtti, numeric_limits<>, 459operator new and delete. Much of this is provided by the 460compiler in its static runtime library. 461 462Work to do includes defining numeric_limits<> specializations in 463separate files for all target architectures. Values for integer types 464except for bool and wchar_t are readily obtained from the C header 465<limits.h>, but values for the remaining numeric types (bool, wchar_t, 466float, double, long double) must be entered manually. This is 467largely dog work except for those members whose values are not 468easily deduced from available documentation. Also, this involves 469some work in target configuration to identify the correct choice of 470file to build against and to install. 471 472The definitions of the various operators new and delete must be 473made thread-safe, which depends on a portable exclusion mechanism, 474discussed under chapter 20. Of course there is always plenty of 475room for improvements to the speed of operators new and delete. 476 477<cstdarg>, in Glibc, defines some macros that gcc does not allow to 478be wrapped into an inline function. Probably this header will demand 479attention whenever a new target is chosen. The functions atexit(), 480exit(), and abort() in cstdlib have different semantics in C++, so 481must be re-implemented for C++. 482 483Chapter 19 Diagnostics 484----------------------- 485 486Headers: <stdexcept> 487C headers: <cassert> <cerrno> 488 489This defines the standard exception objects, which are "mostly complete". 490Cygnus has a version, and now SGI provides a slightly different one. 491It makes little difference which we use. 492 493The C global name "errno", which C allows to be a variable or a macro, 494is required in C++ to be a macro. For MT it must typically result in 495a function call. 496 497Chapter 20 Utilities 498--------------------- 499Headers: <utility> <functional> <memory> 500C header: <ctime> (also in 18) 501 502SGI STL provides "mostly complete" versions of all the components 503defined in this chapter. However, the auto_ptr<> implementation 504is known to be wrong. Furthermore, the standard definition of it 505is known to be unimplementable as written. A minor change to the 506standard would fix it, and auto_ptr<> should be adjusted to match. 507 508Multi-threading affects the allocator implementation, and there must 509be configuration/installation choices for different users' MT 510requirements. Anyway, users will want to tune allocator options 511to support different target conditions, MT or no. 512 513The primitives used for MT implementation should be exposed, as an 514extension, for users' own work. We need cross-CPU "mutex" support, 515multi-processor shared-memory atomic integer operations, and single- 516processor uninterruptible integer operations, and all three configurable 517to be stubbed out for non-MT use, or to use an appropriately-loaded 518dynamic library for the actual runtime environment, or statically 519compiled in for cases where the target architecture is known. 520 521Chapter 21 String 522------------------ 523Headers: <string> 524C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27) 525 <cstdlib> (also in 18, 25, 26) 526 527We have "mostly-complete" char_traits<> implementations. Many of the 528char_traits<char> operations might be optimized further using existing 529proprietary language extensions. 530 531We have a "mostly-complete" basic_string<> implementation. The work 532to manually instantiate char and wchar_t specializations in object 533files to improve link-time behavior is extremely unsatisfactory, 534literally tripling library-build time with no commensurate improvement 535in static program link sizes. It must be redone. (Similar work is 536needed for some components in chapters 22 and 27.) 537 538Other work needed for strings is MT-safety, as discussed under the 539chapter 20 heading. 540 541The standard C type mbstate_t from <cwchar> and used in char_traits<> 542must be different in C++ than in C, because in C++ the default constructor 543value mbstate_t() must be the "base" or "ground" sequence state. 544(According to the likely resolution of a recently raised Core issue, 545this may become unnecessary. However, there are other reasons to 546use a state type not as limited as whatever the C library provides.) 547If we might want to provide conversions from (e.g.) internally- 548represented EUC-wide to externally-represented Unicode, or vice- 549versa, the mbstate_t we choose will need to be more accommodating 550than what might be provided by an underlying C library. 551 552There remain some basic_string template-member functions which do 553not overload properly with their non-template brethren. The infamous 554hack akin to what was done in vector<> is needed, to conform to 55523.1.1 para 10. The CHECKLIST items for basic_string marked 'X', 556or incomplete, are so marked for this reason. 557 558Replacing the string iterators, which currently are simple character 559pointers, with class objects would greatly increase the safety of the 560client interface, and also permit a "debug" mode in which range, 561ownership, and validity are rigorously checked. The current use of 562raw pointers as string iterators is evil. vector<> iterators need the 563same treatment. Note that the current implementation freely mixes 564pointers and iterators, and that must be fixed before safer iterators 565can be introduced. 566 567Some of the functions in <cstring> are different from the C version. 568generally overloaded on const and non-const argument pointers. For 569example, in <cstring> strchr is overloaded. The functions isupper 570etc. in <cctype> typically implemented as macros in C are functions 571in C++, because they are overloaded with others of the same name 572defined in <locale>. 573 574Many of the functions required in <cwctype> and <cwchar> cannot be 575implemented using underlying C facilities on intended targets because 576such facilities only partly exist. 577 578Chapter 22 Locale 579------------------ 580Headers: <locale> 581C headers: <clocale> 582 583We have a "mostly complete" class locale, with the exception of 584code for constructing, and handling the names of, named locales. 585The ways that locales are named (particularly when categories 586(e.g. LC_TIME, LC_COLLATE) are different) varies among all target 587environments. This code must be written in various versions and 588chosen by configuration parameters. 589 590Members of many of the facets defined in <locale> are stubs. Generally, 591there are two sets of facets: the base class facets (which are supposed 592to implement the "C" locale) and the "byname" facets, which are supposed 593to read files to determine their behavior. The base ctype<>, collate<>, 594and numpunct<> facets are "mostly complete", except that the table of 595bitmask values used for "is" operations, and corresponding mask values, 596are still defined in libio and just included/linked. (We will need to 597implement these tables independently, soon, but should take advantage 598of libio where possible.) The num_put<>::put members for integer types 599are "mostly complete". 600 601A complete list of what has and has not been implemented may be 602found in CHECKLIST. However, note that the current definition of 603codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write 604out the raw bytes representing the wide characters, rather than 605trying to convert each to a corresponding single "char" value. 606 607Some of the facets are more important than others. Specifically, 608the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets 609are used by other library facilities defined in <string>, <istream>, 610and <ostream>, and the codecvt<> facet is used by basic_filebuf<> 611in <fstream>, so a conforming iostream implementation depends on 612these. 613 614The "long long" type eventually must be supported, but code mentioning 615it should be wrapped in #if guards to allow pedantic-mode compiling. 616 617Performance of num_put<> and num_get<> depend critically on 618caching computed values in ios_base objects, and on extensions 619to the interface with streambufs. 620 621Specifically: retrieving a copy of the locale object, extracting 622the needed facets, and gathering data from them, for each call to 623(e.g.) operator<< would be prohibitively slow. To cache format 624data for use by num_put<> and num_get<> we have a _Format_cache<> 625object stored in the ios_base::pword() array. This is constructed 626and initialized lazily, and is organized purely for utility. It 627is discarded when a new locale with different facets is imbued. 628 629Using only the public interfaces of the iterator arguments to the 630facet functions would limit performance by forbidding "vector-style" 631character operations. The streambuf iterator optimizations are 632described under chapter 24, but facets can also bypass the streambuf 633iterators via explicit specializations and operate directly on the 634streambufs, and use extended interfaces to get direct access to the 635streambuf internal buffer arrays. These extensions are mentioned 636under chapter 27. These optimizations are particularly important 637for input parsing. 638 639Unused virtual members of locale facets can be omitted, as mentioned 640above, by a smart linker. 641 642Chapter 23 Containers 643---------------------- 644Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset> 645 646All the components in chapter 23 are implemented in the SGI STL. 647They are "mostly complete"; they include a large number of 648nonconforming extensions which must be wrapped. Some of these 649are used internally and must be renamed or duplicated. 650 651The SGI components are optimized for large-memory environments. For 652embedded targets, different criteria might be more appropriate. Users 653will want to be able to tune this behavior. We should provide 654ways for users to compile the library with different memory usage 655characteristics. 656 657A lot more work is needed on factoring out common code from different 658specializations to reduce code size here and in chapter 25. The 659easiest fix for this would be a compiler/ABI improvement that allows 660the compiler to recognize when a specialization depends only on the 661size (or other gross quality) of a template argument, and allow the 662linker to share the code with similar specializations. In its 663absence, many of the algorithms and containers can be partial- 664specialized, at least for the case of pointers, but this only solves 665a small part of the problem. Use of a type_traits-style template 666allows a few more optimization opportunities, more if the compiler 667can generate the specializations automatically. 668 669As an optimization, containers can specialize on the default allocator 670and bypass it, or take advantage of details of its implementation 671after it has been improved upon. 672 673Replacing the vector iterators, which currently are simple element 674pointers, with class objects would greatly increase the safety of the 675client interface, and also permit a "debug" mode in which range, 676ownership, and validity are rigorously checked. The current use of 677pointers for iterators is evil. 678 679As mentioned for chapter 24, the deque iterator is a good example of 680an opportunity to implement a "staged" iterator that would benefit 681from specializations of some algorithms. 682 683Chapter 24 Iterators 684--------------------- 685Headers: <iterator> 686 687Standard iterators are "mostly complete", with the exception of 688the stream iterators, which are not yet templatized on the 689stream type. Also, the base class template iterator<> appears 690to be wrong, so everything derived from it must also be wrong, 691currently. 692 693The streambuf iterators (currently located in stl/bits/std_iterator.h, 694but should be under bits/) can be rewritten to take advantage of 695friendship with the streambuf implementation. 696 697Matt Austern has identified opportunities where certain iterator 698types, particularly including streambuf iterators and deque 699iterators, have a "two-stage" quality, such that an intermediate 700limit can be checked much more quickly than the true limit on 701range operations. If identified with a member of iterator_traits, 702algorithms may be specialized for this case. Of course the 703iterators that have this quality can be identified by specializing 704a traits class. 705 706Many of the algorithms must be specialized for the streambuf 707iterators, to take advantage of block-mode operations, in order 708to allow iostream/locale operations' performance not to suffer. 709It may be that they could be treated as staged iterators and 710take advantage of those optimizations. 711 712Chapter 25 Algorithms 713---------------------- 714Headers: <algorithm> 715C headers: <cstdlib> (also in 18, 21, 26)) 716 717The algorithms are "mostly complete". As mentioned above, they 718are optimized for speed at the expense of code and data size. 719 720Specializations of many of the algorithms for non-STL types would 721give performance improvements, but we must use great care not to 722interfere with fragile template overloading semantics for the 723standard interfaces. Conventionally the standard function template 724interface is an inline which delegates to a non-standard function 725which is then overloaded (this is already done in many places in 726the library). Particularly appealing opportunities for the sake of 727iostream performance are for copy and find applied to streambuf 728iterators or (as noted elsewhere) for staged iterators, of which 729the streambuf iterators are a good example. 730 731The bsearch and qsort functions cannot be overloaded properly as 732required by the standard because gcc does not yet allow overloading 733on the extern-"C"-ness of a function pointer. 734 735Chapter 26 Numerics 736-------------------- 737Headers: <complex> <valarray> <numeric> 738C headers: <cmath>, <cstdlib> (also 18, 21, 25) 739 740Numeric components: Gabriel dos Reis's valarray, Drepper's complex, 741and the few algorithms from the STL are "mostly done". Of course 742optimization opportunities abound for the numerically literate. It 743is not clear whether the valarray implementation really conforms 744fully, in the assumptions it makes about aliasing (and lack thereof) 745in its arguments. 746 747The C div() and ldiv() functions are interesting, because they are the 748only case where a C library function returns a class object by value. 749Since the C++ type div_t must be different from the underlying C type 750(which is in the wrong namespace) the underlying functions div() and 751ldiv() cannot be re-used efficiently. Fortunately they are trivial to 752re-implement. 753 754Chapter 27 Iostreams 755--------------------- 756Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream> 757 <iomanip> <sstream> <fstream> 758C headers: <cstdio> <cwchar> (also in 21) 759 760Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>, 761ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and 762basic_ostream<> are well along, but basic_istream<> has had little work 763done. The standard stream objects, <sstream> and <fstream> have been 764started; basic_filebuf<> "write" functions have been implemented just 765enough to do "hello, world". 766 767Most of the istream and ostream operators << and >> (with the exception 768of the op<<(integer) ones) have not been changed to use locale primitives, 769sentry objects, or char_traits members. 770 771All these templates should be manually instantiated for char and 772wchar_t in a way that links only used members into user programs. 773 774Streambuf is fertile ground for optimization extensions. An extended 775interface giving iterator access to its internal buffer would be very 776useful for other library components. 777 778Iostream operations (primarily operators << and >>) can take advantage 779of the case where user code has not specified a locale, and bypass locale 780operations entirely. The current implementation of op<</num_put<>::put, 781for the integer types, demonstrates how they can cache encoding details 782from the locale on each operation. There is lots more room for 783optimization in this area. 784 785The definition of the relationship between the standard streams 786cout et al. and stdout et al. requires something like a "stdiobuf". 787The SGI solution of using double-indirection to actually use a 788stdio FILE object for buffering is unsatisfactory, because it 789interferes with peephole loop optimizations. 790 791The <sstream> header work has begun. stringbuf can benefit from 792friendship with basic_string<> and basic_string<>::_Rep to use 793those objects directly as buffers, and avoid allocating and making 794copies. 795 796The basic_filebuf<> template is a complex beast. It is specified to 797use the locale facet codecvt<> to translate characters between native 798files and the locale character encoding. In general this involves 799two buffers, one of "char" representing the file and another of 800"char_type", for the stream, with codecvt<> translating. The process 801is complicated by the variable-length nature of the translation, and 802the need to seek to corresponding places in the two representations. 803For the case of basic_filebuf<char>, when no translation is needed, 804a single buffer suffices. A specialized filebuf can be used to reduce 805code space overhead when no locale has been imbued. Matt Austern's 806work at SGI will be useful, perhaps directly as a source of code, or 807at least as an example to draw on. 808 809Filebuf, almost uniquely (cf. operator new), depends heavily on 810underlying environmental facilities. In current releases iostream 811depends fairly heavily on libio constant definitions, but it should 812be made independent. It also depends on operating system primitives 813for file operations. There is immense room for optimizations using 814(e.g.) mmap for reading. The shadow/ directory wraps, besides the 815standard C headers, the libio.h and unistd.h headers, for use mainly 816by filebuf. These wrappings have not been completed, though there 817is scaffolding in place. 818 819The encapulation of certain C header <cstdio> names presents an 820interesting problem. It is possible to define an inline std::fprintf() 821implemented in terms of the 'extern "C"' vfprintf(), but there is no 822standard vfscanf() to use to implement std::fscanf(). It appears that 823vfscanf but be re-implemented in C++ for targets where no vfscanf 824extension has been defined. This is interesting in that it seems 825to be the only significant case in the C library where this kind of 826rewriting is necessary. (Of course Glibc provides the vfscanf() 827extension.) (The functions related to exit() must be rewritten 828for other reasons.) 829 830 831Annex D 832------- 833Headers: <strstream> 834 835Annex D defines many non-library features, and many minor 836modifications to various headers, and a complete header. 837It is "mostly done", except that the libstdc++-2 <strstream> 838header has not been adopted into the library, or checked to 839verify that it matches the draft in those details that were 840clarified by the committee. Certainly it must at least be 841moved into the std namespace. 842 843We still need to wrap all the deprecated features in #if guards 844so that pedantic compile modes can detect their use. 845 846Nonstandard Extensions 847---------------------- 848Headers: <iostream.h> <strstream.h> <hash> <rbtree> 849 <pthread_alloc> <stdiobuf> (etc.) 850 851User code has come to depend on a variety of nonstandard components 852that we must not omit. Much of this code can be adopted from 853libstdc++-v2 or from the SGI STL. This particularly includes 854<iostream.h>, <strstream.h>, and various SGI extensions such 855as <hash_map.h>. Many of these are already placed in the 856subdirectories ext/ and backward/. (Note that it is better to 857include them via "<backward/hash_map.h>" or "<ext/hash_map>" than 858to search the subdirectory itself via a "-I" directive. 859 860