Standard C++ Library Design Document
------------------------------------

This is an overview of libstdc++-v3, with particular attention
to projects to be done and how they fit into the whole.

The Library
-----------

This paper covers two major areas:

 - Features and policies not mentioned in the standard that
   the quality of the library implementation depends on, including
   extensions and "implementation-defined" features;

 - Plans for required but unimplemented library features and
   optimizations to them.

Overhead
--------

The standard defines a large library, much larger than the standard
C library. A naive implementation would suffer substantial overhead
in compile time, executable size, and speed, rendering it unusable
in many (particularly embedded) applications. The alternative demands
care in construction, and some compiler support, but there is no
need for library subsets.

What are the sources of this overhead? There are four main causes:

 - The library is specified almost entirely as templates, which
   with current compilers must be included in-line, resulting in
   very slow builds as tens or hundreds of thousands of lines
   of function definitions are read for each user source file.
   Indeed, the entire SGI STL, as well as the dos Reis valarray,
   are provided purely as header files, largely for simplicity in
   porting. Iostream/locale is (or will be) as large again.

 - The library is very flexible, specifying a multitude of hooks
   where users can insert their own code in place of defaults.
   When these hooks are not used, any time and code expended to
   support that flexibility is wasted.

 - Templates are often described as causing "code bloat". In
   practice, this refers (when it refers to anything real) to several
   independent processes. First, when a class template is manually
   instantiated in its entirety, current compilers place the definitions
   for all members in a single object file, so that a program linking
   to one member gets definitions of all. Second, template functions
   which do not actually depend on the template argument are, under
   current compilers, generated anew for each instantiation, rather
   than being shared with other instantiations. Third, some of the
   flexibility mentioned above comes from virtual functions (both in
   regular classes and template classes) which current linkers add
   to the executable file even when they manifestly cannot be called.

 - The library is specified to use a language feature, exceptions,
   which in the current gcc compiler ABI imposes a run time and
   code space cost to handle the possibility of exceptions even when
   they are not used.
   Under the new ABI (accessed with -fnew-abi),
   there is a space overhead and a small reduction in code efficiency
   resulting from optimization opportunities lost to the non-local
   branches that exceptions introduce.

What can be done to eliminate this overhead? A variety of coding
techniques, and compiler, linker and library improvements and
extensions may be used, as covered below. Most are not difficult,
and some are already implemented in varying degrees.

Overhead: Compilation Time
--------------------------

Providing "ready-instantiated" template code in object code archives
allows us to avoid generating and optimizing template instantiations
in each compilation unit which uses them. However, the number of such
instantiations that are useful to provide is limited, and anyway this
is not enough, by itself, to minimize compilation time. In particular,
it does not reduce time spent parsing conforming headers.

Quicker header parsing will depend on library extensions and compiler
improvements. One approach is some variation on the techniques
previously marketed as "pre-compiled headers", now standardized as
support for the "export" keyword. "Exported" template definitions
can be placed (once) in a "repository" -- really just a library, but
of template definitions rather than object code -- to be drawn upon
at link time when an instantiation is needed, rather than placed in
header files to be parsed along with every compilation unit.

Until "export" is implemented we can put some of the lengthy template
definitions in #if guards or alternative headers so that users can skip
over the full definitions when they need only the ready-instantiated
specializations.

To be precise, this means that certain headers which define
templates which users normally use only for certain arguments
can be instrumented to avoid exposing the template definitions
to the compiler unless a macro is defined. For example, in
<string>, we might have:

    template <class _CharT, ... > class basic_string {
      ... // member declarations
    };
    ... // operator declarations

    #ifdef _STRICT_ISO_
    # if _G_NO_TEMPLATE_EXPORT
    #   include <bits/std_locale.h>  // headers needed by definitions
    #   ...
    #   include <bits/string.tcc>  // member and global template definitions.
    # endif
    #endif

Users who compile without specifying a strict-ISO-conforming flag
would not see many of the template definitions they now see, and rely
instead on ready-instantiated specializations in the library. This
technique would be useful for the following substantial components:
string, locale/iostreams, valarray. It would *not* be useful or
usable with the following: containers, algorithms, iterators,
allocator. Since these constitute a large (though decreasing)
fraction of the library, the benefit the technique offers is
limited.
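
The "ready-instantiated specializations" relied on above can be produced
with explicit instantiation, which makes the compiler emit every member
of one specialization into a single object file. What follows is only a
minimal single-file sketch: the class small_buffer and its members are
hypothetical and are not part of libstdc++. In a real layout the class
body would sit in the header, the member definitions in a ".tcc" file,
and the explicit instantiations in one library source file.

```cpp
#include <cassert>
#include <cstddef>

// --- what would be the header: declarations only ---
template <class CharT>
class small_buffer {                 // hypothetical illustration class
public:
    small_buffer() : len_(0) {}
    void push(CharT c);
    std::size_t size() const;
private:
    CharT data_[16];
    std::size_t len_;
};

// --- what would be the .tcc file: member definitions ---
template <class CharT>
void small_buffer<CharT>::push(CharT c) {
    assert(len_ < 16);               // fixed capacity in this sketch
    data_[len_++] = c;
}

template <class CharT>
std::size_t small_buffer<CharT>::size() const { return len_; }

// --- what would be one library source file ---
// Explicit instantiation definitions: all members are generated here,
// once, for exactly these template arguments.
template class small_buffer<char>;
template class small_buffer<wchar_t>;
```

A user translation unit that sees only the declarations can then use
small_buffer<char> and small_buffer<wchar_t> and link against the
pre-built members; any other argument type would still need the full
definitions.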

The language specifies the semantics of the "export" keyword, but
the gcc compiler does not yet support it. When it does, problems
with large template inclusions can largely disappear, given some
minor library reorganization, along with the need for the apparatus
described above.

Overhead: Flexibility Cost
--------------------------

The library offers many places where users can specify operations
to be performed by the library in place of defaults. Sometimes
this seems to require that the library use a more-roundabout, and
possibly slower, way to accomplish the default requirements than
would be used otherwise.

The primary protection against this overhead is thorough compiler
optimization, to crush out layers of inline function interfaces.
Kuck & Associates has demonstrated the practicality of this kind
of optimization.

The second line of defense against this overhead is explicit
specialization. By defining helper function templates, and writing
specialized code for the default case, overhead can be eliminated
for that case without sacrificing flexibility. This takes full
advantage of any ability of the optimizer to crush out degenerate
code.

The library specifies many virtual functions which current linkers
load even when they cannot be called.
Some minor improvements to the
compiler and to ld would eliminate any such overhead by simply
omitting virtual functions that the complete program does not call.
A prototype of this work has already been done. For targets where
GNU ld is not used, a "pre-linker" could do the same job.

The main areas in the standard interface where user flexibility
can result in overhead are:

 - Allocators: Containers are specified to use user-definable
   allocator types and objects, making tuning for the container
   characteristics tricky.

 - Locales: the standard specifies locale objects used to implement
   iostream operations, involving many virtual functions which use
   streambuf iterators.

 - Algorithms and containers: these may be instantiated on any type,
   frequently duplicating code for identical operations.

 - Iostreams and strings: users are permitted to use these on their
   own types, and specify the operations the stream must use on these
   types.

Note that these sources of overhead are _avoidable_. The techniques
to avoid them are covered below.

Code Bloat
----------

In the SGI STL, and in some other headers, many of the templates
are defined "inline" -- either explicitly or by their placement
in class definitions -- which should not be inline. This is a
source of code bloat.
Matt had remarked that he was relying on
the compiler to recognize what was too big to benefit from inlining,
and generate it out-of-line automatically. However, this also can
result in code bloat except where the linker can eliminate the extra
copies.

Fixing these cases will require an audit of all inline functions
defined in the library to determine which merit inlining, and moving
the rest out of line. This is an issue mainly in chapters 23, 25, and
27. Of course it can be done incrementally, and we should generally
accept patches that move large functions out of line and into ".tcc"
files, which can later be pulled into a repository. Compiler/linker
improvements to recognize very large inline functions and move them
out-of-line, but shared among compilation units, could make this
work unnecessary.

Pre-instantiating template specializations currently produces large
amounts of dead code which bloats statically linked programs. The
current state of the static library, libstdc++.a, is intolerable on
this account, and will fuel further confused speculation about a need
for a library "subset". A compiler improvement that treats each
instantiated function as a separate object file, for linking purposes,
would be one solution to this problem.
An alternative would be to
split up the manual instantiation files into dozens upon dozens of
little files, each compiled separately, but an abortive attempt at
this was done for <string> and, though it is far from complete, it
is already a nuisance. A better interim solution (just until we have
"export") is badly needed.

When building a shared library, the current compiler/linker cannot
automatically generate the instantiations needed. This creates a
miserable situation; it means any time something is changed in the
library, before a shared library can be built someone must manually
copy the declarations of all templates that are needed by other parts
of the library to an "instantiation" file, and add it to the build
system to be compiled and linked to the library. This process is
readily automated, and should be automated as soon as possible.
Users building their own shared libraries experience identical
frustrations.

Sharing common aspects of template definitions among instantiations
can radically reduce code bloat. The compiler could help a great
deal here by recognizing when a function depends on nothing about
a template parameter, or only on its size, and giving the resulting
function a link-name "equate" that allows it to be shared with other
instantiations. Implementation code could take advantage of the
capability by factoring out code that does not depend on the template
argument into separate functions to be merged by the compiler.
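
Factoring of this kind can also be done by hand. The sketch below keeps
all storage handling in a non-template base class that works on void*,
so it is compiled once, and leaves only inlinable casts in the template.
The names ptr_stack_base and ptr_stack are hypothetical, and the fixed
capacity is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

// Non-template base: its member functions exist once in the program,
// no matter how many element types the wrapper is instantiated on.
class ptr_stack_base {
public:
    std::size_t size() const { return size_; }
protected:
    ptr_stack_base() : size_(0) {}
    void push_raw(void* p);
    void* pop_raw();
private:
    static const std::size_t capacity = 32;
    void* slots_[capacity];
    std::size_t size_;
};

// Defined out of line; in a library these would live in one source file.
void ptr_stack_base::push_raw(void* p) {
    assert(size_ < capacity);
    slots_[size_++] = p;
}

void* ptr_stack_base::pop_raw() {
    assert(size_ > 0);
    return slots_[--size_];
}

// Thin typed wrapper: only the pointer conversions depend on T.
template <class T>
class ptr_stack : public ptr_stack_base {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
};
```

Instantiating ptr_stack<int> and ptr_stack<double> adds only the two
trivial forwarding members per type; push_raw and pop_raw are shared.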

Until such a compiler optimization is implemented, much can be done
manually (if tediously) in this direction. One such optimization is
to derive class templates from non-template classes, and move as much
implementation as possible into the base class. Another is to
partial-specialize certain common instantiations, such as vector<T*>,
to share code for instantiations on all types T. While these techniques
work, they are far from the complete solution that a compiler
improvement would afford.

Overhead: Expensive Language Features
-------------------------------------

The main "expensive" language feature used in the standard library
is exception support, which requires compiling in cleanup code with
static table data to locate it, and linking in library code to use
the table. For small embedded programs the amount of such library
code and table data is assumed by some to be excessive. Under the
"new" ABI this perception is generally exaggerated, although in some
cases it may actually be excessive.

To implement a library which does not use exceptions directly is
not difficult given minor compiler support (to "turn off" exceptions
and ignore exception constructs), and results in no great library
maintenance difficulties. To be precise, given "-fno-exceptions",
the compiler should treat "try" blocks as ordinary blocks, and
"catch" blocks as dead code to ignore or eliminate.
Compiler
support is not strictly necessary, except in the case of "function
try blocks"; otherwise the following macros almost suffice:

    #define throw(X)
    #define try      if (true)
    #define catch(X) else if (false)

However, there may be a need to use function try blocks in the
library implementation, and use of macros in this way can make
correct diagnostics impossible. Furthermore, use of this scheme
would require the library to call a function to re-throw exceptions
from a try block. Implementing the above semantics in the compiler
is preferable.

Given the support above (however implemented) it only remains to
replace code that "throws" with a call to a well-documented "handler"
function in a separate compilation unit which may be replaced by
the user. The main source of exceptions that would be difficult
for users to avoid is memory allocation failures, but users can
define their own memory allocation primitives that never throw.
Otherwise, the complete list of such handlers, and which library
functions may call them, would be needed for users to be able to
implement the necessary substitutes. (Fortunately, they have the
source code.)

Opportunities
-------------

The template capabilities of C++ offer enormous opportunities for
optimizing common library operations, well beyond what would be
considered "eliminating overhead".
In particular, many operations
done in Glibc with macros that depend on proprietary language
extensions can be implemented in pristine Standard C++. For example,
the chapter 25 algorithms, and even C library functions such as strchr,
can be specialized for the case of static arrays of known (small) size.

Detailed optimization opportunities are identified below where
the component where they would appear is discussed. Of course new
opportunities will be identified during implementation.

Unimplemented Required Library Features
---------------------------------------

The standard specifies hundreds of components, grouped broadly by
chapter. These are listed in excruciating detail in the CHECKLIST
file.

    17 general
    18 support
    19 diagnostics
    20 utilities
    21 string
    22 locale
    23 containers
    24 iterators
    25 algorithms
    26 numerics
    27 iostreams
    Annex D  backward compatibility

Anyone participating in implementation of the library should obtain
a copy of the standard, ISO 14882. People in the U.S. can obtain an
electronic copy for US$18 from ANSI's web site. Those from other
countries should visit http://www.iso.ch/ to find out the location
of their country's representation in ISO, in order to know who can
sell them a copy.

The emphasis in the following sections is on unimplemented features
and optimization opportunities.

Chapter 17  General
-------------------

Chapter 17 concerns overall library requirements.

The standard doesn't mention threads. A multi-thread (MT) extension
primarily affects operators new and delete (18), allocator (20),
string (21), locale (22), and iostreams (27). The common underlying
support needed for this is discussed under chapter 20.

The standard requirements on names from the C headers create a
lot of work, mostly done. Names in the C headers must be visible
in the std:: and sometimes the global namespace; the names in the
two scopes must refer to the same object. More stringent is that
Koenig lookup implies that any types specified as defined in std::
really are defined in std::. Names optionally implemented as
macros in C cannot be macros in C++. (An overview may be read at
<http://www.cantrip.org/cheaders.html>). The scripts "inclosure"
and "mkcshadow", and the directories shadow/ and cshadow/, are the
beginning of an effort to conform in this area.
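
The "same object" requirement can be probed from user code. The sketch
below is an illustration only, written against the usual arrangement in
which both the <cname> and <name.h> forms of a header are included; the
helper names same_strlen and same_tm are hypothetical. It checks that
std::strlen and ::strlen are one function and that std::tm and ::tm are
one type.

```cpp
#include <cassert>
#include <cstring>    // declares std::strlen
#include <string.h>   // declares ::strlen
#include <ctime>      // declares std::tm
#include <time.h>     // declares ::tm

// Both qualified names are expected to designate the same function,
// so the two function pointers should compare equal.
inline bool same_strlen() {
    std::size_t (*in_std)(const char*) = &std::strlen;
    std::size_t (*in_global)(const char*) = &::strlen;
    return in_std == in_global;
}

// ... and the same type: the initialization of p compiles only if
// std::tm and ::tm are one and the same struct.
inline bool same_tm() {
    std::tm t = std::tm();
    ::tm* p = &t;
    p->tm_year = 100;
    return t.tm_year == 100;
}
```

If the library instead defined a distinct struct tm inside std::, the
line initializing p would fail to compile, which is exactly the linking
hazard with third-party C headers that this chapter goes on to discuss.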

A correct conforming definition of C header names based on underlying
C library headers, and practical linking of conforming namespaced
customer code with third-party C libraries, depend ultimately on
an ABI change, allowing namespaced C type names to be mangled into
type names as if they were global, somewhat as C function names in a
namespace, or C++ global variable names, are left unmangled. Perhaps
another "extern" mode, such as 'extern "C-global"' would be an
appropriate place for such type definitions. Such a type would
affect mangling as follows:

    namespace A {
      struct X {};
      extern "C-global" {  // or maybe just 'extern "C"'
        struct Y {};
      };
    }
    void f(A::X*);  // mangles to f__FPQ21A1X
    void f(A::Y*);  // mangles to f__FP1Y

(It may be that this is really the appropriate semantics for regular
'extern "C"', and 'extern "C-global"', as an extension, would not be
necessary.) This would allow functions declared in non-standard C headers
(and thus fixable by neither us nor users) to link properly with functions
declared using C types defined in properly-namespaced headers.
The
problem this solves is that C headers (which C++ programmers do persist
in using) frequently forward-declare C struct tags without including
the header where the type is defined, as in

    struct tm;
    void munge(tm*);

Without some compiler accommodation, munge cannot be called by correct
C++ code using a correctly-scoped tm* value.

The current C headers use the preprocessor extension "#include_next",
which the compiler complains about when run "-pedantic".
(Incidentally, it appears that "-fpedantic" is currently ignored,
probably a bug.) The solution in the C compiler is to use
"-isystem" rather than "-I", but unfortunately in g++ this seems
also to wrap the whole header in an 'extern "C"' block, so it's
unusable for C++ headers. The correct solution appears to be to
allow the various special include-directory options, if not given
an argument, to affect subsequent include-directory options additively,
so that if one said

    -pedantic -iprefix $(prefix) \
    -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
    -iwithprefix -I g++-v3/ext

the compiler would search $(prefix)/g++-v3 and not report
pedantic warnings for files found there, but treat files in
$(prefix)/g++-v3/ext pedantically. (The undocumented semantics
of "-isystem" in g++ stink. Can they be rescinded? If not it
must be replaced with something more rationally behaved.)

All the C headers need the treatment above; in the standard these
headers are mentioned in various chapters. Below, I have only
mentioned those that present interesting implementation issues.

The components identified as "mostly complete", below, have not been
audited for conformance. In many cases where the library passes
conformance tests we have non-conforming extensions that must be
wrapped in #if guards for "pedantic" use, and in some cases renamed
in a conforming way for continued use in the implementation regardless
of conformance flags.

The STL portion of the library still depends on a header
stl/bits/stl_config.h full of #ifdef clauses. This apparatus
should be replaced with autoconf/automake machinery.

The SGI STL defines a type_traits<> template, specialized for
many types in their code including the built-in numeric and
pointer types and some library types, to direct optimizations of
standard functions. The SGI compiler has been extended to generate
specializations of this template automatically for user types,
so that use of STL templates on user types can take advantage of
these optimizations. Specializations for other, non-STL, types
would make more optimizations possible, but extending the gcc
compiler in the same way would be much better. The next round of
standardization will probably ratify this, though likely with
changes, so it should probably be renamed to place it in the
implementation namespace.
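
To make the idea of traits-directed optimization concrete, here is a
minimal sketch in the spirit of the type_traits<> approach described
above. It is not the SGI template: the trait has_trivial_copy, the tag
types, and the function copy_elements are hypothetical names, and the
set of specializations is deliberately tiny.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct true_tag {};
struct false_tag {};

// Hypothetical trait: defaults to "not known to be bitwise-copyable",
// then is specialized for a few built-in types and for all pointers.
template <class T> struct has_trivial_copy { typedef false_tag type; };
template <> struct has_trivial_copy<char>   { typedef true_tag type; };
template <> struct has_trivial_copy<int>    { typedef true_tag type; };
template <> struct has_trivial_copy<double> { typedef true_tag type; };
template <class T> struct has_trivial_copy<T*> { typedef true_tag type; };

// General path: element-by-element assignment.
template <class T>
T* copy_impl(const T* src, std::size_t n, T* dst, false_tag) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
    return dst + n;
}

// Fast path: one block move for types the trait vouches for.
template <class T>
T* copy_impl(const T* src, std::size_t n, T* dst, true_tag) {
    if (n != 0) std::memmove(dst, src, n * sizeof(T));
    return dst + n;
}

// Public entry point: the trait selects the overload at compile time.
template <class T>
T* copy_elements(const T* src, std::size_t n, T* dst) {
    return copy_impl(src, n, dst, typename has_trivial_copy<T>::type());
}
```

A user type gets the fast path only if someone specializes the trait
for it, which is why having the compiler generate those specializations
automatically is so attractive.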

The SGI STL also defines a large number of extensions visible in
standard headers. (Other extensions that appear in separate headers
have been sequestered in subdirectories ext/ and backward/.) All
these extensions should be moved to other headers where possible,
and in any case wrapped in a namespace (not std!), and (where kept
in a standard header) girded about with macro guards. Some cannot be
moved out of standard headers because they are used to implement
standard features. The canonical method for accommodating these
is to use a protected name, aliased in macro guards to a user-space
name. Unfortunately C++ offers no satisfactory template typedef
mechanism, so very ad-hoc and unsatisfactory aliasing must be used
instead.

Implementation of a template typedef mechanism should have the highest
priority among possible extensions, on the same level as implementation
of the template "export" feature.

Chapter 18  Language support
----------------------------

Headers: <limits> <new> <typeinfo> <exception>
C headers: <cstddef> <climits> <cfloat>  <cstdarg> <csetjmp>
           <ctime>   <csignal> <cstdlib> (also 21, 25, 26)

This defines the built-in exceptions, rtti, numeric_limits<>,
operator new and delete. Much of this is provided by the
compiler in its static runtime library.

Work to do includes defining numeric_limits<> specializations in
separate files for all target architectures.
Values for integer types
except for bool and wchar_t are readily obtained from the C header
<limits.h>, but values for the remaining numeric types (bool, wchar_t,
float, double, long double) must be entered manually. This is
largely dog work except for those members whose values are not
easily deduced from available documentation. Also, this involves
some work in target configuration to identify the correct choice of
file to build against and to install.

The definitions of the various operators new and delete must be
made thread-safe, which depends on a portable exclusion mechanism,
discussed under chapter 20. Of course there is always plenty of
room for improvements to the speed of operators new and delete.

<cstdarg>, in Glibc, defines some macros that gcc does not allow to
be wrapped into an inline function. Probably this header will demand
attention whenever a new target is chosen. The functions atexit(),
exit(), and abort() in cstdlib have different semantics in C++, so
must be re-implemented for C++.

Chapter 19  Diagnostics
-----------------------

Headers: <stdexcept>
C headers: <cassert> <cerrno>

This defines the standard exception objects, which are "mostly complete".
Cygnus has a version, and now SGI provides a slightly different one.
It makes little difference which we use.

The C global name "errno", which C allows to be a variable or a macro,
is required in C++ to be a macro. For MT it must typically result in
a function call.

Chapter 20  Utilities
---------------------
Headers: <utility> <functional> <memory>
C header: <ctime> (also in 18)

SGI STL provides "mostly complete" versions of all the components
defined in this chapter. However, the auto_ptr<> implementation
is known to be wrong. Furthermore, the standard definition of it
is known to be unimplementable as written. A minor change to the
standard would fix it, and auto_ptr<> should be adjusted to match.

Multi-threading affects the allocator implementation, and there must
be configuration/installation choices for different users' MT
requirements. Anyway, users will want to tune allocator options
to support different target conditions, MT or no.

The primitives used for MT implementation should be exposed, as an
extension, for users' own work. We need cross-CPU "mutex" support,
multi-processor shared-memory atomic integer operations, and
single-processor uninterruptible integer operations, and all three
configurable to be stubbed out for non-MT use, or to use an
appropriately-loaded dynamic library for the actual runtime
environment, or statically compiled in for cases where the target
architecture is known.
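
One way to picture "configurable to be stubbed out" is a policy
parameter that supplies the integer operations. The sketch below is an
assumption-laden illustration rather than the library's design: it
leans on std::atomic from later C++ standards, which did not exist when
this document was written, and the names mt_policy, st_policy and
ref_count are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Multi-threaded policy: real atomic read-modify-write operations.
struct mt_policy {
    typedef std::atomic<long> counter_type;
    static long increment(counter_type& c) { return c.fetch_add(1) + 1; }
    static long decrement(counter_type& c) { return c.fetch_sub(1) - 1; }
    static long load(const counter_type& c) { return c.load(); }
};

// Single-threaded policy: same interface, stubbed to plain arithmetic.
struct st_policy {
    typedef long counter_type;
    static long increment(counter_type& c) { return ++c; }
    static long decrement(counter_type& c) { return --c; }
    static long load(const counter_type& c) { return c; }
};

// A reference count of the kind a shared string representation carries.
template <class Policy>
class ref_count {
public:
    ref_count() : n_(1) {}
    void acquire() { Policy::increment(n_); }
    // Returns true when the last owner has released.
    bool release() { return Policy::decrement(n_) == 0; }
    long use_count() const { return Policy::load(n_); }
private:
    typename Policy::counter_type n_;
};
```

Client code is written once against ref_count<Policy>; a configuration
header would pick mt_policy or st_policy for the target.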

Chapter 21  String
------------------
Headers: <string>
C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
           <cstdlib> (also in 18, 25, 26)

We have "mostly-complete" char_traits<> implementations. Many of the
char_traits<char> operations might be optimized further using existing
proprietary language extensions.

We have a "mostly-complete" basic_string<> implementation. The work
to manually instantiate char and wchar_t specializations in object
files to improve link-time behavior is extremely unsatisfactory,
literally tripling library-build time with no commensurate improvement
in static program link sizes. It must be redone. (Similar work is
needed for some components in chapters 22 and 27.)

Other work needed for strings is MT-safety, as discussed under the
chapter 20 heading.

The standard C type mbstate_t, declared in <cwchar> and used in
char_traits<>, must be different in C++ than in C, because in C++ the
default constructor value mbstate_t() must be the "base" or "ground"
sequence state.
(According to the likely resolution of a recently raised Core issue,
this may become unnecessary. However, there are other reasons to
use a state type not as limited as whatever the C library provides.)
If we want to provide conversions from (e.g.)
internally-represented EUC-wide to externally-represented Unicode, or
vice-versa, the mbstate_t we choose will need to be more accommodating
than what might be provided by an underlying C library.

There remain some basic_string template-member functions which do
not overload properly with their non-template brethren. The infamous
hack akin to what was done in vector<> is needed, to conform to
23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
or incomplete, are so marked for this reason.

Replacing the string iterators, which currently are simple character
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked. The current use of
raw pointers as string iterators is evil. vector<> iterators need the
same treatment. Note that the current implementation freely mixes
pointers and iterators, and that must be fixed before safer iterators
can be introduced.

Some of the functions in <cstring> differ from the C versions: they
are generally overloaded on const and non-const argument pointers. For
example, in <cstring> strchr is overloaded. The functions isupper
etc. in <cctype>, typically implemented as macros in C, are functions
in C++, because they are overloaded with others of the same name
defined in <locale>.
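
Both points can be seen in a short sketch. The helper names
first_slash, upper_c, and upper_loc are hypothetical and exist only
for illustration; the calls to std::strchr and std::isupper are the
standard ones. The two-argument isupper from <locale> could not be
called at all if isupper were a function-like macro taking one
argument.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <locale>

// Const pointer in, const pointer out: the result cannot be written through.
inline const char* first_slash(const char* path) {
    assert(path != nullptr);
    return std::strchr(path, '/');
}

// Non-const pointer in, non-const pointer out.
inline char* first_slash(char* path) {
    assert(path != nullptr);
    return std::strchr(path, '/');
}

// The <cctype> function: one argument, uses the global C locale.
inline bool upper_c(char c) {
    return std::isupper(static_cast<unsigned char>(c)) != 0;
}

// The <locale> overload of the same name: character plus locale object.
inline bool upper_loc(char c, const std::locale& loc) {
    return std::isupper(c, loc);
}
```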

Many of the functions required in <cwctype> and <cwchar> cannot be
implemented using underlying C facilities on intended targets because
such facilities only partly exist.

Chapter 22  Locale
------------------
Headers: <locale>
C headers: <clocale>

We have a "mostly complete" class locale, with the exception of
code for constructing, and handling the names of, named locales.
The ways that locales are named (particularly when categories
(e.g. LC_TIME, LC_COLLATE) are different) vary among target
environments. This code must be written in various versions and
chosen by configuration parameters.

Members of many of the facets defined in <locale> are stubs. Generally,
there are two sets of facets: the base class facets (which are supposed
to implement the "C" locale) and the "byname" facets, which are supposed
to read files to determine their behavior. The base ctype<>, collate<>,
and numpunct<> facets are "mostly complete", except that the table of
bitmask values used for "is" operations, and corresponding mask values,
are still defined in libio and just included/linked. (We will need to
implement these tables independently, soon, but should take advantage
of libio where possible.) The num_put<>::put members for integer types
are "mostly complete".

A complete list of what has and has not been implemented may be
found in CHECKLIST.
However, note that the current definition of
codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
out the raw bytes representing the wide characters, rather than
trying to convert each to a corresponding single "char" value.

Some of the facets are more important than others. Specifically,
the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
are used by other library facilities defined in <string>, <istream>,
and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
in <fstream>, so a conforming iostream implementation depends on
these.

The "long long" type eventually must be supported, but code mentioning
it should be wrapped in #if guards to allow pedantic-mode compiling.

Performance of num_put<> and num_get<> depends critically on
caching computed values in ios_base objects, and on extensions
to the interface with streambufs.

Specifically: retrieving a copy of the locale object, extracting
the needed facets, and gathering data from them, for each call to
(e.g.) operator<< would be prohibitively slow. To cache format
data for use by num_put<> and num_get<> we have a _Format_cache<>
object stored in the ios_base::pword() array. This is constructed
and initialized lazily, and is organized purely for utility. It
is discarded when a new locale with different facets is imbued.
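
The caching scheme just described can be approximated using only the
public ios_base hooks xalloc(), iword(), pword(), and
register_callback(). The following is a sketch of the idea, not the
library's _Format_cache<>; the names format_cache, cache_index,
cache_callback, get_cache, and cache_is_reused are hypothetical, and
for simplicity the record is discarded on every imbue rather than only
when the facets differ.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical cache record holding data gathered from numpunct<char>.
struct format_cache {
    char decimal_point;
    char thousands_sep;
};

// One iword/pword index shared by all streams, allocated on first use.
inline int cache_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Discard the record when the locale changes or the stream goes away.
inline void cache_callback(std::ios_base::event ev, std::ios_base& ios, int index) {
    void*& slot = ios.pword(index);
    if (ev != std::ios_base::copyfmt_event) {
        // imbue_event or erase_event: this stream owns the record.
        delete static_cast<format_cache*>(slot);
    }
    // After copyfmt() the slot aliases the source stream's record,
    // so it is dropped without being deleted.
    slot = nullptr;
}

// Build the record lazily; later calls reuse it until the next imbue().
inline const format_cache& get_cache(std::ios_base& ios) {
    const int index = cache_index();
    if (ios.iword(index) == 0) {          // callback not yet registered
        ios.register_callback(cache_callback, index);
        ios.iword(index) = 1;
    }
    void*& slot = ios.pword(index);
    if (slot == nullptr) {
        const std::locale loc = ios.getloc();
        const std::numpunct<char>& np = std::use_facet<std::numpunct<char> >(loc);
        slot = new format_cache{np.decimal_point(), np.thousands_sep()};
    }
    assert(slot != nullptr);
    return *static_cast<format_cache*>(slot);
}

// Usage: the second lookup returns the record built by the first.
inline bool cache_is_reused() {
    std::ostringstream os;
    const format_cache* first = &get_cache(os);
    return first == &get_cache(os);
}
```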

Using only the public interfaces of the iterator arguments to the
facet functions would limit performance by forbidding "vector-style"
character operations. The streambuf iterator optimizations are
described under chapter 24, but facets can also bypass the streambuf
iterators via explicit specializations and operate directly on the
streambufs, and use extended interfaces to get direct access to the
streambuf internal buffer arrays. These extensions are mentioned
under chapter 27. These optimizations are particularly important
for input parsing.

Unused virtual members of locale facets can be omitted, as mentioned
above, by a smart linker.

Chapter 23  Containers
----------------------
Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>

All the components in chapter 23 are implemented in the SGI STL.
They are "mostly complete"; they include a large number of
nonconforming extensions which must be wrapped. Some of these
are used internally and must be renamed or duplicated.

The SGI components are optimized for large-memory environments. For
embedded targets, different criteria might be more appropriate. Users
will want to be able to tune this behavior. We should provide
ways for users to compile the library with different memory usage
characteristics.

A lot more work is needed on factoring out common code from different
specializations to reduce code size here and in chapter 25. The
easiest fix for this would be a compiler/ABI improvement that allows
the compiler to recognize when a specialization depends only on the
size (or other gross quality) of a template argument, and allows the
linker to share the code with similar specializations. In its
absence, many of the algorithms and containers can be partially
specialized, at least for the case of pointers, but this only solves
a small part of the problem. Use of a type_traits-style template
allows a few more optimization opportunities, more if the compiler
can generate the specializations automatically.

As an optimization, containers can specialize on the default allocator
and bypass it, or take advantage of details of its implementation
after it has been improved upon.

Replacing the vector iterators, which currently are simple element
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked. The current use of
pointers for iterators is evil.

As discussed under chapter 24, the deque iterator is a good example of
an opportunity to implement a "staged" iterator that would benefit
from specializations of some algorithms.
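
The partial specialization for pointers suggested earlier in this
chapter might look like the following sketch. The names
ptr_stack_base and simple_stack are hypothetical and far simpler than
any real container; the point is only that every simple_stack<T*>
shares one non-template core operating on void*, leaving thin inline
wrappers per type. For brevity the sketch assumes the pointee type is
not const- or volatile-qualified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-template core: all pointer instantiations share this code.
class ptr_stack_base {
public:
    std::size_t size() const { return items_.size(); }

protected:
    void push_raw(void* p) { items_.push_back(p); }
    void* pop_raw() {
        assert(!items_.empty());
        void* p = items_.back();
        items_.pop_back();
        return p;
    }

private:
    std::vector<void*> items_;
};

// Primary template: the general case, instantiated once per element type.
template <class T>
class simple_stack {
public:
    void push(const T& v) { items_.push_back(v); }
    T pop() {
        assert(!items_.empty());
        T v = items_.back();
        items_.pop_back();
        return v;
    }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// Partial specialization for pointers: a typed wrapper over the shared core.
template <class T>
class simple_stack<T*> : private ptr_stack_base {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
    using ptr_stack_base::size;
};
```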

Chapter 24  Iterators
---------------------
Headers: <iterator>

Standard iterators are "mostly complete", with the exception of
the stream iterators, which are not yet templatized on the
stream type. Also, the base class template iterator<> appears
to be wrong, so everything derived from it must also be wrong,
currently.

The streambuf iterators (currently located in stl/bits/std_iterator.h,
though they belong under bits/) can be rewritten to take advantage of
friendship with the streambuf implementation.

Matt Austern has identified opportunities where certain iterator
types, particularly including streambuf iterators and deque
iterators, have a "two-stage" quality, such that an intermediate
limit can be checked much more quickly than the true limit on
range operations. If identified with a member of iterator_traits,
algorithms may be specialized for this case. Of course the
iterators that have this quality can be identified by specializing
a traits class.

Many of the algorithms must be specialized for the streambuf
iterators, to take advantage of block-mode operations, so that
the performance of iostream/locale operations does not suffer.
It may be that they could be treated as staged iterators and
take advantage of those optimizations.
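
A minimal sketch of the "two-stage" idea follows, under loudly stated
assumptions: block_iterator is a hypothetical iterator over an array
of equal-sized character blocks (a crude stand-in for a deque or
streambuf iterator), is_staged is a hypothetical traits class, and
staged_find is a hypothetical front end. Within a block the search is
handed to memchr; only at block boundaries is the true limit consulted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

const std::size_t block_size = 4;  // deliberately tiny, for illustration

// Hypothetical staged iterator: a block pointer plus an offset in the block.
struct block_iterator {
    const char* const* block;
    std::size_t offset;

    char operator*() const { return (*block)[offset]; }
    block_iterator& operator++() {
        if (++offset == block_size) { ++block; offset = 0; }
        return *this;
    }
};
inline bool operator==(const block_iterator& a, const block_iterator& b) {
    return a.block == b.block && a.offset == b.offset;
}
inline bool operator!=(const block_iterator& a, const block_iterator& b) {
    return !(a == b);
}

// Traits class: iterators are not staged unless they say so.
template <class It> struct is_staged : std::false_type {};
template <> struct is_staged<block_iterator> : std::true_type {};

// Generic version: one comparison against the true limit per element.
template <class It>
It find_impl(It first, It last, char c, std::false_type) {
    while (first != last && *first != c) ++first;
    return first;
}

// Staged version: scan each block in one call, then step to the next block.
inline block_iterator find_impl(block_iterator first, block_iterator last,
                                char c, std::true_type) {
    assert(first.offset < block_size && last.offset < block_size);
    for (;;) {
        const bool final_block = (first.block == last.block);
        const std::size_t limit = final_block ? last.offset : block_size;
        if (limit > first.offset) {
            const char* base = *first.block;
            const void* hit = std::memchr(base + first.offset, c, limit - first.offset);
            if (hit != nullptr) {
                first.offset = static_cast<std::size_t>(
                    static_cast<const char*>(hit) - base);
                return first;
            }
        }
        if (final_block) return last;
        ++first.block;
        first.offset = 0;
    }
}

// Front end: dispatches on the traits class.
template <class It>
It staged_find(It first, It last, char c) {
    return find_impl(first, last, c, typename is_staged<It>::type());
}
```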

Chapter 25  Algorithms
----------------------
Headers: <algorithm>
C headers: <cstdlib> (also in 18, 21, 26)

The algorithms are "mostly complete". As mentioned above, they
are optimized for speed at the expense of code and data size.

Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.

The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.

Chapter 26  Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also in 18, 21, 25)

Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate.
It is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.

The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.

Chapter 27  Iostreams
---------------------
Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
         <iomanip> <sstream> <fstream>
C headers: <cstdio> <cwchar> (also in 21)

Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream>, and <fstream> have been
started; basic_filebuf<> "write" functions have been implemented just
enough to do "hello, world".

Most of the istream and ostream operators << and >> (with the exception
of the op<<(integer) ones) have not been changed to use locale primitives,
sentry objects, or char_traits members.

All these templates should be manually instantiated for char and
wchar_t in a way that links only used members into user programs.
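
One way to approach this, sketched here under the assumption of a
compiler that supports "extern template" (standard since C++11, and a
GNU extension before that), is to suppress implicit instantiation in
the header and explicitly instantiate members one at a time. The
template char_ops is a hypothetical stand-in for a library template.
In a real build each explicit instantiation would sit in its own
source file, so that a static link pulls in only the object files for
members actually referenced; here they share one translation unit
only so the sketch is self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a library template.
template <class CharT>
struct char_ops {
    static std::size_t length(const CharT* s);
    static std::size_t count(const CharT* s, CharT c);
};

template <class CharT>
std::size_t char_ops<CharT>::length(const CharT* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != CharT()) ++n;
    return n;
}

template <class CharT>
std::size_t char_ops<CharT>::count(const CharT* s, CharT c) {
    assert(s != nullptr);
    std::size_t n = 0;
    for (; *s != CharT(); ++s)
        if (*s == c) ++n;
    return n;
}

// Header side: user translation units must not instantiate these implicitly.
extern template struct char_ops<char>;
extern template struct char_ops<wchar_t>;

// Library side: one member per explicit instantiation (one per file in practice).
template std::size_t char_ops<char>::length(const char*);
template std::size_t char_ops<char>::count(const char*, char);
template std::size_t char_ops<wchar_t>::length(const wchar_t*);
template std::size_t char_ops<wchar_t>::count(const wchar_t*, wchar_t);
```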

Streambuf is fertile ground for optimization extensions. An extended
interface giving iterator access to its internal buffer would be very
useful for other library components.

Iostream operations (primarily operators << and >>) can take advantage
of the case where user code has not specified a locale, and bypass locale
operations entirely. The current implementation of op<</num_put<>::put,
for the integer types, demonstrates how they can cache encoding details
from the locale on each operation. There is much more room for
optimization in this area.

The definition of the relationship between the standard streams
cout et al. and stdout et al. requires something like a "stdiobuf".
The SGI solution of using double-indirection to actually use a
stdio FILE object for buffering is unsatisfactory, because it
interferes with peephole loop optimizations.

The <sstream> header work has begun. stringbuf can benefit from
friendship with basic_string<> and basic_string<>::_Rep to use
those objects directly as buffers, and avoid allocating and making
copies.

The basic_filebuf<> template is a complex beast. It is specified to
use the locale facet codecvt<> to translate characters between native
files and the locale character encoding. In general this involves
two buffers, one of "char" representing the file and another of
"char_type", for the stream, with codecvt<> translating.
The process
is complicated by the variable-length nature of the translation, and
the need to seek to corresponding places in the two representations.
For the case of basic_filebuf<char>, when no translation is needed,
a single buffer suffices. A specialized filebuf can be used to reduce
code space overhead when no locale has been imbued. Matt Austern's
work at SGI will be useful, perhaps directly as a source of code, or
at least as an example to draw on.

Filebuf, almost uniquely (cf. operator new), depends heavily on
underlying environmental facilities. In current releases iostream
depends fairly heavily on libio constant definitions, but it should
be made independent. It also depends on operating system primitives
for file operations. There is immense room for optimizations using
(e.g.) mmap for reading. The shadow/ directory wraps, besides the
standard C headers, the libio.h and unistd.h headers, for use mainly
by filebuf. These wrappings have not been completed, though there
is scaffolding in place.

The encapsulation of certain C header <cstdio> names presents an
interesting problem. It is possible to define an inline std::fprintf()
implemented in terms of the 'extern "C"' vfprintf(), but there is no
standard vfscanf() to use to implement std::fscanf(). It appears that
vfscanf must be re-implemented in C++ for targets where no vfscanf
extension has been defined.
This is interesting in that it seems
to be the only significant case in the C library where this kind of
rewriting is necessary. (Of course Glibc provides the vfscanf()
extension.) (The functions related to exit() must be rewritten
for other reasons.)


Annex D
-------
Headers: <strstream>

Annex D defines many non-library features, many minor
modifications to various headers, and one complete header.
It is "mostly done", except that the libstdc++-v2 <strstream>
header has not been adopted into the library, or checked to
verify that it matches the draft in those details that were
clarified by the committee. Certainly it must at least be
moved into the std namespace.

We still need to wrap all the deprecated features in #if guards
so that pedantic compile modes can detect their use.

Nonstandard Extensions
----------------------
Headers: <iostream.h> <strstream.h> <hash> <rbtree>
         <pthread_alloc> <stdiobuf> (etc.)

User code has come to depend on a variety of nonstandard components
that we must not omit. Much of this code can be adopted from
libstdc++-v2 or from the SGI STL. This particularly includes
<iostream.h>, <strstream.h>, and various SGI extensions such
as <hash_map.h>. Many of these are already placed in the
subdirectories ext/ and backward/.
(Note that it is better to
include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
to search the subdirectory itself via a "-I" directive.)