xref: /openbsd-src/gnu/gcc/libstdc++-v3/docs/html/17_intro/DESIGN (revision 404b540a9034ac75a6199ad1a32d1bbc7a0d4210)
1
2Standard C++ Library Design Document
3------------------------------------
4
5This is an overview of libstdc++-v3, with particular attention
6to projects to be done and how they fit into the whole.
7
8The Library
9-----------
10
11This paper is covers two major areas:
12
13 - Features and policies not mentioned in the standard that
14   the quality of the library implementation depends on, including
15   extensions and "implementation-defined" features;
16
17 - Plans for required but unimplemented library features and
18   optimizations to them.
19
20Overhead
21--------
22
23The standard defines a large library, much larger than the standard
24C library.  A naive implementation would suffer substantial overhead
25in compile time, executable size, and speed, rendering it unusable
26in many (particularly embedded) applications.  The alternative demands
27care in construction, and some compiler support, but there is no
28need for library subsets.
29
30What are the sources of this overhead?  There are four main causes:
31
32 - The library is specified almost entirely as templates, which
33   with current compilers must be included in-line, resulting in
34   very slow builds as tens or hundreds of thousands of lines
35   of function definitions are read for each user source file.
36   Indeed, the entire SGI STL, as well as the dos Reis valarray,
37   are provided purely as header files, largely for simplicity in
38   porting.  Iostream/locale is (or will be) as large again.
39
40 - The library is very flexible, specifying a multitude of hooks
41   where users can insert their own code in place of defaults.
42   When these hooks are not used, any time and code expended to
43   support that flexibility is wasted.
44
45 - Templates are often described as causing to "code bloat".  In
46   practice, this refers (when it refers to anything real) to several
47   independent processes.  First, when a class template is manually
48   instantiated in its entirely, current compilers place the definitions
49   for all members in a single object file, so that a program linking
50   to one member gets definitions of all.  Second, template functions
51   which do not actually depend on the template argument are, under
52   current compilers, generated anew for each instantiation, rather
53   than being shared with other instantiations.  Third, some of the
54   flexibility mentioned above comes from virtual functions (both in
55   regular classes and template classes) which current linkers add
56   to the executable file even when they manifestly cannot be called.
57
58 - The library is specified to use a language feature, exceptions,
59   which in the current gcc compiler ABI imposes a run time and
60   code space cost to handle the possibility of exceptions even when
61   they are not used.  Under the new ABI (accessed with -fnew-abi),
62   there is a space overhead and a small reduction in code efficiency
63   resulting from lost optimization opportunities associated with
64   non-local branches associated with exceptions.
65
66What can be done to eliminate this overhead?  A variety of coding
67techniques, and compiler, linker and library improvements and
68extensions may be used, as covered below.  Most are not difficult,
69and some are already implemented in varying degrees.
70
71Overhead: Compilation Time
72--------------------------
73
74Providing "ready-instantiated" template code in object code archives
75allows us to avoid generating and optimizing template instantiations
76in each compilation unit which uses them.  However, the number of such
77instantiations that are useful to provide is limited, and anyway this
78is not enough, by itself, to minimize compilation time.  In particular,
79it does not reduce time spent parsing conforming headers.
80
81Quicker header parsing will depend on library extensions and compiler
82improvements.   One approach is some variation on the techniques
83previously marketed as "pre-compiled headers", now standardized as
84support for the "export" keyword.  "Exported" template definitions
85can be placed (once) in a "repository" -- really just a library, but
86of template definitions rather than object code -- to be drawn upon
87at link time when an instantiation is needed, rather than placed in
88header files to be parsed along with every compilation unit.
89
90Until "export" is implemented we can put some of the lengthy template
91definitions in #if guards or alternative headers so that users can skip
92over the the full definitions when they need only the ready-instantiated
93specializations.
94
95To be precise, this means that certain headers which define
96templates which users normally use only for certain arguments
97can be instrumented to avoid exposing the template definitions
98to the compiler unless a macro is defined.  For example, in
99<string>, we might have:
100
101   template <class _CharT, ... > class basic_string {
102     ... // member declarations
103   };
104   ... // operator declarations
105
106   #ifdef _STRICT_ISO_
107   # if _G_NO_TEMPLATE_EXPORT
108   #   include <bits/std_locale.h>  // headers needed by definitions
109   #   ...
110   #   include <bits/string.tcc>  // member and global template definitions.
111   # endif
112   #endif
113
114Users who compile without specifying a strict-ISO-conforming flag
115would not see many of the template definitions they now see, and rely
116instead on ready-instantiated specializations in the library.  This
117technique would be useful for the following substantial components:
118string, locale/iostreams, valarray.  It would *not* be useful or
119usable with the following: containers, algorithms, iterators,
120allocator.  Since these constitute a large (though decreasing)
121fraction of the library, the benefit the technique offers is
122limited.
123
124The language specifies the semantics of the "export" keyword, but
125the gcc compiler does not yet support it.  When it does, problems
126with large template inclusions can largely disappear, given some
127minor library reorganization, along with the need for the apparatus
128described above.
129
130Overhead: Flexibility Cost
131--------------------------
132
133The library offers many places where users can specify operations
134to be performed by the library in place of defaults.  Sometimes
135this seems to require that the library use a more-roundabout, and
136possibly slower, way to accomplish the default requirements than
137would be used otherwise.
138
139The primary protection against this overhead is thorough compiler
140optimization, to crush out layers of inline function interfaces.
141Kuck & Associates has demonstrated the practicality of this kind
142of optimization.
143
144The second line of defense against this overhead is explicit
145specialization.  By defining helper function templates, and writing
146specialized code for the default case, overhead can be eliminated
147for that case without sacrificing flexibility.  This takes full
148advantage of any ability of the optimizer to crush out degenerate
149code.
150
151The library specifies many virtual functions which current linkers
152load even when they cannot be called.  Some minor improvements to the
153compiler and to ld would eliminate any such overhead by simply
154omitting virtual functions that the complete program does not call.
155A prototype of this work has already been done.  For targets where
156GNU ld is not used, a "pre-linker" could do the same job.
157
158The main areas in the standard interface where user flexibility
159can result in overhead are:
160
161 - Allocators:  Containers are specified to use user-definable
162   allocator types and objects, making tuning for the container
163   characteristics tricky.
164
165 - Locales: the standard specifies locale objects used to implement
166   iostream operations, involving many virtual functions which use
167   streambuf iterators.
168
169 - Algorithms and containers: these may be instantiated on any type,
170   frequently duplicating code for identical operations.
171
172 - Iostreams and strings: users are permitted to use these on their
173   own types, and specify the operations the stream must use on these
174   types.
175
176Note that these sources of overhead are _avoidable_.  The techniques
177to avoid them are covered below.
178
179Code Bloat
180----------
181
182In the SGI STL, and in some other headers, many of the templates
183are defined "inline" -- either explicitly or by their placement
184in class definitions -- which should not be inline.  This is a
185source of code bloat.  Matt had remarked that he was relying on
186the compiler to recognize what was too big to benefit from inlining,
187and generate it out-of-line automatically.  However, this also can
188result in code bloat except where the linker can eliminate the extra
189copies.
190
191Fixing these cases will require an audit of all inline functions
192defined in the library to determine which merit inlining, and moving
193the rest out of line.  This is an issue mainly in chapters 23, 25, and
19427.  Of course it can be done incrementally, and we should generally
195accept patches that move large functions out of line and into ".tcc"
196files, which can later be pulled into a repository.  Compiler/linker
197improvements to recognize very large inline functions and move them
198out-of-line, but shared among compilation units, could make this
199work unnecessary.
200
201Pre-instantiating template specializations currently produces large
202amounts of dead code which bloats statically linked programs.  The
203current state of the static library, libstdc++.a, is intolerable on
204this account, and will fuel further confused speculation about a need
205for a library "subset".  A compiler improvement that treats each
206instantiated function as a separate object file, for linking purposes,
207would be one solution to this problem.  An alternative would be to
208split up the manual instantiation files into dozens upon dozens of
209little files, each compiled separately, but an abortive attempt at
210this was done for <string> and, though it is far from complete, it
211is already a nuisance.  A better interim solution (just until we have
212"export") is badly needed.
213
214When building a shared library, the current compiler/linker cannot
215automatically generate the instantiatiations needed.  This creates a
216miserable situation; it means any time something is changed in the
217library, before a shared library can be built someone must manually
218copy the declarations of all templates that are needed by other parts
219of the library to an "instantiation" file, and add it to the build
220system to be compiled and linked to the library.  This process is
221readily automated, and should be automated as soon as possible.
222Users building their own shared libraries experience identical
223frustrations.
224
225Sharing common aspects of template definitions among instantiations
226can radically reduce code bloat.  The compiler could help a great
227deal here by recognizing when a function depends on nothing about
228a template parameter, or only on its size, and giving the resulting
229function a link-name "equate" that allows it to be shared with other
230instantiations.  Implementation code could take advantage of the
231capability by factoring out code that does not depend on the template
232argument into separate functions to be merged by the compiler.
233
234Until such a compiler optimization is implemented, much can be done
235manually (if tediously) in this direction.  One such optimization is
236to derive class templates from non-template classes, and move as much
237implementation as possible into the base class.  Another is to partial-
238specialize certain common instantiations, such as vector<T*>, to share
239code for instantiations on all types T.  While these techniques work,
240they are far from the complete solution that a compiler improvement
241would afford.
242
243Overhead: Expensive Language Features
244-------------------------------------
245
246The main "expensive" language feature used in the standard library
247is exception support, which requires compiling in cleanup code with
248static table data to locate it, and linking in library code to use
249the table.  For small embedded programs the amount of such library
250code and table data is assumed by some to be excessive.  Under the
251"new" ABI this perception is generally exaggerated, although in some
252cases it may actually be excessive.
253
254To implement a library which does not use exceptions directly is
255not difficult given minor compiler support (to "turn off" exceptions
256and ignore exception constructs), and results in no great library
257maintenance difficulties.  To be precise, given "-fno-exceptions",
258the compiler should treat "try" blocks as ordinary blocks, and
259"catch" blocks as dead code to ignore or eliminate.  Compiler
260support is not strictly necessary, except in the case of "function
261try blocks"; otherwise the following macros almost suffice:
262
263  #define throw(X)
264  #define try      if (true)
265  #define catch(X) else if (false)
266
267However, there may be a need to use function try blocks in the
268library implementation, and use of macros in this way can make
269correct diagnostics impossible.  Furthermore, use of this scheme
270would require the library to call a function to re-throw exceptions
271from a try block.  Implementing the above semantics in the compiler
272is preferable.
273
274Given the support above (however implemented) it only remains to
275replace code that "throws" with a call to a well-documented "handler"
276function in a separate compilation unit which may be replaced by
277the user.  The main source of exceptions that would be difficult
278for users to avoid is memory allocation failures, but users can
279define their own memory allocation primitives that never throw.
280Otherwise, the complete list of such handlers, and which library
281functions may call them, would be needed for users to be able to
282implement the necessary substitutes.  (Fortunately, they have the
283source code.)
284
285Opportunities
286-------------
287
288The template capabilities of C++ offer enormous opportunities for
289optimizing common library operations, well beyond what would be
290considered "eliminating overhead".  In particular, many operations
291done in Glibc with macros that depend on proprietary language
292extensions can be implemented in pristine Standard C++.  For example,
293the chapter 25 algorithms, and even C library functions such as strchr,
294can be specialized for the case of static arrays of known (small) size.
295
296Detailed optimization opportunities are identified below where
297the component where they would appear is discussed.  Of course new
298opportunities will be identified during implementation.
299
300Unimplemented Required Library Features
301---------------------------------------
302
303The standard specifies hundreds of components, grouped broadly by
304chapter.  These are listed in excruciating detail in the CHECKLIST
305file.
306
307  17 general
308  18 support
309  19 diagnostics
310  20 utilities
311  21 string
312  22 locale
313  23 containers
314  24 iterators
315  25 algorithms
316  26 numerics
317  27 iostreams
318  Annex D  backward compatibility
319
320Anyone participating in implementation of the library should obtain
321a copy of the standard, ISO 14882.   People in the U.S. can obtain an
322electronic copy for US$18 from ANSI's web site.  Those from other
323countries should visit http://www.iso.ch/ to find out the location
324of their country's representation in ISO, in order to know who can
325sell them a copy.
326
327The emphasis in the following sections is on unimplemented features
328and optimization opportunities.
329
330Chapter 17  General
331-------------------
332
333Chapter 17 concerns overall library requirements.
334
335The standard doesn't mention threads.  A multi-thread (MT) extension
336primarily affects operators new and delete (18), allocator (20),
337string (21), locale (22), and iostreams (27).  The common underlying
338support needed for this is discussed under chapter 20.
339
340The standard requirements on names from the C headers create a
341lot of work, mostly done.  Names in the C headers must be visible
342in the std:: and sometimes the global namespace; the names in the
343two scopes must refer to the same object.  More stringent is that
344Koenig lookup implies that any types specified as defined in std::
345really are defined in std::.  Names optionally implemented as
346macros in C cannot be macros in C++.  (An overview may be read at
347<http://www.cantrip.org/cheaders.html>).  The scripts "inclosure"
348and "mkcshadow", and the directories shadow/ and cshadow/, are the
349beginning of an effort to conform in this area.
350
351A correct conforming definition of C header names based on underlying
352C library headers, and practical linking of conforming namespaced
353customer code with third-party C libraries depends ultimately on
354an ABI change, allowing namespaced C type names to be mangled into
355type names as if they were global, somewhat as C function names in a
356namespace, or C++ global variable names, are left unmangled.  Perhaps
357another "extern" mode, such as 'extern "C-global"' would be an
358appropriate place for such type definitions.  Such a type would
359affect mangling as follows:
360
361  namespace A {
362    struct X {};
363    extern "C-global" {  // or maybe just 'extern "C"'
364      struct Y {};
365    };
366  }
367  void f(A::X*);  // mangles to f__FPQ21A1X
368  void f(A::Y*);  // mangles to f__FP1Y
369
370(It may be that this is really the appropriate semantics for regular
371'extern "C"', and 'extern "C-global"', as an extension, would not be
372necessary.) This would allow functions declared in non-standard C headers
373(and thus fixable by neither us nor users) to link properly with functions
374declared using C types defined in properly-namespaced headers.  The
375problem this solves is that C headers (which C++ programmers do persist
376in using) frequently forward-declare C struct tags without including
377the header where the type is defined, as in
378
379  struct tm;
380  void munge(tm*);
381
382Without some compiler accommodation, munge cannot be called by correct
383C++ code using a pointer to a correctly-scoped tm* value.
384
385The current C headers use the preprocessor extension "#include_next",
386which the compiler complains about when run "-pedantic".
387(Incidentally, it appears that "-fpedantic" is currently ignored,
388probably a bug.)  The solution in the C compiler is to use
389"-isystem" rather than "-I", but unfortunately in g++ this seems
390also to wrap the whole header in an 'extern "C"' block, so it's
391unusable for C++ headers.  The correct solution appears to be to
392allow the various special include-directory options, if not given
393an argument, to affect subsequent include-directory options additively,
394so that if one said
395
396  -pedantic -iprefix $(prefix) \
397  -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
398  -iwithprefix -I g++-v3/ext
399
400the compiler would search $(prefix)/g++-v3 and not report
401pedantic warnings for files found there, but treat files in
402$(prefix)/g++-v3/ext pedantically.  (The undocumented semantics
403of "-isystem" in g++ stink.  Can they be rescinded?  If not it
404must be replaced with something more rationally behaved.)
405
406All the C headers need the treatment above; in the standard these
407headers are mentioned in various chapters.  Below, I have only
408mentioned those that present interesting implementation issues.
409
410The components identified as "mostly complete", below, have not been
411audited for conformance.  In many cases where the library passes
412conformance tests we have non-conforming extensions that must be
413wrapped in #if guards for "pedantic" use, and in some cases renamed
414in a conforming way for continued use in the implementation regardless
415of conformance flags.
416
417The STL portion of the library still depends on a header
418stl/bits/stl_config.h full of #ifdef clauses.  This apparatus
419should be replaced with autoconf/automake machinery.
420
421The SGI STL defines a type_traits<> template, specialized for
422many types in their code including the built-in numeric and
423pointer types and some library types, to direct optimizations of
424standard functions.  The SGI compiler has been extended to generate
425specializations of this template automatically for user types,
426so that use of STL templates on user types can take advantage of
427these optimizations.  Specializations for other, non-STL, types
428would make more optimizations possible, but extending the gcc
429compiler in the same way would be much better.  Probably the next
430round of standardization will ratify this, but probably with
431changes, so it probably should be renamed to place it in the
432implementation namespace.
433
434The SGI STL also defines a large number of extensions visible in
435standard headers.  (Other extensions that appear in separate headers
436have been sequestered in subdirectories ext/ and backward/.)  All
437these extensions should be moved to other headers where possible,
438and in any case wrapped in a namespace (not std!), and (where kept
439in a standard header) girded about with macro guards.  Some cannot be
440moved out of standard headers because they are used to implement
441standard features.   The canonical method for accommodating these
442is to use a protected name, aliased in macro guards to a user-space
443name.  Unfortunately C++ offers no satisfactory template typedef
444mechanism, so very ad-hoc and unsatisfactory aliasing must be used
445instead.
446
447Implementation of a template typedef mechanism should have the highest
448priority among possible extensions, on the same level as implementation
449of the template "export" feature.
450
451Chapter 18  Language support
452----------------------------
453
454Headers: <limits> <new> <typeinfo> <exception>
455C headers: <cstddef> <climits> <cfloat>  <cstdarg> <csetjmp>
456           <ctime>   <csignal> <cstdlib> (also 21, 25, 26)
457
458This defines the built-in exceptions, rtti, numeric_limits<>,
459operator new and delete.  Much of this is provided by the
460compiler in its static runtime library.
461
462Work to do includes defining numeric_limits<> specializations in
463separate files for all target architectures.  Values for integer types
464except for bool and wchar_t are readily obtained from the C header
465<limits.h>, but values for the remaining numeric types (bool, wchar_t,
466float, double, long double) must be entered manually.  This is
467largely dog work except for those members whose values are not
468easily deduced from available documentation.  Also, this involves
469some work in target configuration to identify the correct choice of
470file to build against and to install.
471
472The definitions of the various operators new and delete must be
473made thread-safe, which depends on a portable exclusion mechanism,
474discussed under chapter 20.   Of course there is always plenty of
475room for improvements to the speed of operators new and delete.
476
477<cstdarg>, in Glibc, defines some macros that gcc does not allow to
478be wrapped into an inline function.  Probably this header will demand
479attention whenever a new target is chosen.  The functions atexit(),
480exit(), and abort() in cstdlib have different semantics in C++, so
481must be re-implemented for C++.
482
483Chapter 19  Diagnostics
484-----------------------
485
486Headers: <stdexcept>
487C headers: <cassert> <cerrno>
488
489This defines the standard exception objects, which are "mostly complete".
490Cygnus has a version, and now SGI provides a slightly different one.
491It makes little difference which we use.
492
493The C global name "errno", which C allows to be a variable or a macro,
494is required in C++ to be a macro.  For MT it must typically result in
495a function call.
496
497Chapter 20  Utilities
498---------------------
499Headers: <utility> <functional> <memory>
500C header: <ctime> (also in 18)
501
502SGI STL provides "mostly complete" versions of all the components
503defined in this chapter.  However, the auto_ptr<> implementation
504is known to be wrong.  Furthermore, the standard definition of it
505is known to be unimplementable as written.  A minor change to the
506standard would fix it, and auto_ptr<> should be adjusted to match.
507
508Multi-threading affects the allocator implementation, and there must
509be configuration/installation choices for different users' MT
510requirements.  Anyway, users will want to tune allocator options
511to support different target conditions, MT or no.
512
513The primitives used for MT implementation should be exposed, as an
514extension, for users' own work.  We need cross-CPU "mutex" support,
515multi-processor shared-memory atomic integer operations, and single-
516processor uninterruptible integer operations, and all three configurable
517to be stubbed out for non-MT use, or to use an appropriately-loaded
518dynamic library for the actual runtime environment, or statically
519compiled in for cases where the target architecture is known.
520
521Chapter 21  String
522------------------
523Headers: <string>
524C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
525           <cstdlib> (also in 18, 25, 26)
526
527We have "mostly-complete" char_traits<> implementations.  Many of the
528char_traits<char> operations might be optimized further using existing
529proprietary language extensions.
530
531We have a "mostly-complete" basic_string<> implementation.  The work
532to manually instantiate char and wchar_t specializations in object
533files to improve link-time behavior is extremely unsatisfactory,
534literally tripling library-build time with no commensurate improvement
535in static program link sizes.  It must be redone.  (Similar work is
536needed for some components in chapters 22 and 27.)
537
538Other work needed for strings is MT-safety, as discussed under the
539chapter 20 heading.
540
541The standard C type mbstate_t from <cwchar> and used in char_traits<>
542must be different in C++ than in C, because in C++ the default constructor
543value mbstate_t() must be the "base" or "ground" sequence state.
544(According to the likely resolution of a recently raised Core issue,
545this may become unnecessary.  However, there are other reasons to
546use a state type not as limited as whatever the C library provides.)
547If we might want to provide conversions from (e.g.) internally-
548represented EUC-wide to externally-represented Unicode, or vice-
549versa, the mbstate_t we choose will need to be more accommodating
550than what might be provided by an underlying C library.
551
552There remain some basic_string template-member functions which do
553not overload properly with their non-template brethren.  The infamous
554hack akin to what was done in vector<> is needed, to conform to
55523.1.1 para 10.  The CHECKLIST items for basic_string marked 'X',
556or incomplete, are so marked for this reason.
557
558Replacing the string iterators, which currently are simple character
559pointers, with class objects would greatly increase the safety of the
560client interface, and also permit a "debug" mode in which range,
561ownership, and validity are rigorously checked.  The current use of
562raw pointers as string iterators is evil.  vector<> iterators need the
563same treatment.  Note that the current implementation freely mixes
564pointers and iterators, and that must be fixed before safer iterators
565can be introduced.
566
567Some of the functions in <cstring> are different from the C version.
568generally overloaded on const and non-const argument pointers.  For
569example, in <cstring> strchr is overloaded.  The functions isupper
570etc.  in <cctype> typically implemented as macros in C are functions
571in C++, because they are overloaded with others of the same name
572defined in <locale>.
573
574Many of the functions required in <cwctype> and <cwchar> cannot be
575implemented using underlying C facilities on intended targets because
576such facilities only partly exist.
577
578Chapter 22  Locale
579------------------
580Headers: <locale>
581C headers: <clocale>
582
583We have a "mostly complete" class locale, with the exception of
584code for constructing, and handling the names of, named locales.
585The ways that locales are named (particularly when categories
586(e.g. LC_TIME, LC_COLLATE) are different) varies among all target
587environments.  This code must be written in various versions and
588chosen by configuration parameters.
589
590Members of many of the facets defined in <locale> are stubs.  Generally,
591there are two sets of facets: the base class facets (which are supposed
592to implement the "C" locale) and the "byname" facets, which are supposed
593to read files to determine their behavior.  The base ctype<>, collate<>,
594and numpunct<> facets are "mostly complete", except that the table of
595bitmask values used for "is" operations, and corresponding mask values,
596are still defined in libio and just included/linked.  (We will need to
597implement these tables independently, soon, but should take advantage
598of libio where possible.)  The num_put<>::put members for integer types
599are "mostly complete".
600
601A complete list of what has and has not been implemented may be
602found in CHECKLIST.  However, note that the current definition of
603codecvt<wchar_t,char,mbstate_t> is wrong.  It should simply write
604out the raw bytes representing the wide characters, rather than
605trying to convert each to a corresponding single "char" value.
606
607Some of the facets are more important than others.  Specifically,
608the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
609are used by other library facilities defined in <string>, <istream>,
610and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
611in <fstream>, so a conforming iostream implementation depends on
612these.
613
614The "long long" type eventually must be supported, but code mentioning
615it should be wrapped in #if guards to allow pedantic-mode compiling.
616
617Performance of num_put<> and num_get<> depend critically on
618caching computed values in ios_base objects, and on extensions
619to the interface with streambufs.
620
621Specifically: retrieving a copy of the locale object, extracting
622the needed facets, and gathering data from them, for each call to
623(e.g.) operator<< would be prohibitively slow.   To cache format
624data for use by num_put<> and num_get<> we have a _Format_cache<>
625object stored in the ios_base::pword() array.  This is constructed
626and initialized lazily, and is organized purely for utility.  It
627is discarded when a new locale with different facets is imbued.
628
629Using only the public interfaces of the iterator arguments to the
630facet functions would limit performance by forbidding "vector-style"
631character operations.  The streambuf iterator optimizations are
632described under chapter 24, but facets can also bypass the streambuf
633iterators via explicit specializations and operate directly on the
634streambufs, and use extended interfaces to get direct access to the
635streambuf internal buffer arrays.  These extensions are mentioned
636under chapter 27.  These optimizations are particularly important
637for input parsing.
638
639Unused virtual members of locale facets can be omitted, as mentioned
640above, by a smart linker.
641
642Chapter 23  Containers
643----------------------
644Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>
645
646All the components in chapter 23 are implemented in the SGI STL.
647They are "mostly complete"; they include a large number of
648nonconforming extensions which must be wrapped.  Some of these
649are used internally and must be renamed or duplicated.
650
651The SGI components are optimized for large-memory environments.  For
652embedded targets, different criteria might be more appropriate.  Users
653will want to be able to tune this behavior.  We should provide
654ways for users to compile the library with different memory usage
655characteristics.
656
657A lot more work is needed on factoring out common code from different
658specializations to reduce code size here and in chapter 25.  The
659easiest fix for this would be a compiler/ABI improvement that allows
660the compiler to recognize when a specialization depends only on the
661size (or other gross quality) of a template argument, and allow the
662linker to share the code with similar specializations.  In its
663absence, many of the algorithms and containers can be partial-
664specialized, at least for the case of pointers, but this only solves
665a small part of the problem.  Use of a type_traits-style template
666allows a few more optimization opportunities, more if the compiler
667can generate the specializations automatically.
668
669As an optimization, containers can specialize on the default allocator
670and bypass it, or take advantage of details of its implementation
671after it has been improved upon.
672
673Replacing the vector iterators, which currently are simple element
674pointers, with class objects would greatly increase the safety of the
675client interface, and also permit a "debug" mode in which range,
676ownership, and validity are rigorously checked.  The current use of
677pointers for iterators is evil.
678
679As mentioned for chapter 24, the deque iterator is a good example of
680an opportunity to implement a "staged" iterator that would benefit
681from specializations of some algorithms.
682
683Chapter 24  Iterators
684---------------------
685Headers: <iterator>
686
687Standard iterators are "mostly complete", with the exception of
688the stream iterators, which are not yet templatized on the
689stream type.  Also, the base class template iterator<> appears
690to be wrong, so everything derived from it must also be wrong,
691currently.
692
693The streambuf iterators (currently located in stl/bits/std_iterator.h,
694but should be under bits/) can be rewritten to take advantage of
695friendship with the streambuf implementation.
696
697Matt Austern has identified opportunities where certain iterator
698types, particularly including streambuf iterators and deque
699iterators, have a "two-stage" quality, such that an intermediate
700limit can be checked much more quickly than the true limit on
701range operations.  If identified with a member of iterator_traits,
702algorithms may be specialized for this case.  Of course the
703iterators that have this quality can be identified by specializing
704a traits class.
705
706Many of the algorithms must be specialized for the streambuf
707iterators, to take advantage of block-mode operations, in order
708to allow iostream/locale operations' performance not to suffer.
709It may be that they could be treated as staged iterators and
710take advantage of those optimizations.
711
712Chapter 25  Algorithms
713----------------------
714Headers: <algorithm>
715C headers: <cstdlib> (also in 18, 21, 26))
716
717The algorithms are "mostly complete".  As mentioned above, they
718are optimized for speed at the expense of code and data size.
719
720Specializations of many of the algorithms for non-STL types would
721give performance improvements, but we must use great care not to
722interfere with fragile template overloading semantics for the
723standard interfaces.  Conventionally the standard function template
724interface is an inline which delegates to a non-standard function
725which is then overloaded (this is already done in many places in
726the library).  Particularly appealing opportunities for the sake of
727iostream performance are for copy and find applied to streambuf
728iterators or (as noted elsewhere) for staged iterators, of which
729the streambuf iterators are a good example.
730
731The bsearch and qsort functions cannot be overloaded properly as
732required by the standard because gcc does not yet allow overloading
733on the extern-"C"-ness of a function pointer.
734
735Chapter 26  Numerics
736--------------------
737Headers: <complex> <valarray> <numeric>
738C headers: <cmath>, <cstdlib> (also 18, 21, 25)
739
740Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
741and the few algorithms from the STL are "mostly done".   Of course
742optimization opportunities abound for the numerically literate.  It
743is not clear whether the valarray implementation really conforms
744fully, in the assumptions it makes about aliasing (and lack thereof)
745in its arguments.
746
747The C div() and ldiv() functions are interesting, because they are the
748only case where a C library function returns a class object by value.
749Since the C++ type div_t must be different from the underlying C type
750(which is in the wrong namespace) the underlying functions div() and
751ldiv() cannot be re-used efficiently.  Fortunately they are trivial to
752re-implement.
753
754Chapter 27  Iostreams
755---------------------
756Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
757         <iomanip> <sstream> <fstream>
758C headers: <cstdio> <cwchar> (also in 21)
759
760Iostream is currently in a very incomplete state.  <iosfwd>, <iomanip>,
761ios_base, and basic_ios<> are "mostly complete".  basic_streambuf<> and
762basic_ostream<> are well along, but basic_istream<> has had little work
763done.  The standard stream objects, <sstream> and <fstream> have been
764started; basic_filebuf<> "write" functions have been implemented just
765enough to do "hello, world".
766
767Most of the istream and ostream operators << and >> (with the exception
768of the op<<(integer) ones) have not been changed to use locale primitives,
769sentry objects, or char_traits members.
770
771All these templates should be manually instantiated for char and
772wchar_t in a way that links only used members into user programs.
773
774Streambuf is fertile ground for optimization extensions.  An extended
775interface giving iterator access to its internal buffer would be very
776useful for other library components.
777
778Iostream operations (primarily operators << and >>) can take advantage
779of the case where user code has not specified a locale, and bypass locale
780operations entirely.  The current implementation of op<</num_put<>::put,
781for the integer types, demonstrates how they can cache encoding details
782from the locale on each operation.  There is lots more room for
783optimization in this area.
784
785The definition of the relationship between the standard streams
786cout et al. and stdout et al. requires something like a "stdiobuf".
787The SGI solution of using double-indirection to actually use a
788stdio FILE object for buffering is unsatisfactory, because it
789interferes with peephole loop optimizations.
790
791The <sstream> header work has begun.  stringbuf can benefit from
792friendship with basic_string<> and basic_string<>::_Rep to use
793those objects directly as buffers, and avoid allocating and making
794copies.
795
796The basic_filebuf<> template is a complex beast.  It is specified to
797use the locale facet codecvt<> to translate characters between native
798files and the locale character encoding.  In general this involves
799two buffers, one of "char" representing the file and another of
800"char_type", for the stream, with codecvt<> translating.  The process
801is complicated by the variable-length nature of the translation, and
802the need to seek to corresponding places in the two representations.
803For the case of basic_filebuf<char>, when no translation is needed,
804a single buffer suffices.  A specialized filebuf can be used to reduce
805code space overhead when no locale has been imbued.  Matt Austern's
806work at SGI will be useful, perhaps directly as a source of code, or
807at least as an example to draw on.
808
809Filebuf, almost uniquely (cf. operator new), depends heavily on
810underlying environmental facilities.  In current releases iostream
811depends fairly heavily on libio constant definitions, but it should
812be made independent.   It also depends on operating system primitives
813for file operations.  There is immense room for optimizations using
814(e.g.) mmap for reading.  The shadow/ directory wraps, besides the
815standard C headers, the libio.h and unistd.h headers, for use mainly
816by filebuf.  These wrappings have not been completed, though there
817is scaffolding in place.
818
819The encapulation of certain C header <cstdio> names presents an
820interesting problem.  It is possible to define an inline std::fprintf()
821implemented in terms of the 'extern "C"' vfprintf(), but there is no
822standard vfscanf() to use to implement std::fscanf().  It appears that
823vfscanf but be re-implemented in C++ for targets where no vfscanf
824extension has been defined.  This is interesting in that it seems
825to be the only significant case in the C library where this kind of
826rewriting is necessary.  (Of course Glibc provides the vfscanf()
827extension.)  (The functions related to exit() must be rewritten
828for other reasons.)
829
830
831Annex D
832-------
833Headers: <strstream>
834
835Annex D defines many non-library features, and many minor
836modifications to various headers, and a complete header.
837It is "mostly done", except that the libstdc++-2 <strstream>
838header has not been adopted into the library, or checked to
839verify that it matches the draft in those details that were
840clarified by the committee.  Certainly it must at least be
841moved into the std namespace.
842
843We still need to wrap all the deprecated features in #if guards
844so that pedantic compile modes can detect their use.
845
846Nonstandard Extensions
847----------------------
848Headers: <iostream.h> <strstream.h> <hash> <rbtree>
849         <pthread_alloc> <stdiobuf> (etc.)
850
851User code has come to depend on a variety of nonstandard components
852that we must not omit.  Much of this code can be adopted from
853libstdc++-v2 or from the SGI STL.  This particularly includes
854<iostream.h>, <strstream.h>, and various SGI extensions such
855as <hash_map.h>.  Many of these are already placed in the
856subdirectories ext/ and backward/.  (Note that it is better to
857include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
858to search the subdirectory itself via a "-I" directive.
859
860