xref: /openbsd-src/gnu/gcc/libstdc++-v3/docs/html/24_iterators/howto.html (revision 404b540a9034ac75a6199ad1a32d1bbc7a0d4210)
1*404b540aSrobert<?xml version="1.0" encoding="ISO-8859-1"?>
2*404b540aSrobert<!DOCTYPE html
3*404b540aSrobert          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
4*404b540aSrobert          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
5*404b540aSrobert
6*404b540aSrobert<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
7*404b540aSrobert<head>
8*404b540aSrobert   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
9*404b540aSrobert   <meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)" />
10*404b540aSrobert   <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
11*404b540aSrobert   <meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 24." />
12*404b540aSrobert   <meta name="GENERATOR" content="vi and eight fingers" />
13*404b540aSrobert   <title>libstdc++-v3 HOWTO:  Chapter 24: Iterators</title>
14*404b540aSrobert<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
15*404b540aSrobert<link rel="Start" href="../documentation.html" type="text/html"
16*404b540aSrobert  title="GNU C++ Standard Library" />
17*404b540aSrobert<link rel="Prev" href="../23_containers/howto.html" type="text/html"
18*404b540aSrobert  title="Containers" />
19*404b540aSrobert<link rel="Next" href="../25_algorithms/howto.html" type="text/html"
20*404b540aSrobert  title="Algorithms" />
21*404b540aSrobert<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
22*404b540aSrobert<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." />
23*404b540aSrobert</head>
24*404b540aSrobert<body>
25*404b540aSrobert
26*404b540aSrobert<h1 class="centered"><a name="top">Chapter 24:  Iterators</a></h1>
27*404b540aSrobert
28*404b540aSrobert<p>Chapter 24 deals with the FORTRAN subroutines for automatically
29*404b540aSrobert   transforming lemmings into gold.
30*404b540aSrobert</p>
31*404b540aSrobert
32*404b540aSrobert
33*404b540aSrobert<!-- ####################################################### -->
34*404b540aSrobert<hr />
35*404b540aSrobert<h1>Contents</h1>
36*404b540aSrobert<ul>
37*404b540aSrobert   <li><a href="#1">They ain't pointers!</a></li>
38*404b540aSrobert   <li><a href="#2">It ends <em>where?</em></a></li>
39*404b540aSrobert</ul>
40*404b540aSrobert
41*404b540aSrobert<hr />
42*404b540aSrobert
43*404b540aSrobert<!-- ####################################################### -->
44*404b540aSrobert
45*404b540aSrobert<h2><a name="1">They ain't pointers!</a></h2>
46*404b540aSrobert   <p><a href="../faq/index.html#5_1">FAQ 5.1</a> points out that iterators
47*404b540aSrobert      are not implemented as pointers.  They are a generalization of
48*404b540aSrobert      pointers, but they are implemented in libstdc++-v3 as separate classes.
49*404b540aSrobert   </p>
50*404b540aSrobert   <p>Keeping that simple fact in mind as you design your code will
51*404b540aSrobert      prevent a whole lot of difficult-to-understand bugs.
52*404b540aSrobert   </p>
53*404b540aSrobert   <p>You can think of it the other way 'round, even.  Since iterators
54*404b540aSrobert      are a generalization, that means that <em>pointers</em> are
55*404b540aSrobert      <em>iterators</em>, and that pointers can be used whenever an
56*404b540aSrobert      iterator would be.  All those functions in the Algorithms chapter
57*404b540aSrobert      of the Standard will work just as well on plain arrays and their
58*404b540aSrobert      pointers.
59*404b540aSrobert   </p>
60*404b540aSrobert   <p>That doesn't mean that when you pass in a pointer, it gets wrapped
61*404b540aSrobert      into some special delegating iterator-to-pointer class with a layer
62*404b540aSrobert      of overhead.  (If you think that's the case anywhere, you don't
63*404b540aSrobert      understand templates to begin with...)  Oh, no; if you pass
64*404b540aSrobert      in a pointer, then the compiler will instantiate that template
65*404b540aSrobert      using T* as a type, and good old high-speed pointer arithmetic as
66*404b540aSrobert      its operations, so the resulting code will be doing exactly the same
67*404b540aSrobert      things as it would be doing if you had hand-coded it yourself (for
68*404b540aSrobert      the 273rd time).
69*404b540aSrobert   </p>
70*404b540aSrobert   <p>How much overhead <em>is</em> there when using an iterator class?
71*404b540aSrobert      Very little.  Most of the layering classes contain nothing but
72*404b540aSrobert      typedefs, and typedefs are &quot;meta-information&quot; that simply
73*404b540aSrobert      tell the compiler some nicknames; they don't create code.  That
74*404b540aSrobert      information gets passed down through inheritance, so while the
75*404b540aSrobert      compiler has to do work looking up all the names, your runtime code
76*404b540aSrobert      does not.  (This has been a prime concern from the beginning.)
77*404b540aSrobert   </p>
78*404b540aSrobert   <p>Return <a href="#top">to top of page</a> or
79*404b540aSrobert      <a href="../faq/index.html">to the FAQ</a>.
80*404b540aSrobert   </p>
81*404b540aSrobert
82*404b540aSrobert<hr />
83*404b540aSrobert<h2><a name="2">It ends <em>where?</em></a></h2>
84*404b540aSrobert   <p>This starts off sounding complicated, but is actually very easy,
85*404b540aSrobert      especially towards the end.  Trust me.
86*404b540aSrobert   </p>
87*404b540aSrobert   <p>Beginners usually have a little trouble understand the whole
88*404b540aSrobert      'past-the-end' thing, until they remember their early algebra classes
89*404b540aSrobert      (see, they <em>told</em> you that stuff would come in handy!) and
90*404b540aSrobert      the concept of half-open ranges.
91*404b540aSrobert   </p>
92*404b540aSrobert   <p>First, some history, and a reminder of some of the funkier rules in
93*404b540aSrobert      C and C++ for builtin arrays.  The following rules have always been
94*404b540aSrobert      true for both languages:
95*404b540aSrobert   </p>
96*404b540aSrobert   <ol>
97*404b540aSrobert      <li>You can point anywhere in the array, <em>or to the first element
98*404b540aSrobert          past the end of the array</em>.  A pointer that points to one
99*404b540aSrobert          past the end of the array is guaranteed to be as unique as a
100*404b540aSrobert          pointer to somewhere inside the array, so that you can compare
101*404b540aSrobert          such pointers safely.
102*404b540aSrobert      </li>
103*404b540aSrobert      <li>You can only dereference a pointer that points into an array.
104*404b540aSrobert          If your array pointer points outside the array -- even to just
105*404b540aSrobert          one past the end -- and you dereference it, Bad Things happen.
106*404b540aSrobert      </li>
107*404b540aSrobert      <li>Strictly speaking, simply pointing anywhere else invokes
108*404b540aSrobert          undefined behavior.  Most programs won't puke until such a
109*404b540aSrobert          pointer is actually dereferenced, but the standards leave that
110*404b540aSrobert          up to the platform.
111*404b540aSrobert      </li>
112*404b540aSrobert   </ol>
113*404b540aSrobert   <p>The reason this past-the-end addressing was allowed is to make it
114*404b540aSrobert      easy to write a loop to go over an entire array, e.g.,
115*404b540aSrobert      while (*d++ = *s++);.
116*404b540aSrobert   </p>
117*404b540aSrobert   <p>So, when you think of two pointers delimiting an array, don't think
118*404b540aSrobert      of them as indexing 0 through n-1.  Think of them as <em>boundary
119*404b540aSrobert      markers</em>:
120*404b540aSrobert   </p>
121*404b540aSrobert   <pre>
122*404b540aSrobert
123*404b540aSrobert   beginning            end
124*404b540aSrobert     |                   |
125*404b540aSrobert     |                   |               This is bad.  Always having to
126*404b540aSrobert     |                   |               remember to add or subtract one.
127*404b540aSrobert     |                   |               Off-by-one bugs very common here.
128*404b540aSrobert     V                   V
129*404b540aSrobert        array of N elements
130*404b540aSrobert     |---|---|--...--|---|---|
131*404b540aSrobert     | 0 | 1 |  ...  |N-2|N-1|
132*404b540aSrobert     |---|---|--...--|---|---|
133*404b540aSrobert
134*404b540aSrobert     ^                       ^
135*404b540aSrobert     |                       |
136*404b540aSrobert     |                       |           This is good.  This is safe.  This
137*404b540aSrobert     |                       |           is guaranteed to work.  Just don't
138*404b540aSrobert     |                       |           dereference 'end'.
139*404b540aSrobert   beginning                end
140*404b540aSrobert
141*404b540aSrobert   </pre>
142*404b540aSrobert   <p>See?  Everything between the boundary markers is part of the array.
143*404b540aSrobert      Simple.
144*404b540aSrobert   </p>
145*404b540aSrobert   <p>Now think back to your junior-high school algebra course, when you
146*404b540aSrobert      were learning how to draw graphs.  Remember that a graph terminating
147*404b540aSrobert      with a solid dot meant, &quot;Everything up through this point,&quot;
148*404b540aSrobert      and a graph terminating with an open dot meant, &quot;Everything up
149*404b540aSrobert      to, but not including, this point,&quot; respectively called closed
150*404b540aSrobert      and open ranges?  Remember how closed ranges were written with
151*404b540aSrobert      brackets, <em>[a,b]</em>, and open ranges were written with parentheses,
152*404b540aSrobert      <em>(a,b)</em>?
153*404b540aSrobert   </p>
154*404b540aSrobert   <p>The boundary markers for arrays describe a <em>half-open range</em>,
155*404b540aSrobert      starting with (and including) the first element, and ending with (but
156*404b540aSrobert      not including) the last element:  <em>[beginning,end)</em>.  See, I
157*404b540aSrobert      told you it would be simple in the end.
158*404b540aSrobert   </p>
159*404b540aSrobert   <p>Iterators, and everything working with iterators, follows this same
160*404b540aSrobert      time-honored tradition.  A container's <code>begin()</code> method returns
161*404b540aSrobert      an iterator referring to the first element, and its <code>end()</code>
162*404b540aSrobert      method returns a past-the-end iterator, which is guaranteed to be
163*404b540aSrobert      unique and comparable against any other iterator pointing into the
164*404b540aSrobert      middle of the container.
165*404b540aSrobert   </p>
166*404b540aSrobert   <p>Container constructors, container methods, and algorithms, all take
167*404b540aSrobert      pairs of iterators describing a range of values on which to operate.
168*404b540aSrobert      All of these ranges are half-open ranges, so you pass the beginning
169*404b540aSrobert      iterator as the starting parameter, and the one-past-the-end iterator
170*404b540aSrobert      as the finishing parameter.
171*404b540aSrobert   </p>
172*404b540aSrobert   <p>This generalizes very well.  You can operate on sub-ranges quite
173*404b540aSrobert      easily this way; functions accepting a <em>[first,last)</em> range
174*404b540aSrobert      don't know or care whether they are the boundaries of an entire {array,
175*404b540aSrobert      sequence, container, whatever}, or whether they only enclose a few
176*404b540aSrobert      elements from the center.  This approach also makes zero-length
177*404b540aSrobert      sequences very simple to recognize:  if the two endpoints compare
178*404b540aSrobert      equal, then the {array, sequence, container, whatever} is empty.
179*404b540aSrobert   </p>
180*404b540aSrobert   <p>Just don't dereference <code>end()</code>.
181*404b540aSrobert   </p>
182*404b540aSrobert   <p>Return <a href="#top">to top of page</a> or
183*404b540aSrobert      <a href="../faq/index.html">to the FAQ</a>.
184*404b540aSrobert   </p>
185*404b540aSrobert
186*404b540aSrobert
187*404b540aSrobert
188*404b540aSrobert
189*404b540aSrobert<!-- ####################################################### -->
190*404b540aSrobert
191*404b540aSrobert<hr />
192*404b540aSrobert<p class="fineprint"><em>
193*404b540aSrobertSee <a href="../17_intro/license.html">license.html</a> for copying conditions.
194*404b540aSrobertComments and suggestions are welcome, and may be sent to
195*404b540aSrobert<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
196*404b540aSrobert</em></p>
197*404b540aSrobert
198*404b540aSrobert
199*404b540aSrobert</body>
200*404b540aSrobert</html>
201