1*404b540aSrobert<?xml version="1.0" encoding="ISO-8859-1"?> 2*404b540aSrobert<!DOCTYPE html 3*404b540aSrobert PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 4*404b540aSrobert "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 5*404b540aSrobert 6*404b540aSrobert<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 7*404b540aSrobert<head> 8*404b540aSrobert <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 9*404b540aSrobert <meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)" /> 10*404b540aSrobert <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" /> 11*404b540aSrobert <meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 24." /> 12*404b540aSrobert <meta name="GENERATOR" content="vi and eight fingers" /> 13*404b540aSrobert <title>libstdc++-v3 HOWTO: Chapter 24: Iterators</title> 14*404b540aSrobert<link rel="StyleSheet" href="../lib3styles.css" type="text/css" /> 15*404b540aSrobert<link rel="Start" href="../documentation.html" type="text/html" 16*404b540aSrobert title="GNU C++ Standard Library" /> 17*404b540aSrobert<link rel="Prev" href="../23_containers/howto.html" type="text/html" 18*404b540aSrobert title="Containers" /> 19*404b540aSrobert<link rel="Next" href="../25_algorithms/howto.html" type="text/html" 20*404b540aSrobert title="Algorithms" /> 21*404b540aSrobert<link rel="Copyright" href="../17_intro/license.html" type="text/html" /> 22*404b540aSrobert<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." /> 23*404b540aSrobert</head> 24*404b540aSrobert<body> 25*404b540aSrobert 26*404b540aSrobert<h1 class="centered"><a name="top">Chapter 24: Iterators</a></h1> 27*404b540aSrobert 28*404b540aSrobert<p>Chapter 24 deals with the FORTRAN subroutines for automatically 29*404b540aSrobert transforming lemmings into gold. 30*404b540aSrobert</p> 31*404b540aSrobert 32*404b540aSrobert 33*404b540aSrobert<!-- ####################################################### --> 34*404b540aSrobert<hr /> 35*404b540aSrobert<h1>Contents</h1> 36*404b540aSrobert<ul> 37*404b540aSrobert <li><a href="#1">They ain't pointers!</a></li> 38*404b540aSrobert <li><a href="#2">It ends <em>where?</em></a></li> 39*404b540aSrobert</ul> 40*404b540aSrobert 41*404b540aSrobert<hr /> 42*404b540aSrobert 43*404b540aSrobert<!-- ####################################################### --> 44*404b540aSrobert 45*404b540aSrobert<h2><a name="1">They ain't pointers!</a></h2> 46*404b540aSrobert <p><a href="../faq/index.html#5_1">FAQ 5.1</a> points out that iterators 47*404b540aSrobert are not implemented as pointers. They are a generalization of 48*404b540aSrobert pointers, but they are implemented in libstdc++-v3 as separate classes. 49*404b540aSrobert </p> 50*404b540aSrobert <p>Keeping that simple fact in mind as you design your code will 51*404b540aSrobert prevent a whole lot of difficult-to-understand bugs. 52*404b540aSrobert </p> 53*404b540aSrobert <p>You can think of it the other way 'round, even. Since iterators 54*404b540aSrobert are a generalization, that means that <em>pointers</em> are 55*404b540aSrobert <em>iterators</em>, and that pointers can be used whenever an 56*404b540aSrobert iterator would be. All those functions in the Algorithms chapter 57*404b540aSrobert of the Standard will work just as well on plain arrays and their 58*404b540aSrobert pointers. 59*404b540aSrobert </p> 60*404b540aSrobert <p>That doesn't mean that when you pass in a pointer, it gets wrapped 61*404b540aSrobert into some special delegating iterator-to-pointer class with a layer 62*404b540aSrobert of overhead. (If you think that's the case anywhere, you don't 63*404b540aSrobert understand templates to begin with...) Oh, no; if you pass 64*404b540aSrobert in a pointer, then the compiler will instantiate that template 65*404b540aSrobert using T* as a type, and good old high-speed pointer arithmetic as 66*404b540aSrobert its operations, so the resulting code will be doing exactly the same 67*404b540aSrobert things as it would be doing if you had hand-coded it yourself (for 68*404b540aSrobert the 273rd time). 69*404b540aSrobert </p> 70*404b540aSrobert <p>How much overhead <em>is</em> there when using an iterator class? 71*404b540aSrobert Very little. Most of the layering classes contain nothing but 72*404b540aSrobert typedefs, and typedefs are "meta-information" that simply 73*404b540aSrobert tell the compiler some nicknames; they don't create code. That 74*404b540aSrobert information gets passed down through inheritance, so while the 75*404b540aSrobert compiler has to do work looking up all the names, your runtime code 76*404b540aSrobert does not. (This has been a prime concern from the beginning.) 77*404b540aSrobert </p> 78*404b540aSrobert <p>Return <a href="#top">to top of page</a> or 79*404b540aSrobert <a href="../faq/index.html">to the FAQ</a>. 80*404b540aSrobert </p> 81*404b540aSrobert 82*404b540aSrobert<hr /> 83*404b540aSrobert<h2><a name="2">It ends <em>where?</em></a></h2> 84*404b540aSrobert <p>This starts off sounding complicated, but is actually very easy, 85*404b540aSrobert especially towards the end. Trust me. 86*404b540aSrobert </p> 87*404b540aSrobert <p>Beginners usually have a little trouble understand the whole 88*404b540aSrobert 'past-the-end' thing, until they remember their early algebra classes 89*404b540aSrobert (see, they <em>told</em> you that stuff would come in handy!) and 90*404b540aSrobert the concept of half-open ranges. 91*404b540aSrobert </p> 92*404b540aSrobert <p>First, some history, and a reminder of some of the funkier rules in 93*404b540aSrobert C and C++ for builtin arrays. The following rules have always been 94*404b540aSrobert true for both languages: 95*404b540aSrobert </p> 96*404b540aSrobert <ol> 97*404b540aSrobert <li>You can point anywhere in the array, <em>or to the first element 98*404b540aSrobert past the end of the array</em>. A pointer that points to one 99*404b540aSrobert past the end of the array is guaranteed to be as unique as a 100*404b540aSrobert pointer to somewhere inside the array, so that you can compare 101*404b540aSrobert such pointers safely. 102*404b540aSrobert </li> 103*404b540aSrobert <li>You can only dereference a pointer that points into an array. 104*404b540aSrobert If your array pointer points outside the array -- even to just 105*404b540aSrobert one past the end -- and you dereference it, Bad Things happen. 106*404b540aSrobert </li> 107*404b540aSrobert <li>Strictly speaking, simply pointing anywhere else invokes 108*404b540aSrobert undefined behavior. Most programs won't puke until such a 109*404b540aSrobert pointer is actually dereferenced, but the standards leave that 110*404b540aSrobert up to the platform. 111*404b540aSrobert </li> 112*404b540aSrobert </ol> 113*404b540aSrobert <p>The reason this past-the-end addressing was allowed is to make it 114*404b540aSrobert easy to write a loop to go over an entire array, e.g., 115*404b540aSrobert while (*d++ = *s++);. 116*404b540aSrobert </p> 117*404b540aSrobert <p>So, when you think of two pointers delimiting an array, don't think 118*404b540aSrobert of them as indexing 0 through n-1. Think of them as <em>boundary 119*404b540aSrobert markers</em>: 120*404b540aSrobert </p> 121*404b540aSrobert <pre> 122*404b540aSrobert 123*404b540aSrobert beginning end 124*404b540aSrobert | | 125*404b540aSrobert | | This is bad. Always having to 126*404b540aSrobert | | remember to add or subtract one. 127*404b540aSrobert | | Off-by-one bugs very common here. 128*404b540aSrobert V V 129*404b540aSrobert array of N elements 130*404b540aSrobert |---|---|--...--|---|---| 131*404b540aSrobert | 0 | 1 | ... |N-2|N-1| 132*404b540aSrobert |---|---|--...--|---|---| 133*404b540aSrobert 134*404b540aSrobert ^ ^ 135*404b540aSrobert | | 136*404b540aSrobert | | This is good. This is safe. This 137*404b540aSrobert | | is guaranteed to work. Just don't 138*404b540aSrobert | | dereference 'end'. 139*404b540aSrobert beginning end 140*404b540aSrobert 141*404b540aSrobert </pre> 142*404b540aSrobert <p>See? Everything between the boundary markers is part of the array. 143*404b540aSrobert Simple. 144*404b540aSrobert </p> 145*404b540aSrobert <p>Now think back to your junior-high school algebra course, when you 146*404b540aSrobert were learning how to draw graphs. Remember that a graph terminating 147*404b540aSrobert with a solid dot meant, "Everything up through this point," 148*404b540aSrobert and a graph terminating with an open dot meant, "Everything up 149*404b540aSrobert to, but not including, this point," respectively called closed 150*404b540aSrobert and open ranges? Remember how closed ranges were written with 151*404b540aSrobert brackets, <em>[a,b]</em>, and open ranges were written with parentheses, 152*404b540aSrobert <em>(a,b)</em>? 153*404b540aSrobert </p> 154*404b540aSrobert <p>The boundary markers for arrays describe a <em>half-open range</em>, 155*404b540aSrobert starting with (and including) the first element, and ending with (but 156*404b540aSrobert not including) the last element: <em>[beginning,end)</em>. See, I 157*404b540aSrobert told you it would be simple in the end. 158*404b540aSrobert </p> 159*404b540aSrobert <p>Iterators, and everything working with iterators, follows this same 160*404b540aSrobert time-honored tradition. A container's <code>begin()</code> method returns 161*404b540aSrobert an iterator referring to the first element, and its <code>end()</code> 162*404b540aSrobert method returns a past-the-end iterator, which is guaranteed to be 163*404b540aSrobert unique and comparable against any other iterator pointing into the 164*404b540aSrobert middle of the container. 165*404b540aSrobert </p> 166*404b540aSrobert <p>Container constructors, container methods, and algorithms, all take 167*404b540aSrobert pairs of iterators describing a range of values on which to operate. 168*404b540aSrobert All of these ranges are half-open ranges, so you pass the beginning 169*404b540aSrobert iterator as the starting parameter, and the one-past-the-end iterator 170*404b540aSrobert as the finishing parameter. 171*404b540aSrobert </p> 172*404b540aSrobert <p>This generalizes very well. You can operate on sub-ranges quite 173*404b540aSrobert easily this way; functions accepting a <em>[first,last)</em> range 174*404b540aSrobert don't know or care whether they are the boundaries of an entire {array, 175*404b540aSrobert sequence, container, whatever}, or whether they only enclose a few 176*404b540aSrobert elements from the center. This approach also makes zero-length 177*404b540aSrobert sequences very simple to recognize: if the two endpoints compare 178*404b540aSrobert equal, then the {array, sequence, container, whatever} is empty. 179*404b540aSrobert </p> 180*404b540aSrobert <p>Just don't dereference <code>end()</code>. 181*404b540aSrobert </p> 182*404b540aSrobert <p>Return <a href="#top">to top of page</a> or 183*404b540aSrobert <a href="../faq/index.html">to the FAQ</a>. 184*404b540aSrobert </p> 185*404b540aSrobert 186*404b540aSrobert 187*404b540aSrobert 188*404b540aSrobert 189*404b540aSrobert<!-- ####################################################### --> 190*404b540aSrobert 191*404b540aSrobert<hr /> 192*404b540aSrobert<p class="fineprint"><em> 193*404b540aSrobertSee <a href="../17_intro/license.html">license.html</a> for copying conditions. 194*404b540aSrobertComments and suggestions are welcome, and may be sent to 195*404b540aSrobert<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>. 196*404b540aSrobert</em></p> 197*404b540aSrobert 198*404b540aSrobert 199*404b540aSrobert</body> 200*404b540aSrobert</html> 201