<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
   <meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)" />
   <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
   <meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 22." />
   <meta name="GENERATOR" content="vi and eight fingers" />
   <title>libstdc++-v3 HOWTO: Chapter 22: Localization</title>
<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
<link rel="Start" href="../documentation.html" type="text/html"
  title="GNU C++ Standard Library" />
<link rel="Prev" href="../21_strings/howto.html" type="text/html"
  title="Strings" />
<link rel="Next" href="../23_containers/howto.html" type="text/html"
  title="Containers" />
<link rel="Bookmark" href="locale.html" type="text/html" title="class locale" />
<link rel="Bookmark" href="codecvt.html" type="text/html" title="class codecvt" />
<link rel="Bookmark" href="ctype.html" type="text/html" title="class ctype" />
<link rel="Bookmark" href="messages.html" type="text/html" title="class messages" />
<link rel="Bookmark" href="http://www.research.att.com/~bs/3rd_loc0.html" type="text/html" title="Bjarne Stroustrup on Locales" />
<link rel="Bookmark" href="http://www.cantrip.org/locale.html" type="text/html" title="Nathan Myers on Locales" />
<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." />
</head>
<body>

<h1 class="centered"><a name="top">Chapter 22: Localization</a></h1>

<p>Chapter 22 deals with the C++ localization facilities.
</p>
<!-- I wanted to write that sentence in something requiring an exotic font,
     like Cyrillic or Kanji.  Probably more work than such cuteness is worth,
     but I still think it'd be funny.
  -->


<!-- ####################################################### -->
<hr />
<h1>Contents</h1>
<ul>
   <li><a href="#1">class locale</a></li>
   <li><a href="#2">class codecvt</a></li>
   <li><a href="#3">class ctype</a></li>
   <li><a href="#4">class messages</a></li>
   <li><a href="#5">Bjarne Stroustrup on Locales</a></li>
   <li><a href="#6">Nathan Myers on Locales</a></li>
   <li><a href="#7">Correct Transformations</a></li>
</ul>

<!-- ####################################################### -->

<hr />
<h2><a name="1">class locale</a></h2>
   <p>Notes made during the implementation of locales can be found
      <a href="locale.html">here</a>.
   </p>

<hr />
<h2><a name="2">class codecvt</a></h2>
   <p>Notes made during the implementation of codecvt can be found
      <a href="codecvt.html">here</a>.
   </p>

   <p>The following is the abstract from the implementation notes:
   </p>
   <blockquote>
   The standard class codecvt attempts to address conversions between
   different character encoding schemes. In particular, the standard
   attempts to detail conversions between the implementation-defined
   wide characters (hereafter referred to as wchar_t) and the standard
   type char that is so beloved in classic "C" (which can
   now be referred to as narrow characters.)  This document attempts
   to describe how the GNU libstdc++-v3 implementation deals with the
   conversion between wide and narrow characters, and also presents a
   framework for dealing with the huge number of other encodings that
   iconv can convert, including Unicode and UTF8.
   Design issues and
   requirements are addressed, and examples of correct usage for both
   the required specializations for wide and narrow characters and the
   implementation-provided extended functionality are given.
   </blockquote>

<hr />
<h2><a name="3">class ctype</a></h2>
   <p>Notes made during the implementation of ctype can be found
      <a href="ctype.html">here</a>.
   </p>

<hr />
<h2><a name="4">class messages</a></h2>
   <p>Notes made during the implementation of messages can be found
      <a href="messages.html">here</a>.
   </p>

<hr />
<h2><a name="5">Bjarne Stroustrup on Locales</a></h2>
   <p>Dr. Bjarne Stroustrup has released a
      <a href="http://www.research.att.com/~bs/3rd_loc0.html">pointer</a>
      to Appendix D of his book,
      <a href="http://www.research.att.com/~bs/3rd.html">The C++
      Programming Language (3rd Edition)</a>.  It is a detailed
      description of locales and how to use them.
   </p>
   <p>He also writes:
   </p>
   <blockquote><em>
      Please note that I still consider this detailed description of
      locales beyond the needs of most C++ programmers.  It is written
      with experienced programmers in mind and novices will do best to
      avoid it.
   </em></blockquote>

<hr />
<h2><a name="6">Nathan Myers on Locales</a></h2>
   <p>An article entitled "The Standard C++ Locale" was
      published in Dr. Dobb's Journal and can be found
      <a href="http://www.cantrip.org/locale.html">here</a>.
   </p>

<hr />
<h2><a name="7">Correct Transformations</a></h2>
   <!-- Jumping directly to here from chapter 21. -->
   <p>A very common question on newsgroups and mailing lists is, "How
      do I do &lt;foo&gt; to a character string?" where &lt;foo&gt; is
      a task such as changing all the letters to uppercase or lowercase,
      testing for digits, etc.  A skilled and conscientious programmer
      will follow the question with another, "And how do I make the
      code portable?"
   </p>
   <p>(Poor innocent programmer, you have no idea the depths of trouble
      you are getting yourself into.  'Twould be best for your sanity if
      you dropped the whole idea and took up basket weaving instead.  No?
      Fine, you asked for it...)
   </p>
   <p>Changing the case of a letter, or classifying a character
      as numeric, graphical, etc., depends on the cultural context of the
      program at runtime.  So, first you must take the portability question
      into account.  Only once you have localized the program to a particular
      natural language can you perform the specific task.
      Unfortunately, specializing a function for a human language is not
      as simple as declaring
      <code> extern "Danish" int tolower (int); </code>.
   </p>
   <p>The C++ code to do all this proceeds in the same way.  First, a locale
      is created.  Then functions that consult that locale are called to
      perform the individual tasks.  Continuing the example from Chapter 21,
      we wish to use the following convenience functions:
   </p>
   <pre>
   namespace std {
     template &lt;class charT&gt;
       charT
       toupper (charT c, const locale&amp; loc);
     template &lt;class charT&gt;
       charT
       tolower (charT c, const locale&amp; loc);
   }</pre>
   <p>
      Each of these functions extracts the
      <code>ctype&lt;charT&gt;</code> "facet" from the
      locale <em>loc</em> and calls the member function of the same name
      on that facet, passing <em>c</em> as its argument.  The resulting
      character is returned.
   </p>
   <p>For the C/POSIX locale, the results are the same as calling the
      classic C <code>toupper/tolower</code> function that was used in previous
      examples.  For other locales, the code should Do The Right Thing.
   </p>
   <p>Of course, these functions take a second argument, whereas the
      operation that <code>std::transform</code> applies to each element
      is called with a single argument.  So we write simple wrapper
      structs that hold the locale and supply it on every call.
   </p>
   <p>The next-to-final version of the code started in Chapter 21 looks like:
   </p>
   <pre>
   #include &lt;iterator&gt;    // for back_inserter
   #include &lt;locale&gt;      // for locale, and the two-argument toupper/tolower
   #include &lt;string&gt;
   #include &lt;algorithm&gt;   // for transform
   #include &lt;cctype&gt;      // old &lt;ctype.h&gt;

   // Unary function object: converts one char to upper case in a given locale.
   struct ToUpper
   {
     ToUpper(std::locale const&amp; l) : loc(l) {}
     char operator() (char c) const  { return std::toupper(c,loc); }
   private:
     // Held by value, so the object stays valid even if it was
     // constructed from a temporary locale.
     std::locale loc;
   };

   // Unary function object: converts one char to lower case in a given locale.
   struct ToLower
   {
     ToLower(std::locale const&amp; l) : loc(l) {}
     char operator() (char c) const  { return std::tolower(c,loc); }
   private:
     std::locale loc;
   };

   int main ()
   {
      std::string  s("Some Kind Of Initial Input Goes Here");
      ToUpper      up(std::locale::classic());
      ToLower      down(std::locale::classic());

      // Change everything into upper case.
      std::transform(s.begin(), s.end(), s.begin(), up);

      // Change everything into lower case.
      std::transform(s.begin(), s.end(), s.begin(), down);

      // Change everything back into upper case, but store the
      // result in a different string.
      std::string  capital_s;
      std::transform(s.begin(), s.end(), std::back_inserter(capital_s), up);
   }</pre>
   <p>The <code>ToUpper</code> and <code>ToLower</code> structs can be
      generalized for other character types by making <code>operator()</code>
      a member function template.
   </p>
   <p>The final version of the code uses <code>bind2nd</code> to eliminate
      the wrapper structs, but the resulting code is tricky.  I have not
      shown it here because no compilers currently available to me will
      handle it.
   </p>


<!-- ####################################################### -->

<hr />
<p class="fineprint"><em>
See <a href="../17_intro/license.html">license.html</a> for copying conditions.
Comments and suggestions are welcome, and may be sent to
<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
</em></p>


</body>
</html>