<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
   <meta name="AUTHOR" content="bkoz@redhat.com (Benjamin Kosnik)" />
   <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
   <meta name="DESCRIPTION" content="Notes on the messages implementation." />
   <title>Notes on the messages implementation.</title>
<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
<link rel="Start" href="../documentation.html" type="text/html"
  title="GNU C++ Standard Library" />
<link rel="Bookmark" href="howto.html" type="text/html" title="Localization" />
<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." />
</head>
<body>
<h1>
Notes on the messages implementation.
</h1>
<em>
prepared by Benjamin Kosnik (bkoz@redhat.com) on August 8, 2001
</em>

<h2>
1. Abstract
</h2>
<p>
The std::messages facet implements message retrieval functionality
equivalent to Java's java.text.MessageFormat, using either GNU gettext
or IEEE 1003.1-200x functions.
</p>

<h2>
2. What the standard says
</h2>
The std::messages facet is probably the most vaguely defined facet in
the standard library. It's assumed that this facility was built into
the standard library in order to convert string literals from one
locale to another. For instance, converting the "C" locale's
<code>const char* c = "please"</code> to a German-localized <code>"bitte"</code>
during program execution.
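<p>
As a rough sketch of that idea, the fragment below looks up one string
using only the standard <code>open</code>, <code>get</code> and
<code>close</code> members. The helper name
<code>translate_or_default</code> is hypothetical and not part of the
library, and the set and message id values of 0 are placeholders: how
they map to a message is implementation-defined.
</p>

```cpp
#include <locale>
#include <string>

// Hypothetical helper (not part of the library): look up `text` in the
// catalog called `catalog_name`, using the messages facet of `loc`.
// Returns `text` unchanged when no translation is available.
std::string translate_or_default(const std::locale& loc,
                                 const std::string& catalog_name,
                                 const std::string& text)
{
  typedef std::messages<char> messages_type;
  const messages_type& msgs = std::use_facet<messages_type>(loc);

  // The standard says open() returns a value less than 0 on failure.
  messages_type::catalog cat = msgs.open(catalog_name, loc);
  if (cat < 0)
    return text;

  // Set 0 and message id 0 are placeholders (an assumption); `text`
  // is the default that get() returns when no message is found.
  std::string result = msgs.get(cat, 0, 0, text);
  msgs.close(cat);
  return result;
}
```

<p>
Assuming a German locale and a matching catalog are installed, a call
such as <code>translate_or_default(std::locale("de_DE"), "libstdc++",
"please")</code> would be expected to return the translated string.
Without a catalog, the default text comes back unchanged.
</p>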

<blockquote>
22.2.7.1 - Template class messages [lib.locale.messages]
</blockquote>

This class has three public member functions, which directly
correspond to three protected virtual member functions.

The public member functions are:

<p>
<code>catalog open(const string&amp;, const locale&amp;) const</code>
</p>

<p>
<code>string_type get(catalog, int, int, const string_type&amp;) const</code>
</p>

<p>
<code>void close(catalog) const</code>
</p>

<p>
While the virtual functions are:
</p>

<p>
<code>catalog do_open(const string&amp;, const locale&amp;) const</code>
</p>
<blockquote>
<em>
-1- Returns: A value that may be passed to get() to retrieve a
message, from the message catalog identified by the string name
according to an implementation-defined mapping. The result can be used
until it is passed to close().  Returns a value less than 0 if no such
catalog can be opened.
</em>
</blockquote>

<p>
<code>string_type do_get(catalog, int, int, const string_type&amp;) const</code>
</p>
<blockquote>
<em>
-3- Requires: A catalog cat obtained from open() and not yet closed.
-4- Returns: A message identified by arguments set, msgid, and dfault,
according to an implementation-defined mapping. If no such message can
be found, returns dfault.
</em>
</blockquote>

<p>
<code>void do_close(catalog) const</code>
</p>
<blockquote>
<em>
-5- Requires: A catalog cat obtained from open() and not yet closed.
-6- Effects: Releases unspecified resources associated with cat.
-7- Notes: The limit on such resources, if any, is implementation-defined.
</em>
</blockquote>


<h2>
3. Problems with "C" messages: thread safety,
over-specification, and assumptions.
</h2>
A couple of notes on the standard.

<p>
First, why is <code>messages_base::catalog</code> specified as a typedef
to int?
This makes sense for implementations that use
<code>catopen</code>, but not for others. Fortunately, it is not heavily
used, and so is only a minor irritant.
</p>

<p>
Second, because the member functions are <code>const</code>, it is
impossible to save state in them. Thus, storing away information used
in the 'open' member function for use in 'get' is impossible. This is
unfortunate.
</p>

<p>
The 'open' member function in particular seems to be oddly
designed. The signature seems quite peculiar. Why specify a <code>const
string&amp;</code> argument, for instance, instead of just <code>const
char*</code>? Or, why specify a <code>const locale&amp;</code> argument that is
to be used in the 'get' member function? How, exactly, is this locale
argument useful? What was the intent? It might make sense if a locale
argument were associated with a given default message string in the
'open' member function, for instance. Quite murky and unclear, on
reflection.
</p>

<p>
Lastly, it seems odd that messages, which explicitly require code
conversion, don't use the codecvt facet. Because the messages facet
has only one template parameter, it is assumed that ctype, and not
codecvt, is to be used to convert between character sets.
</p>

<p>
It is implicitly assumed that the default message string in 'get' is
in the "C" locale. Thus, all source code is assumed to be written in
English, so translations are always from "en_US" to other, explicitly
named locales.
</p>

<h2>
4. Design and Implementation Details
</h2>
This is a relatively simple class, on the face of it. The standard
specifies very little in concrete terms, so generic implementations
that are conforming yet do very little are the norm.
Adding
functionality that would be useful to programmers and comparable to
Java's java.text.MessageFormat takes a bit of work, and is highly
dependent on the capabilities of the underlying operating system.

<p>
Three different mechanisms have been provided, selectable via
configure flags:
</p>

<ul>
   <li> generic
   <p>
   This model does very little, and is what is used by default.
   </p>
   </li>

   <li> gnu
   <p>
   The gnu model is complete and fully tested. It's based on the
   GNU gettext package, which is part of glibc. It uses the functions
   <code>textdomain, bindtextdomain, gettext</code>
   to implement full functionality. Creating message
   catalogs is a relatively straightforward process and is
   lightly documented below, and fully documented in gettext's
   distributed documentation.
   </p>
   </li>

   <li> ieee_1003.1-200x
   <p>
   This is a complete, though untested, implementation based on
   the IEEE standard. The functions
   <code>catopen, catgets, catclose</code>
   are used to retrieve locale-specific messages given the
   appropriate message catalogs that have been constructed for
   their use. Note that the script <code>po2msg.sed</code>, which is part
   of the gettext distribution, can convert gettext catalogs into
   catalogs that <code>catopen</code> can use.
   </p>
   </li>
</ul>

<p>
A new, standards-conformant non-virtual member function signature was
added for 'open' so that a directory could be specified with a given
message catalog. This simplifies calling conventions for the gnu
model.
</p>

<p>
The rest of this document discusses details of the GNU model.
</p>

<p>
The messages facet, because it is retrieving and converting between
character sets, depends on the ctype and perhaps the codecvt facet in
a given locale.
In addition, underlying "C" library locale support is
necessary for more than just the <code>LC_MESSAGES</code> mask:
<code>LC_CTYPE</code> is also necessary. To avoid any unpleasantness, all
bits of the "C" mask (i.e. <code>LC_ALL</code>) are set before retrieving
messages.
</p>

<p>
Making the message catalogs can be initially tricky, but becomes quite
simple with practice. For complete info, see the gettext
documentation. Here's an idea of what is required:
</p>

<ul>
   <li> Make a source file with the required string literals
   that need to be translated. See
   <code>intl/string_literals.cc</code> for an example.
   </li>

   <li> Make the initial catalog (see "4 Making the PO Template File"
   from the gettext docs).
   <p>
   <code>xgettext --c++ --debug string_literals.cc -o libstdc++.pot</code>
   </p>
   </li>

   <li> Make language and country-specific locale catalogs.
   <p>
   <code>cp libstdc++.pot fr_FR.po</code>
   </p>
   <p>
   <code>cp libstdc++.pot de_DE.po</code>
   </p>
   </li>

   <li> Edit localized catalogs in emacs so that strings are
   translated.
   <p>
   <code>emacs fr_FR.po</code>
   </p>
   </li>

   <li> Make the binary mo files.
   <p>
   <code>msgfmt fr_FR.po -o fr_FR.mo</code>
   </p>
   <p>
   <code>msgfmt de_DE.po -o de_DE.mo</code>
   </p>
   </li>

   <li> Copy the binary files into the correct directory structure.
   <p>
   <code>cp fr_FR.mo (dir)/fr_FR/LC_MESSAGES/libstdc++-v3.mo</code>
   </p>
   <p>
   <code>cp de_DE.mo (dir)/de_DE/LC_MESSAGES/libstdc++-v3.mo</code>
   </p>
   </li>

   <li> Use the new message catalogs.
   <p>
   <code>locale loc_de("de_DE");</code>
   </p>
   <p>
   <code>
   use_facet&lt;messages&lt;char&gt; &gt;(loc_de).open("libstdc++", locale(), dir);
   </code>
   </p>
   </li>
</ul>

<h2>
5. Examples
</h2>

<ul>
   <li> Message conversion: a simple example using the GNU model.

<pre>
#include &lt;iostream&gt;
#include &lt;locale&gt;
#include &lt;string&gt;
using namespace std;

void test01()
{
  typedef messages&lt;char&gt;::catalog catalog;
  // Directory that holds the (locale)/LC_MESSAGES/*.mo files.
  const char* dir =
   "/mnt/egcs/build/i686-pc-linux-gnu/libstdc++-v3/po/share/locale";
  const locale loc_de("de_DE");
  const messages&lt;char&gt;&amp; mssg_de = use_facet&lt;messages&lt;char&gt; &gt;(loc_de);

  // Open the catalog with the three-argument 'open' described in section 4.
  catalog cat_de = mssg_de.open("libstdc++", loc_de, dir);
  // The last argument is the default, returned if no message is found.
  string s01 = mssg_de.get(cat_de, 0, 0, "please");
  string s02 = mssg_de.get(cat_de, 0, 0, "thank you");
  cout &lt;&lt; "please in German:" &lt;&lt; s01 &lt;&lt; '\n';
  cout &lt;&lt; "thank you in German:" &lt;&lt; s02 &lt;&lt; '\n';
  mssg_de.close(cat_de);
}
</pre>
   </li>
</ul>

More information can be found in the following testcases:
<ul>
<li> testsuite/22_locale/messages.cc </li>
<li> testsuite/22_locale/messages_byname.cc </li>
<li> testsuite/22_locale/messages_char_members.cc </li>
</ul>

<h2>
6. Unresolved Issues
</h2>
<ul>
<li> Things that are sketchy, or remain unimplemented:
   <ul>
      <li>_M_convert_from_char, _M_convert_to_char are in
      flux, depending on how the library ends up doing
      character set conversions. It might not be possible to
      do a real character set based conversion, due to the
      fact that the template parameter for messages is not
      enough to instantiate the codecvt facet (1 supplied,
      need at least 2 but would prefer 3).
      </li>

      <li> There are issues with gettext needing the global
      locale set to extract a message. This dependence on
      the global locale means the current "gnu" model is not
      MT-safe. Future versions of glibc, i.e. glibc 2.3.x, will
      fix this, and the C++ library bits are already in
      place.
      </li>
   </ul>
</li>

<li> Development versions of the GNU "C" library (glibc 2.3) will allow
     a more efficient, MT-safe implementation of std::messages, and will
     allow the removal of the _M_name_messages data member.
If this
     is done, it will change the library ABI. The C++ parts to
     support glibc 2.3 have already been coded, but are not in use:
     once this version of the "C" library is released, the marked
     parts of the messages implementation can be switched over to
     the new "C" library functionality.
</li>
<li> At some point in the near future, std::numpunct will probably use
     std::messages facilities to implement truename/falsename
     correctly. This is currently not done, but entries in
     libstdc++.pot have already been made for "true" and "false"
     string literals, so all that remains is the std::numpunct
     coding and the configure/make hassles to make the installed
     library search its own catalog. Currently the libstdc++.mo
     catalog is only searched for the testsuite cases involving
     messages members.
</li>

<li> The following member functions:

   <p>
   <code>
        catalog
        open(const basic_string&lt;char&gt;&amp; __s, const locale&amp; __loc) const;
   </code>
   </p>

   <p>
   <code>
   catalog
   open(const basic_string&lt;char&gt;&amp;, const locale&amp;, const char*) const;
   </code>
   </p>

   <p>
   do not actually return a "value less than 0 if no such catalog
   can be opened", as required by the standard, in the "gnu"
   model. As of this writing, it is unknown how to query to see
   if a specified message catalog exists using the gettext
   package.
   </p>
</li>
</ul>

<h2>
7. Acknowledgments
</h2>
Ulrich Drepper for the character set explanations, gettext details,
and patient answering of late-night questions; Tom Tromey for the Java details.


<h2>
8. Bibliography / Referenced Documents
</h2>

Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapter
"7 Locales and Internationalization".

<p>
Drepper, Ulrich, Thread-Aware Locale Model, A proposal. This is a
draft document describing the design of glibc 2.3 MT locale
functionality.
</p>

<p>
Drepper, Ulrich, Numerous, late-night email correspondence
</p>

<p>
ISO/IEC 9899:1999 Programming languages - C
</p>

<p>
ISO/IEC 14882:1998 Programming languages - C++
</p>

<p>
Java 2 Platform, Standard Edition, v 1.3.1 API Specification. In
particular, java.util.Properties, java.text.MessageFormat,
java.util.Locale, java.util.ResourceBundle.
http://java.sun.com/j2se/1.3/docs/api
</p>

<p>
System Interface Definitions, Issue 7 (IEEE Std. 1003.1-200x)
The Open Group/The Institute of Electrical and Electronics Engineers, Inc.
In particular see lines 5268-5427.
http://www.opennc.org/austin/docreg.html
</p>

<p> GNU gettext tools, version 0.10.38, Native Language Support
Library and Tools.
http://sources.redhat.com/gettext
</p>

<p>
Langer, Angelika and Klaus Kreft, Standard C++ IOStreams and Locales,
Advanced Programmer's Guide and Reference, Addison Wesley Longman,
Inc. 2000. See page 725, Internationalized Messages.
</p>

<p>
Stroustrup, Bjarne, Appendix D, The C++ Programming Language, Special Edition, Addison Wesley, Inc. 2000
</p>

</body>
</html>