<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
   <meta name="AUTHOR" content="bkoz@redhat.com (Benjamin Kosnik)" />
   <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
   <meta name="DESCRIPTION" content="Notes on the messages implementation." />
   <title>Notes on the messages implementation.</title>
<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
<link rel="Start" href="../documentation.html" type="text/html"
  title="GNU C++ Standard Library" />
<link rel="Bookmark" href="howto.html" type="text/html" title="Localization" />
<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." />
</head>
<body>
<h1>
Notes on the messages implementation.
</h1>
<em>
prepared by Benjamin Kosnik (bkoz@redhat.com) on August 8, 2001
</em>

<h2>
1. Abstract
</h2>
<p>
The std::messages facet implements message retrieval functionality
equivalent to Java's java.text.MessageFormat, using either GNU gettext
or IEEE 1003.1-200x functions.
</p>

<h2>
2. What the standard says
</h2>
The std::messages facet is probably the most vaguely defined facet in
the standard library. It's assumed that this facility was built into
the standard library in order to convert string literals from one
locale to another. For instance, the "C" locale's
<code>const char* c = "please"</code> might be converted to a
German-localized <code>"bitte"</code> during program execution.

<blockquote>
22.2.7.1 - Template class messages [lib.locale.messages]
</blockquote>

<p>
This class has three public member functions, which directly
correspond to three protected virtual member functions.
</p>

<p>
The public member functions are:
</p>

<p>
<code>catalog open(const string&amp;, const locale&amp;) const</code>
</p>

<p>
<code>string_type get(catalog, int, int, const string_type&amp;) const</code>
</p>

<p>
<code>void close(catalog) const</code>
</p>

<p>
While the virtual functions are:
</p>

<p>
<code>catalog do_open(const string&amp;, const locale&amp;) const</code>
</p>
<blockquote>
<em>
-1- Returns: A value that may be passed to get() to retrieve a
message, from the message catalog identified by the string name
according to an implementation-defined mapping. The result can be used
until it is passed to close().  Returns a value less than 0 if no such
catalog can be opened.
</em>
</blockquote>

<p>
<code>string_type do_get(catalog, int, int, const string_type&amp;) const</code>
</p>
<blockquote>
<em>
-3- Requires: A catalog cat obtained from open() and not yet closed.
<br />
-4- Returns: A message identified by arguments set, msgid, and dfault,
according to an implementation-defined mapping. If no such message can
be found, returns dfault.
</em>
</blockquote>

<p>
<code>void do_close(catalog) const</code>
</p>
<blockquote>
<em>
-5- Requires: A catalog cat obtained from open() and not yet closed.
<br />
-6- Effects: Releases unspecified resources associated with cat.
<br />
-7- Notes: The limit on such resources, if any, is implementation-defined.
</em>
</blockquote>
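<p>
As a rough illustration of the interface quoted above, the following
sketch wraps one open/get/close round trip using only the standard
member functions. The helper name <code>translate_or_default</code> and
any catalog name passed to it are hypothetical. How a name maps to a
catalog is implementation-defined, so this is a sketch of the calling
pattern rather than a statement of what any particular library does.
</p>

```cpp
#include <locale>
#include <string>

// Hypothetical helper: look up `text` in the catalog called `catalog_name`
// and fall back to `text` itself when nothing better is available.
std::string
translate_or_default(const std::string& catalog_name,
                     const std::string& text,
                     const std::locale& loc = std::locale::classic())
{
  typedef std::messages<char> messages_type;
  const messages_type& facet = std::use_facet<messages_type>(loc);

  // A negative value means that no such catalog could be opened.
  messages_type::catalog cat = facet.open(catalog_name, loc);
  if (cat < 0)
    return text;

  // The set and msgid arguments are passed as 0 here; the last argument
  // is the default returned when no matching message is found.
  std::string result = facet.get(cat, 0, 0, text);
  facet.close(cat);
  return result;
}
```

<p>
Calling <code>translate_or_default("no-such-catalog", "please")</code>
should simply hand back <code>"please"</code>: either the catalog cannot
be opened, or it can but contains no matching message.
</p>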


<h2>
3. Problems with &quot;C&quot; messages: thread safety,
over-specification, and assumptions.
</h2>
Some notes on the standard.

<p>
First, why is <code>messages_base::catalog</code> specified as a typedef
to int? This makes sense for implementations that use
<code>catopen</code>, but not for others. Fortunately, it's not heavily
used and so is only a minor irritant.
</p>

<p>
Second, because the member functions are <code>const</code>, they
cannot save state in the facet through ordinary data members. Thus,
information used in the 'open' member function cannot simply be stored
away for use in 'get'. This is unfortunate.
</p>
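<p>
One possible workaround for the <code>const</code> restriction, sketched
here under stated assumptions, is a derived facet that keeps its
bookkeeping in <code>mutable</code> members. The class name
<code>in_memory_messages</code>, the catalog name
<code>"demo_de"</code>, and the hard-coded table are all hypothetical,
and the sketch is not how any of the shipped models work. It also makes
the thread-safety concern visible: the mutable map is unprotected.
</p>

```cpp
#include <cstddef>
#include <locale>
#include <map>
#include <string>

// Hypothetical facet that serves translations from an in-memory table.
class in_memory_messages : public std::messages<char>
{
public:
  explicit in_memory_messages(std::size_t refs = 0)
  : std::messages<char>(refs), next_id_(0) { }

protected:
  // const, yet it records state: only possible because the members
  // below are declared mutable.
  virtual catalog
  do_open(const std::string& name, const std::locale&) const
  {
    if (name != "demo_de")
      return -1;                      // no such catalog
    const catalog id = next_id_++;
    table_type& table = open_catalogs_[id];
    table["please"] = "bitte";
    table["thank you"] = "danke";
    return id;
  }

  // The default string is used as the lookup key; set and msgid are ignored.
  virtual string_type
  do_get(catalog cat, int, int, const string_type& dfault) const
  {
    std::map<catalog, table_type>::const_iterator c = open_catalogs_.find(cat);
    if (c == open_catalogs_.end())
      return dfault;
    table_type::const_iterator m = c->second.find(dfault);
    return m == c->second.end() ? dfault : m->second;
  }

  virtual void
  do_close(catalog cat) const
  { open_catalogs_.erase(cat); }

private:
  typedef std::map<string_type, string_type> table_type;

  // Not protected by a lock: concurrent use of one facet would race.
  mutable std::map<catalog, table_type> open_catalogs_;
  mutable catalog next_id_;
};
```

<p>
Such a facet could be installed with
<code>std::locale loc(std::locale::classic(), new in_memory_messages);</code>
and then reached through the usual
<code>use_facet&lt;messages&lt;char&gt; &gt;(loc)</code> call.
</p>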

<p>
The 'open' member function in particular seems to be oddly
designed, and its signature is peculiar. Why specify a <code>const
string&amp;</code> argument, for instance, instead of just <code>const
char*</code>? Or, why specify a <code>const locale&amp;</code> argument that is
to be used in the 'get' member function? How, exactly, is this locale
argument useful? What was the intent? It might make sense if a locale
argument was associated with a given default message string in the
'open' member function, for instance. Quite murky and unclear, on
reflection.
</p>

<p>
Third, it seems odd that messages, which explicitly require code
conversion, don't use the codecvt facet. Because the messages facet
has only one template parameter, it is assumed that ctype, and not
codecvt, is to be used to convert between character sets.
</p>

<p>
Finally, it is implicitly assumed that the default message string
passed to 'get' is in the "C" locale. Thus, all source code is assumed
to be written in English, so translations are always from "en_US" to
other, explicitly named locales.
</p>

<h2>
4. Design and Implementation Details
</h2>
This is a relatively simple class, on the face of it. The standard
specifies very little in concrete terms, so generic implementations
that are conforming yet do very little are the norm. Adding
functionality that would be useful to programmers and comparable to
Java's java.text.MessageFormat takes a bit of work, and is highly
dependent on the capabilities of the underlying operating system.

<p>
Three different mechanisms have been provided, selectable via
configure flags:
</p>

<ul>
   <li> generic
   <p>
   This model does very little, and is what is used by default.
   </p>
   </li>

   <li> gnu
   <p>
   The gnu model is complete and fully tested. It's based on the
   GNU gettext package, which is part of glibc. It uses the functions
   <code>textdomain, bindtextdomain, gettext</code>
   to implement full functionality. Creating message
   catalogs is a relatively straightforward process and is
   lightly documented below, and fully documented in gettext's
   distributed documentation.
   </p>
   </li>

   <li> ieee_1003.1-200x
   <p>
   This is a complete, though untested, implementation based on
   the IEEE standard. The functions
   <code>catopen, catgets, catclose</code>
   are used to retrieve locale-specific messages given the
   appropriate message catalogs that have been constructed for
   their use. Note that the script <code>po2msg.sed</code>, which is part
   of the gettext distribution, can convert gettext catalogs into
   catalogs that <code>catopen</code> can use.
   </p>
   </li>
</ul>

<p>
A new, standards-conformant non-virtual member function signature was
added for 'open' so that a directory could be specified with a given
message catalog. This simplifies calling conventions for the gnu
model.
</p>

<p>
The rest of this document discusses details of the GNU model.
</p>

<p>
The messages facet, because it is retrieving and converting between
character sets, depends on the ctype and perhaps the codecvt facet in
a given locale. In addition, underlying "C" library locale support is
necessary for more than just the <code>LC_MESSAGES</code> mask:
<code>LC_CTYPE</code> is also necessary. To avoid any unpleasantness, all
bits of the "C" mask (i.e. <code>LC_ALL</code>) are set before retrieving
messages.
</p>
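<p>
Setting the "C" mask before retrieving a message amounts to a save,
set, restore sequence on the global "C" locale. The sketch below shows
that general pattern as a small guard class. The name
<code>scoped_c_locale</code> is hypothetical and this is not the
library's own code. Because <code>setlocale</code> changes process-wide
state, the pattern is not safe when several threads use it at once.
</p>

```cpp
#include <clocale>
#include <string>

// Hypothetical RAII guard: switch the global "C" locale for one scope,
// then restore whatever was active before.
class scoped_c_locale
{
public:
  scoped_c_locale(int category, const char* name)
  : category_(category), saved_(current(category)),
    changed_(std::setlocale(category, name) != 0)
  { }

  ~scoped_c_locale()
  { std::setlocale(category_, saved_.c_str()); }

  // False if the requested locale is not available on this system.
  bool changed() const { return changed_; }

private:
  scoped_c_locale(const scoped_c_locale&);            // not copyable
  scoped_c_locale& operator=(const scoped_c_locale&); // not assignable

  // Copy the current locale name, since the C library may reuse the
  // buffer returned by setlocale on the next call.
  static std::string current(int category)
  {
    const char* name = std::setlocale(category, 0);
    return name ? name : "C";
  }

  int category_;
  std::string saved_;   // initialized before changed_, so it holds the old name
  bool changed_;
};
```

<p>
A message lookup would then sit inside a block that begins with
something like <code>scoped_c_locale guard(LC_ALL, "de_DE");</code>,
with the previous locale coming back when the block ends.
</p>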

<p>
Making the message catalogs can be initially tricky, but becomes quite
simple with practice. For complete info, see the gettext
documentation. Here's an idea of what is required:
</p>

<ul>
   <li> Make a source file with the required string literals
   that need to be translated. See
   <code>intl/string_literals.cc</code> for an example.
   </li>

   <li> Make initial catalog (see "4 Making the PO Template File"
   from the gettext docs).
   <p>
   <code>xgettext --c++ --debug string_literals.cc -o libstdc++.pot</code>
   </p>
   </li>

   <li> Make language and country-specific locale catalogs.
   <p>
   <code>cp libstdc++.pot fr_FR.po</code>
   </p>
   <p>
   <code>cp libstdc++.pot de_DE.po</code>
   </p>
   </li>

   <li> Edit localized catalogs in emacs so that strings are
   translated.
   <p>
   <code>emacs fr_FR.po</code>
   </p>
   </li>

   <li> Make the binary mo files.
   <p>
   <code>msgfmt fr_FR.po -o fr_FR.mo</code>
   </p>
   <p>
   <code>msgfmt de_DE.po -o de_DE.mo</code>
   </p>
   </li>

   <li> Copy the binary files into the correct directory structure.
   The file name must match the catalog name that is later passed to
   'open'.
   <p>
   <code>cp fr_FR.mo (dir)/fr_FR/LC_MESSAGES/libstdc++.mo</code>
   </p>
   <p>
   <code>cp de_DE.mo (dir)/de_DE/LC_MESSAGES/libstdc++.mo</code>
   </p>
   </li>

   <li> Use the new message catalogs.
   <p>
   <code>locale loc_de("de_DE");</code>
   </p>
   <p>
   <code>
   use_facet&lt;messages&lt;char&gt; &gt;(loc_de).open("libstdc++", locale(), dir);
   </code>
   </p>
   </li>
</ul>
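<p>
The directory structure used in the copy step follows the usual gettext
layout: a base directory, then the locale name, then
<code>LC_MESSAGES</code>, then a file named after the message domain.
The sketch below spells that out. The helper name
<code>catalog_path</code> is hypothetical, and the sketch assumes that
the catalog name given to 'open' is used as the gettext domain. It
ignores the fallback locale names that gettext may also try.
</p>

```cpp
#include <string>

// Hypothetical helper: the conventional location of a compiled catalog,
// (dir)/(locale)/LC_MESSAGES/(domain).mo
std::string
catalog_path(const std::string& dir,
             const std::string& locale_name,
             const std::string& domain)
{
  return dir + "/" + locale_name + "/LC_MESSAGES/" + domain + ".mo";
}
```

<p>
For example, <code>catalog_path(dir, "de_DE", "libstdc++")</code> names
the file that a German lookup in the "libstdc++" domain would be
expected to read.
</p>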

<h2>
5. Examples
</h2>

<ul>
   <li> Message conversion, simple example using the GNU model.

<pre>
#include &lt;iostream&gt;
#include &lt;locale&gt;
#include &lt;string&gt;
using namespace std;

void test01()
{
  typedef messages&lt;char&gt;::catalog catalog;
  // Directory holding the (locale)/LC_MESSAGES/*.mo files built above.
  const char* dir =
  "/mnt/egcs/build/i686-pc-linux-gnu/libstdc++-v3/po/share/locale";
  const locale loc_de("de_DE");
  const messages&lt;char&gt;&amp; mssg_de = use_facet&lt;messages&lt;char&gt; &gt;(loc_de);

  // The three-argument open() is the additional overload described in
  // section 4; it takes the catalog directory as its last argument.
  catalog cat_de = mssg_de.open("libstdc++", loc_de, dir);
  // The default string doubles as the lookup key and the fallback.
  string s01 = mssg_de.get(cat_de, 0, 0, "please");
  string s02 = mssg_de.get(cat_de, 0, 0, "thank you");
  cout &lt;&lt; "please in German: " &lt;&lt; s01 &lt;&lt; '\n';
  cout &lt;&lt; "thank you in German: " &lt;&lt; s02 &lt;&lt; '\n';
  mssg_de.close(cat_de);
}
</pre>
   </li>
</ul>

More information can be found in the following testcases:
<ul>
<li> testsuite/22_locale/messages.cc              </li>
<li> testsuite/22_locale/messages_byname.cc       </li>
<li> testsuite/22_locale/messages_char_members.cc </li>
</ul>

<h2>
6. Unresolved Issues
</h2>
<ul>
<li>  Things that are sketchy, or remain unimplemented:
   <ul>
      <li>_M_convert_from_char, _M_convert_to_char are in
      flux, depending on how the library ends up doing
      character set conversions. It might not be possible to
      do a real character set based conversion, due to the
      fact that the template parameter for messages is not
      enough to instantiate the codecvt facet (1 supplied,
      need at least 2 but would prefer 3).
      </li>

      <li> There are issues with gettext needing the global
      locale set to extract a message. This dependence on
      the global locale makes the current "gnu" model not
      MT-safe. Future versions of glibc, i.e. glibc 2.3.x, will
      fix this, and the C++ library bits are already in
      place.
      </li>
   </ul>
</li>

<li>  Development versions of the GNU "C" library (glibc 2.3) will allow
   a more efficient, MT-safe implementation of std::messages, and will
   allow the removal of the _M_name_messages data member. If this
   is done, it will change the library ABI. The C++ parts to
   support glibc 2.3 have already been coded, but are not in use:
   once this version of the "C" library is released, the marked
   parts of the messages implementation can be switched over to
   the new "C" library functionality.
</li>
<li>  At some point in the near future, std::numpunct will probably use
   std::messages facilities to implement truename/falsename
   correctly. This is currently not done, but entries in
   libstdc++.pot have already been made for "true" and "false"
   string literals, so all that remains is the std::numpunct
   coding and the configure/make hassles to make the installed
   library search its own catalog. Currently the libstdc++.mo
   catalog is only searched for the testsuite cases involving
   messages members.
</li>

<li>  The following member functions:

   <p>
   <code>
   catalog
   open(const basic_string&lt;char&gt;&amp; __s, const locale&amp; __loc) const;
   </code>
   </p>

   <p>
   <code>
   catalog
   open(const basic_string&lt;char&gt;&amp;, const locale&amp;, const char*) const;
   </code>
   </p>

   <p>
   do not actually return a "value less than 0 if no such catalog
   can be opened", as required by the standard, in the "gnu"
   model. As of this writing, it is unknown how to query whether
   a specified message catalog exists using the gettext
   package.
   </p>
</li>
</ul>

<h2>
7. Acknowledgments
</h2>
Ulrich Drepper for the character set explanations, gettext details,
and patient answering of late-night questions; Tom Tromey for the Java
details.


<h2>
8. Bibliography / Referenced Documents
</h2>

Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapter
&quot;7 Locales and Internationalization&quot;.

<p>
Drepper, Ulrich, Thread-Aware Locale Model, A proposal. This is a
draft document describing the design of glibc 2.3 MT locale
functionality.
</p>

<p>
Drepper, Ulrich, Numerous, late-night email correspondence.
</p>

<p>
ISO/IEC 9899:1999 Programming languages - C
</p>

<p>
ISO/IEC 14882:1998 Programming languages - C++
</p>

<p>
Java 2 Platform, Standard Edition, v 1.3.1 API Specification. In
particular, java.util.Properties, java.text.MessageFormat,
java.util.Locale, java.util.ResourceBundle.
http://java.sun.com/j2se/1.3/docs/api
</p>

<p>
System Interface Definitions, Issue 7 (IEEE Std. 1003.1-200x)
The Open Group/The Institute of Electrical and Electronics Engineers, Inc.
In particular see lines 5268-5427.
http://www.opennc.org/austin/docreg.html
</p>

<p>
GNU gettext tools, version 0.10.38, Native Language Support
Library and Tools.
http://sources.redhat.com/gettext
</p>

<p>
Langer, Angelika and Klaus Kreft, Standard C++ IOStreams and Locales,
Advanced Programmer's Guide and Reference, Addison Wesley Longman,
Inc. 2000. See page 725, Internationalized Messages.
</p>

<p>
Stroustrup, Bjarne, Appendix D, The C++ Programming Language, Special
Edition, Addison Wesley, Inc. 2000.
</p>

</body>
</html>