.\"     $NetBSD: nls.7,v 1.3 2003/04/14 06:47:12 gmcgarry Exp $
.\"
.\" Copyright (c) 2003 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Gregory McGarry.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"        This product includes software developed by the NetBSD
.\"        Foundation, Inc. and its contributors.
.\" 4. Neither the name of The NetBSD Foundation nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd February 12, 2003
.Dt NLS 7
.Os
.Sh NAME
.Nm NLS
.Nd Native Language Support Overview
.Sh DESCRIPTION
Native Language Support (NLS) provides commands for a single
worldwide operating system base.
An internationalized system has no built-in assumptions or dependencies
on language-specific or culture-specific conventions such as:
.Pp
.Bl -bullet -offset indent -compact
.It
Character classifications
.It
Character comparison rules
.It
Character collation order
.It
Numeric and monetary formatting
.It
Date and time formatting
.It
Message-text language
.It
Code sets
.El
.Pp
All information pertaining to cultural conventions and language is
obtained at program run time.
.Pp
.Dq Internationalization
(often abbreviated
.Dq i18n )
refers to the process by which system software is developed to support
multiple culture-specific and language-specific conventions.
This is a generalization process by which the system is freed from
dependence on English strings and other English-specific conventions.
.Dq Localization
(often abbreviated
.Dq l10n )
refers to the process by which the user environment is customized to
handle its input and output in a way appropriate for a specific language
and its cultural conventions.
This is a specialization process, by which generic methods already
implemented in an internationalized system are used in specific ways.
The formal description of cultural conventions for some country, together
with all associated translations targeted to the native language, is
called the
.Dq locale .
.Pp
.Nx
provides extensive support to programmers and system developers to
enable internationalized software to be developed.
.Nx
also supplies a large variety of locales for system localization.
.Ss Localization of Information
All locale information is accessible to programs at run time so that
data is processed and displayed correctly for specific cultural
conventions and language.
.Pp
A locale is divided into categories.
A category is a group of language-specific and culture-specific conventions
as outlined in the list above.
ISO C and POSIX together specify the following six standard categories
(LC_MESSAGES comes from POSIX, the other five from ISO C),
all of which are supported by
.Nx :
.Pp
.Bl -tag -compact -width LC_MONETARYXX
.It LC_COLLATE
string-collation order information
.It LC_CTYPE
character classification, case conversion, and other character attributes
.It LC_MESSAGES
the format for affirmative and negative responses
.It LC_MONETARY
rules and symbols for formatting monetary numeric information
.It LC_NUMERIC
rules and symbols for formatting nonmonetary numeric information
.It LC_TIME
rules and symbols for formatting time and date information
.El
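.Pp
As an illustration only, the following sketch shows how a program might
ask which locale is in effect for one of the categories listed above.
The helper names
.Fn category_from_name
and
.Fn current_locale_for
are hypothetical and are not part of any library; only
.Xr setlocale 3
and the
.Dv LC_*
constants from
.In locale.h
are standard.
.Bd -literal -offset indent
```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map a category name to its constant, -1 if unknown. */
int
category_from_name(const char *name)
{
	static const struct {
		const char *name;
		int value;
	} table[] = {
		{ "LC_COLLATE", LC_COLLATE },
		{ "LC_CTYPE", LC_CTYPE },
		{ "LC_MESSAGES", LC_MESSAGES },
		{ "LC_MONETARY", LC_MONETARY },
		{ "LC_NUMERIC", LC_NUMERIC },
		{ "LC_TIME", LC_TIME },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(name, table[i].name) == 0)
			return table[i].value;
	return -1;
}

/* Hypothetical helper: locale currently in effect for the named category. */
const char *
current_locale_for(const char *name)
{
	int category = category_from_name(name);

	if (category == -1)
		return NULL;
	/* A NULL locale argument queries the setting without changing it. */
	return setlocale(category, NULL);
}
```
.Ed
.Pp
A program that has not yet called
.Xr setlocale 3
runs in the C locale, so every category reports that locale until the
program selects another one.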
.Pp
Localization of the system is achieved by setting appropriate values
in environment variables to identify which locale should be used.
The environment variables have the same names as their respective
locale categories.
Additionally, the
.Ev LANG ,
.Ev LC_ALL ,
and
.Ev NLSPATH
environment variables are used.
The
.Ev NLSPATH
environment variable specifies a colon-separated list of directory names
where the message catalog files of the NLS database are located.
The
.Ev LC_ALL
and
.Ev LANG
environment variables also determine the current locale.
.Pp
The value of each locale-selecting environment variable is a string of
the form:
.Pp
.Bd -literal
	language[_territory][.codeset][@modifier]
.Ed
.Pp
For example, the locale for the Danish language spoken in Denmark
using the ISO8859-1 code set is da_DK.ISO8859-1.
The da stands for the Danish language and the DK stands for Denmark.
The short form da_DK is sufficient to indicate this locale.
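.Pp
The name format above can be taken apart mechanically.
The following is a minimal sketch, not the parser used by the system:
the structure
.Vt struct locale_name ,
its field sizes, and the function
.Fn parse_locale_name
are all assumptions made for illustration.
.Bd -literal -offset indent
```c
#include <string.h>

/* Hypothetical container; the field sizes are arbitrary assumptions. */
struct locale_name {
	char language[16];
	char territory[16];
	char codeset[32];
	char modifier[32];
};

/* Copy the n bytes starting at src into dst, truncating to fit. */
static void
copy_part(char *dst, size_t dstlen, const char *src, size_t n)
{
	if (n >= dstlen)
		n = dstlen - 1;
	memcpy(dst, src, n);
	dst[n] = 0;
}

/* Split language[_territory][.codeset][@modifier] into its parts. */
void
parse_locale_name(const char *s, struct locale_name *out)
{
	size_t n;

	/* Parts that are absent from the name stay empty strings. */
	memset(out, 0, sizeof(*out));

	n = strcspn(s, "_.@");
	copy_part(out->language, sizeof(out->language), s, n);
	s += n;
	if (*s == '_') {
		s++;
		n = strcspn(s, ".@");
		copy_part(out->territory, sizeof(out->territory), s, n);
		s += n;
	}
	if (*s == '.') {
		s++;
		n = strcspn(s, "@");
		copy_part(out->codeset, sizeof(out->codeset), s, n);
		s += n;
	}
	if (*s == '@')
		copy_part(out->modifier, sizeof(out->modifier), s + 1,
		    strlen(s + 1));
}
```
.Ed
.Pp
Given da_DK.ISO8859-1, the sketch yields the language da, the territory DK,
and the code set ISO8859-1, with an empty modifier.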
.Pp
The environment variables are consulted in the following order of
priority:
.Pp
.Bl -bullet
.It
If the
.Ev LC_ALL
environment variable is set, all six categories use the locale it
specifies.
.It
If the
.Ev LC_ALL
environment variable is not set, each individual category uses the
locale specified by its corresponding environment variable.
.It
If the
.Ev LC_ALL
environment variable is not set, and a value for a particular
.Ev LC_*
environment variable is not set, the value of the
.Ev LANG
environment variable specifies the default locale for all categories.
Only the
.Ev LANG
environment variable should be set in
.Pa /etc/profile ,
since this makes it easiest for the user to override the system default
using the individual
.Ev LC_*
variables.
.It
If the
.Ev LC_ALL
environment variable is not set, a value for a particular
.Ev LC_*
environment variable is not set, and the value of the
.Ev LANG
environment variable is not set, the locale for that specific
category defaults to the C locale.
The C or POSIX locale assumes the 7-bit ASCII character set and defines
information for the six categories.
.El
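.Pp
The priority rules above can be sketched as a small function.
This is an illustration rather than the code used by
.Xr setlocale 3 ;
the name
.Fn effective_locale
is hypothetical, and the sketch assumes that a variable set to the empty
string counts as unset, as POSIX specifies.
.Bd -literal -offset indent
```c
#include <stddef.h>

/* Assumption: an empty value counts as unset, as POSIX specifies. */
static int
is_set(const char *value)
{
	return value != NULL && value[0] != 0;
}

/*
 * Hypothetical helper: pick the locale name for one category from the
 * values of LC_ALL, the category's own variable, and LANG, in that
 * order of priority, falling back to the C locale.
 */
const char *
effective_locale(const char *lc_all, const char *lc_category,
    const char *lang)
{
	if (is_set(lc_all))
		return lc_all;
	if (is_set(lc_category))
		return lc_category;
	if (is_set(lang))
		return lang;
	return "C";
}
```
.Ed
.Pp
A caller would typically pass the results of
.Xr getenv 3
for
.Ev LC_ALL ,
the category variable of interest, and
.Ev LANG .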
.Ss Code Sets
A character is any symbol used for the organization, control, or
representation of data.
A group of such symbols used to describe a
particular language makes up a character set.
A code set contains the encoding values (conversion from bits to
displayed characters) for a character set.
It is the encoding values in a code set that provide
the interface between the system and its input and output devices.
.Pp
The following code sets are supported in
.Nx :
.Bl -tag -width ISO8859_family
.It ISO8859 family
Industry-standard code sets are provided by means of the ISO8859
family of code sets, which provide a range of single-byte code set
support that includes Latin-1, Latin-2, Arabic, Cyrillic, Hebrew,
Greek, and Turkish.
.It eucJP
The eucJP code set is the industry-standard code set used to support
the Japanese locale.
.It Unicode
A Unicode environment based on the UTF-8 code set is supported for all
supported language and territory combinations.
UTF-8 provides character support for most of the major languages of the
world and can be used in environments where multiple languages must be
processed simultaneously.
.El
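.Pp
To make the notion of encoding values concrete, the following sketch
converts one ISO8859-1 byte to its UTF-8 encoding.
The same character can have different byte values in different code sets:
the letter ae (U+00E6) is the single byte 0xE6 in ISO8859-1 and the two
bytes 0xC3 0xA6 in UTF-8.
The function name
.Fn latin1_to_utf8
is hypothetical; real programs would normally use a conversion facility
such as
.Xr iconv 3
instead.
.Bd -literal -offset indent
```c
#include <stddef.h>

/*
 * Hypothetical helper: encode one ISO8859-1 byte as UTF-8.
 * Writes 1 or 2 bytes to out and returns the number written.
 */
size_t
latin1_to_utf8(unsigned char c, unsigned char out[2])
{
	if (c < 0x80) {
		/* The ASCII range is encoded identically in both code sets. */
		out[0] = c;
		return 1;
	}
	/* 0x80..0xFF map to U+0080..U+00FF, a two-byte UTF-8 sequence. */
	out[0] = (unsigned char)(0xC0 | (c >> 6));
	out[1] = (unsigned char)(0x80 | (c & 0x3F));
	return 2;
}
```
.Ed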
.Ss Internationalization for Programmers
To facilitate translations of messages into various languages and to
make the translated messages available to the program based on a
user's locale, it is necessary to keep messages separate from the
programs and provide them in the form of message catalogs that a
program can access at run time.
.Pp
Access to locale information is provided through the
.Xr setlocale 3
and
.Xr nl_langinfo 3
interfaces.
See their respective manual pages for further information.
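.Pp
A minimal sketch of how the two interfaces fit together follows.
The function name
.Fn radix_for_locale
is hypothetical;
.Dv RADIXCHAR
is one of the items that
.Xr nl_langinfo 3
can report, namely the decimal-point string of the current locale.
.Bd -literal -offset indent
```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: switch every category to the named locale and
 * return its radix (decimal point) string, or NULL if the locale is
 * not available.  Pass "" to select the locale from the environment.
 */
const char *
radix_for_locale(const char *name)
{
	if (setlocale(LC_ALL, name) == NULL)
		return NULL;
	return nl_langinfo(RADIXCHAR);
}
```
.Ed
.Pp
Most programs call
.Fn setlocale
once at startup with an empty locale name, so that the environment
variables described earlier take effect.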
.Pp
Message source files containing application messages are created by
the programmer and converted to message catalogs.
These catalogs are used by the application to retrieve and display
messages, as needed.
.Pp
.Nx
supports two message catalog interfaces: the X/Open
.Xr catgets 3
interface and the Uniforum
.Xr gettext 3
interface.
The
.Xr catgets 3
interface has the advantage that it belongs to a standard which is
well supported.
Unfortunately, the interface is complicated to use and
maintenance of the catalogs is difficult.
The implementation also does not support different code sets.
The
.Xr gettext 3
interface has not been standardized yet; however, it is supported
by an increasing number of systems.
It also provides many additional tools which make programming and
catalog maintenance much easier.
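.Pp
As a rough illustration of the
.Xr catgets 3
interface, the sketch below looks up one message and falls back to a
built-in string when no catalog can be opened.
The function name
.Fn lookup_message
and its parameters are hypothetical, as is any catalog name passed to it;
only
.Fn catopen ,
.Fn catgets ,
and
.Fn catclose
are part of the standard interface.
.Bd -literal -offset indent
```c
#include <nl_types.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy message msg_id of set set_id from the
 * named catalog into out.  If the catalog or the message is missing,
 * copy fallback instead.
 */
void
lookup_message(const char *catalog, int set_id, int msg_id,
    const char *fallback, char *out, size_t outlen)
{
	nl_catd catd = catopen(catalog, NL_CAT_LOCALE);

	if (catd == (nl_catd)-1) {
		/* No catalog could be opened: use the built-in text. */
		snprintf(out, outlen, "%s", fallback);
		return;
	}
	/* Copy before catclose, which may invalidate the pointer. */
	snprintf(out, outlen, "%s",
	    catgets(catd, set_id, msg_id, fallback));
	catclose(catd);
}
```
.Ed
.Pp
Keeping the default text in the program means it still produces usable
output when no translated catalog is installed.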
.Sh SEE ALSO
.Xr gencat 1 ,
.Xr catgets 3 ,
.Xr gettext 3 ,
.Xr nl_langinfo 3 ,
.Xr setlocale 3
.Sh BUGS
This man page is incomplete.
263