.\" $NetBSD: nls.7,v 1.3 2003/04/14 06:47:12 gmcgarry Exp $
.\"
.\" Copyright (c) 2003 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Gregory McGarry.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"        This product includes software developed by the NetBSD
.\"        Foundation, Inc. and its contributors.
.\" 4. Neither the name of The NetBSD Foundation nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd February 12, 2003
.Dt NLS 7
.Os
.Sh NAME
.Nm NLS
.Nd National Language Support Overview
.Sh DESCRIPTION
National Language Support (NLS) provides commands for a single
worldwide operating system base.
An internationalized system has no built-in assumptions or dependencies
on language-specific or culture-specific conventions such as:
.Pp
.Bl -bullet -offset indent -compact
.It
Character classifications
.It
Character comparison rules
.It
Character collation order
.It
Numeric and monetary formatting
.It
Date and time formatting
.It
Message-text language
.It
Code sets
.El
.Pp
All information pertaining to cultural conventions and language is
obtained at program run time.
.Pp
.Dq Internationalization
(often abbreviated
.Dq i18n )
refers to the process by which system software is developed to support
multiple culture-specific and language-specific conventions.
This is a generalization process by which the system is freed from
relying only on English strings or other English-specific conventions.
.Dq Localization
(often abbreviated
.Dq l10n )
refers to the process by which the user environment is customized to
handle input and output in a manner appropriate for a specific language
and its cultural conventions.
This is a specialization process, by which generic methods already
implemented in an internationalized system are used in specific ways.
The formal description of cultural conventions for some country, together
with all associated translations targeted to the native language, is
called the
.Dq locale .
.Pp
.Nx
provides extensive support to programmers and system developers to
enable internationalized software to be developed.
.Nx
also supplies a large variety of locales for system localization.
.Ss Localization of Information
All locale information is accessible to programs at run time so that
data is processed and displayed correctly for specific cultural
conventions and language.
.Pp
A locale is divided into categories.
A category is a group of language-specific and culture-specific conventions
as outlined in the list above.
ISO C specifies five of the following six standard categories, and
POSIX adds
.Dv LC_MESSAGES ;
all six are supported by
.Nx :
.Pp
.Bl -tag -compact -width LC_MONETARYXX
.It LC_COLLATE
string-collation order information
.It LC_CTYPE
character classification, case conversion, and other character attributes
.It LC_MESSAGES
the format for affirmative and negative responses
.It LC_MONETARY
rules and symbols for formatting monetary numeric information
.It LC_NUMERIC
rules and symbols for formatting nonmonetary numeric information
.It LC_TIME
rules and symbols for formatting time and date information
.El
.Pp
Localization of the system is achieved by setting environment
variables that identify which locale should be used.
The environment variables have the same names as their respective
locale categories.
Additionally, the
.Ev LANG ,
.Ev LC_ALL ,
and
.Ev NLSPATH
environment variables are used.
The
.Ev NLSPATH
environment variable specifies a colon-separated list of directory names
where the message catalog files of the NLS database are located.
The
.Ev LC_ALL
and
.Ev LANG
environment variables also determine the current locale.
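.Pp
The way these variables combine can be sketched in C.
The function below is a hypothetical helper, not part of any
.Nx
interface.
It assumes the conventional POSIX lookup order of
.Ev LC_ALL ,
then the category's own variable, then
.Ev LANG ,
then the C locale, and it assumes that an empty value counts as unset.
It only approximates what
.Xr setlocale 3
is expected to do when it is called with an empty locale string.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the locale name that the environment selects for one category.
 * category_var names the category's own variable, for example "LC_TIME".
 * Variables that are unset or empty are skipped (assumed POSIX behavior);
 * the C locale is the final fallback.
 */
const char *
env_locale_for(const char *category_var)
{
	const char *names[3];
	const char *value;
	size_t i;

	assert(category_var != NULL);
	names[0] = "LC_ALL";		/* overrides every category */
	names[1] = category_var;	/* the category's own setting */
	names[2] = "LANG";		/* default for all categories */
	for (i = 0; i < 3; i++) {
		value = getenv(names[i]);
		if (value != NULL && strlen(value) > 0)
			return value;
	}
	return "C";
}
```
.Ed
.Pp
For example, with
.Ev LANG
set to da_DK.ISO8859-1 and no other locale variable set, passing
LC_TIME to this helper would yield da_DK.ISO8859-1.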
.Pp
The value of each of these environment variables is a string of the form:
.Pp
.Bd -literal
    language[_territory][.codeset][@modifier]
.Ed
.Pp
For example, the locale for the Danish language spoken in Denmark
using the ISO8859-1 code set is da_DK.ISO8859-1.
The da stands for the Danish language and the DK stands for Denmark.
The short form da_DK is sufficient to indicate this locale.
.Pp
The environment variables are consulted in the following order of
priority:
.Pp
.Bl -bullet
.It
If the
.Ev LC_ALL
environment variable is set, all six categories use the locale it
specifies.
.It
If the
.Ev LC_ALL
environment variable is not set, each individual category uses the
locale specified by its corresponding environment variable.
.It
If the
.Ev LC_ALL
environment variable is not set, and a value for a particular
.Ev LC_*
environment variable is not set, the value of the
.Ev LANG
environment variable specifies the default locale for all categories.
Only the
.Ev LANG
environment variable should be set in
.Pa /etc/profile ,
since this makes it easiest for the user to override the system default
using the individual
.Ev LC_*
variables.
.It
If the
.Ev LC_ALL
environment variable is not set, a value for a particular
.Ev LC_*
environment variable is not set, and the value of the
.Ev LANG
environment variable is not set, the locale for that specific
category defaults to the C locale.
The C or POSIX locale assumes the 7-bit ASCII character set and defines
information for the six categories.
.El
.Ss Code Sets
A character is any symbol used for the organization, control, or
representation of data.
A group of such symbols used to describe a
particular language makes up a character set.
A code set contains the encoding values (conversion from bits to
displayed characters) for a character set.
It is the encoding values in a code set that provide
the interface between the system and its input and output devices.
.Pp
The following code sets are supported in
.Nx :
.Bl -tag -width ISO8859_family
.It ISO8859 family
Industry-standard code sets are provided by means of the ISO8859
family of code sets, which provide a range of single-byte code set
support that includes Latin-1, Latin-2, Arabic, Cyrillic, Hebrew,
Greek, and Turkish.
The eucJP code set is the industry-standard code set used to support
the Japanese locale.
.It Unicode
A Unicode environment based on the UTF-8 code set is supported for all
supported languages and territories.
UTF-8 provides character support for most of the major languages of the
world and can be used in environments where multiple languages must be
processed simultaneously.
.El
.Ss Internationalization for Programmers
To facilitate translations of messages into various languages and to
make the translated messages available to the program based on a
user's locale, it is necessary to keep messages separate from the
programs and provide them in the form of message catalogs that a
program can access at run time.
.Pp
Access to locale information is provided through the
.Xr setlocale 3
and
.Xr nl_langinfo 3
interfaces.
See their respective man pages for further information.
.Pp
Message source files containing application messages are created by
the programmer and converted to message catalogs.
These catalogs are used by the application to retrieve and display
messages, as needed.
.Pp
.Nx
supports two message catalog interfaces: the X/Open
.Xr catgets 3
interface and the Uniforum
.Xr gettext 3
interface.
The
.Xr catgets 3
interface has the advantage that it belongs to a standard which is
well supported.
Unfortunately, the interface is complicated to use and
maintenance of the catalogs is difficult.
The implementation also does not support different code sets.
The
.Xr gettext 3
interface has not been standardized yet; however, it is supported
by an increasing number of systems.
It also provides many additional tools which make programming and
catalog maintenance much easier.
.Sh SEE ALSO
.Xr gencat 1 ,
.Xr catgets 3 ,
.Xr gettext 3 ,
.Xr nl_langinfo 3 ,
.Xr setlocale 3
.Sh BUGS
This man page is incomplete.
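.Sh EXAMPLES
The
.Xr catgets 3
interface described under
.Sx Internationalization for Programmers
can be sketched as follows.
This is a minimal illustration under stated assumptions, not a
recommended implementation: the function name
.Fn fetch_message
and its parameters are hypothetical, and the catalog is assumed to have
been built with
.Xr gencat 1
and to be reachable through
.Ev NLSPATH .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <nl_types.h>
#include <stdio.h>

/*
 * Copy message msg_id of set set_id from the named catalog into buf.
 * If the catalog cannot be opened, or it lacks the message, the
 * fallback text is copied instead.
 * Returns 1 if the catalog was opened and 0 otherwise.
 */
int
fetch_message(const char *catalog, int set_id, int msg_id,
    const char *fallback, char *buf, size_t buflen)
{
	nl_catd catd;

	assert(buf != NULL && buflen > 0);
	catd = catopen(catalog, NL_CAT_LOCALE);
	if (catd == (nl_catd)-1) {
		snprintf(buf, buflen, "%s", fallback);
		return 0;
	}
	/* catgets() hands back fallback when the lookup fails. */
	snprintf(buf, buflen, "%s",
	    catgets(catd, set_id, msg_id, fallback));
	catclose(catd);
	return 1;
}
```
.Ed
.Pp
The text is copied into the caller's buffer before the catalog is
closed, so the caller never depends on storage that may belong to the
catalog.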