=head1 NAME

Locale::Script - ISO codes for script identification (ISO 15924)

=head1 SYNOPSIS

    use Locale::Script;
    use Locale::Constants;

    $script  = code2script('ph');                  # 'Phoenician'
    $code    = script2code('Tibetan');             # 'bo'
    $code3   = script2code('Tibetan',
                           LOCALE_CODE_ALPHA_3);   # 'bod'
    $codeN   = script2code('Tibetan',
                           LOCALE_CODE_NUMERIC);   # 330

    @codes   = all_script_codes();
    @scripts = all_script_names();


=head1 DESCRIPTION

The C<Locale::Script> module provides access to the ISO
codes for identifying scripts, as defined in ISO 15924.
For example, Egyptian hieroglyphs are denoted by the two-letter
code 'eg', the three-letter code 'egy', and the numeric code 050.

You can either access the codes via the conversion routines
(described below), or with the two functions which return lists
of all script codes or all script names.

There are three different code sets you can use for identifying
scripts:

=over 4

=item B<alpha-2>

Two-letter codes, such as 'bo' for Tibetan.
This code set is identified with the symbol C<LOCALE_CODE_ALPHA_2>.

=item B<alpha-3>

Three-letter codes, such as 'ell' for Greek.
This code set is identified with the symbol C<LOCALE_CODE_ALPHA_3>.

=item B<numeric>

Numeric codes, such as 410 for Hiragana.
This code set is identified with the symbol C<LOCALE_CODE_NUMERIC>.

=back

All of the routines take an optional additional argument
which specifies the code set to use.
If not specified, it defaults to the two-letter codes.
This is partly for backwards compatibility (previous versions
of Locale modules only supported the alpha-2 codes), and
partly because they are the most widely used codes.

The alpha-2 and alpha-3 codes are not case-dependent,
so you can use 'BO', 'Bo', 'bO' or 'bo' for Tibetan.
When a code is returned by one of the functions in
this module, it will always be lower-case.

=head2 SPECIAL CODES

The standard defines various special codes.

=over 4

=item *

The standard reserves codes in the ranges B<qa> - B<qt>,
B<qaa> - B<qat>, and B<900> - B<919> for private use.

=item *

B<zx>, B<zxx>, and B<997> are the codes for unwritten languages.

=item *

B<zy>, B<zyy>, and B<998> are the codes for an undetermined script.

=item *

B<zz>, B<zzz>, and B<999> are the codes for an uncoded script.

=back

The private codes are not recognised by C<Locale::Script>,
but the others are.


=head1 CONVERSION ROUTINES

There are three conversion routines: C<code2script()>, C<script2code()>,
and C<script_code2code()>.

=over 4

=item code2script( CODE, [ CODESET ] )

This function takes a script code and returns a string
which contains the name of the script identified.
If the code is not a valid script code, as defined by ISO 15924,
then C<undef> will be returned:

    $script = code2script('cy');   # Cyrillic

=item script2code( STRING, [ CODESET ] )

This function takes a script name and returns the corresponding
script code, if one exists.
If the argument could not be identified as a script name,
then C<undef> will be returned:

    $code = script2code('Gothic', LOCALE_CODE_ALPHA_3);
    # $code will now be 'gth'

The case of the script name is not important.
See the section L<KNOWN BUGS AND LIMITATIONS> below.

=item script_code2code( CODE, CODESET, CODESET )

This function takes a script code from one code set,
and returns the corresponding code from another code set.

    # Use plain commas here: a fat comma (=>) would quote the
    # constant name on its left as a string.
    $alpha2 = script_code2code('jwi',
                 LOCALE_CODE_ALPHA_3, LOCALE_CODE_ALPHA_2);
    # $alpha2 will now be 'jw' (Javanese)

If the code passed is not a valid script code in
the first code set, or if there isn't a code for the
corresponding script in the second code set,
then C<undef> will be returned.

=back


=head1 QUERY ROUTINES

There are two functions which can be used to obtain a list of all codes,
or all script names:

=over 4

=item C<all_script_codes ( [ CODESET ] )>

Returns a list of all script codes in the specified code set
(the two-letter codes by default).
The codes are guaranteed to be all lower-case,
and not in any particular order.

=item C<all_script_names ( [ CODESET ] )>

Returns a list of all script names for which there is a corresponding
script code in the specified code set.
The names are capitalised, and not returned in any particular order.

=back


=head1 EXAMPLES

The following example illustrates use of the C<code2script()> function.
The user is prompted for a script code, and then told the corresponding
script name:

    use Locale::Script;
    use Locale::Constants;

    $| = 1;   # turn off buffering

    print "Enter script code: ";
    chomp($code = <STDIN>);   # strip the trailing newline
    $script = code2script($code, LOCALE_CODE_ALPHA_2);
    if (defined $script)
    {
        print "$code = $script\n";
    }
    else
    {
        print "'$code' is not a valid script code!\n";
    }


=head1 KNOWN BUGS AND LIMITATIONS

=over 4

=item *

When using C<script2code()>, the script name must currently match
the name in the source of the module exactly, apart from case. For example,

    script2code('Egyptian hieroglyphs')

will return B<eg>, as expected. But the following will all return C<undef>:

    script2code('hieroglyphs')
    script2code('Egyptian Hieroglypics')

If there's need for it, a future version could have variants
for script names.

=item *

In the current implementation, all data is read in when the
module is loaded, and then held in memory.
A lazy implementation would be more memory friendly.

=back

=head1 SEE ALSO

=over 4

=item Locale::Language

ISO two-letter codes for identification of language (ISO 639).

=item Locale::Currency

ISO three-letter codes for identification of currencies
and funds (ISO 4217).

=item Locale::Country

ISO codes for identification of countries (ISO 3166).

=item ISO 15924

The ISO standard which defines these codes.

=item http://www.evertype.com/standards/iso15924/

Home page for ISO 15924.

=back


=head1 AUTHOR

Neil Bowers E<lt>neil@bowers.comE<gt>

=head1 COPYRIGHT

Copyright (c) 2002 Neil Bowers.

This module is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

=cut