If you read this file as is, just ignore the funny characters you
see. It is written in the POD format (see the perlpod manpage), which is
specially designed to be readable as is.

This documentation was originally written in Simplified Chinese, in the
EUC-CN encoding.

=head1 NAME

perlcn - Perl guide for Simplified Chinese

=head1 DESCRIPTION

Welcome to the world of Perl!

Starting with version 5.8.0, Perl has comprehensive Unicode support,
and with it support for many encodings beyond the Latin family; CJK
(Chinese, Japanese, Korean) is one of them. Unicode is an international
standard that aims to cover every character in the world: the Western
world, the Eastern world, and everything in between (Greek, Syriac,
Arabic, Hebrew, Indic, Native American, and so on). It also spans
multiple operating systems and platforms (such as PC and Macintosh).

Perl itself operates on Unicode. This means that string data inside Perl
can be represented in Unicode, and that Perl's functions and operators
(such as regular expression matching) also operate on Unicode. For input
and output, to handle data stored in pre-Unicode encodings, Perl provides
the Encode module, which lets you easily read and write data in legacy
encodings.
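
As a rough illustration of the read-and-write path described above, the
sketch below uses the decode() and encode() functions of the Encode module
to turn EUC-CN octets into a Perl character string and then into UTF-8.
The sample octets and the variable names are illustrative assumptions, not
part of the original text.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed sample input: the EUC-CN octets D6 D0 CE C4, which encode the
# two characters U+4E2D U+6587. Hex escapes keep this script pure ASCII.
my $bytes = "\xD6\xD0\xCE\xC4";

my $text = decode('euc-cn', $bytes);   # legacy octets -> character string
my $utf8 = encode('UTF-8', $text);     # character string -> UTF-8 octets

printf "%d EUC-CN bytes, %d characters, %d UTF-8 bytes\n",
    length($bytes), length($text), length($utf8);
# prints "4 EUC-CN bytes, 2 characters, 6 UTF-8 bytes"
```

The same conversion can be applied to whole files by opening the handles
with an encoding layer, for example '<:encoding(euc-cn)' for reading.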

The Encode extension supports the following Simplified Chinese encodings
('gb2312' means 'euc-cn'):

    euc-cn      Unix extended character set, commonly known as GB code
    gb2312-raw  The raw (low-bit) GB2312 character table
    gb12345     The raw Traditional Chinese encoding used in China
    iso-ir-165  GB2312 + GB6345 + GB8565 + additional characters
    cp936       Code Page 936; may also be specified as 'GBK' (extended GB)
    hz          7-bit escaped GB2312 encoding

For example, to convert a file from EUC-CN to Unicode, you only need to
type the following command:

    perl -Mencoding=euc-cn,STDOUT,utf8 -pe1 < file.euc-cn > file.utf8

Perl also ships with "piconv", a character conversion utility written
entirely in Perl. It is used as follows:

    piconv -f euc-cn -t utf8 < file.euc-cn > file.utf8
    piconv -f utf8 -t euc-cn < file.utf8 > file.euc-cn

In addition, with the encoding module you can easily write code that works
in units of characters, as shown below:

    #!/usr/bin/env perl
    # Parse string literals as euc-cn; standard input and output use euc-cn
    use encoding 'euc-cn', STDIN => 'euc-cn', STDOUT => 'euc-cn';
    print length("骆驼");            #  2 (double quotes mean characters)
    print length('骆驼');            #  4 (single quotes mean bytes)
    print index("谆谆教诲", "蛔唤"); # -1 (substring not found)
    print index('谆谆教诲', '蛔唤'); #  1 (starts at the second byte)

In the last example, the second byte of the first "谆" combines with the
first byte of the second "谆" to form the EUC-CN character "蛔"; the second
byte of the second "谆" then combines with the first byte of "教" to form
"唤". Matching by character rather than by byte avoids this false match,
which used to be a common problem when matching EUC-CN text.
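
The false match described above can also be reproduced without the
encoding pragma, which has been deprecated since Perl 5.18. The following
is a minimal sketch that assumes the text arrives as raw EUC-CN octets and
is decoded explicitly with Encode; the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# EUC-CN octets written as hex escapes so the script itself stays ASCII.
my $haystack = "\xD7\xBB\xD7\xBB\xBD\xCC\xBB\xE5";   # four two-byte characters
my $needle   = "\xBB\xD7\xBB\xBD";                   # two two-byte characters

# Byte-wise search: the needle straddles character boundaries and "matches".
my $byte_pos = index($haystack, $needle);

# Character-wise search: decode both sides first, then compare characters.
my $char_pos = index(decode('euc-cn', $haystack), decode('euc-cn', $needle));

print "bytes: $byte_pos, characters: $char_pos\n";
# prints "bytes: 1, characters: -1"
```

Decoding at the input boundary and searching on character strings is the
design choice that removes the spurious hit at byte offset 1.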

=head2 Additional Chinese encodings

If you need more Chinese encodings, you can download the Encode::HanExtra
module from CPAN (L<http://www.cpan.org/>). It currently provides the
following encoding:

    gb18030     Extended GB code, which includes Traditional Chinese

In addition, the Encode::HanConvert module provides two encodings for
converting between Simplified and Traditional Chinese:

    big5-simp   Big5 Traditional Chinese to and from Unicode Simplified Chinese
    gbk-trad    GBK Simplified Chinese to and from Unicode Traditional Chinese

To convert between GBK and Big5, see the b2g.pl and g2b.pl programs bundled
with that module, or use the following in your own program:

    use Encode::HanConvert;
    $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
    $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5

=head2 Further information

Please refer to the extensive documentation bundled with Perl
(unfortunately all written in English) to learn more about Perl and about
how to use Unicode.
However, external resources are quite plentiful:

=head2 Sites providing Perl resources

=over 4

=item L<http://www.perl.com/>

The Perl home page (maintained by O'Reilly)

=item L<http://www.cpan.org/>

The Comprehensive Perl Archive Network

=item L<http://lists.perl.org/>

A list of Perl mailing lists

=back

=head2 Sites for learning Perl

=over 4

=item L<http://www.oreilly.com.cn/html/perl.html>

O'Reilly Perl books in Simplified Chinese

=back

=head2 Perl user groups

=over 4

=item L<http://www.pm.org/groups/asia.shtml#China>

A list of Perl Mongers groups in China

=back

=head2 Unicode-related sites

=over 4

=item L<http://www.unicode.org/>

The Unicode Consortium (the makers of the Unicode standard)

=item L<http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux

=back

=head1 SEE ALSO

L<Encode>, L<Encode::CN>,
L<encoding>, L<perluniintro>, L<perlunicode>

=head1 AUTHORS

Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

Autrijus Tang (唐宗汉) E<lt>autrijus@autrijus.orgE<gt>

=cut