Lines Matching defs:Unicode

28 =head2 Unicode
30 =head3 Unicode Version 6.0 is now supported (mostly)
32 Perl comes with the Unicode 6.0 data base updated with
36 release. Perl does not support any Unicode provisional properties,
39 Unicode 6.0 has chosen to use the name C<BELL> for the character at U+1F514,
58 Unicode semantics. See L<feature/"the 'unicode_strings' feature">.
62 This feature avoids most forms of the "Unicode Bug" (see
63 L<perlunicode/The "Unicode Bug"> for details). If there is any
64 possibility that your code will process Unicode strings, you are
74 character names listed by Unicode, such as NBSP, SHY, LRO, ZWJ, etc.; all
81 Unicode has several I<named character sequences>, in which particular sequences
87 now know about every character in Unicode. In earlier releases of
100 Unicode character names. This made it impossible to create an alias for
107 of C<\N{I<NAME>}>, returning the string of characters whose Unicode
108 name is its parameter. It can handle Unicode named character
116 =head3 New warnings categories for problematic (non-)Unicode code points
122 C<nonchar> when Unicode non-character code points are encountered;
123 and C<non_unicode> when code points above the legal Unicode
130 without warnings, not just the code points that are legal in Unicode.
133 or executing a Unicode-defined operation such as upper-casing
139 Unicode non-characters, some of which previously were erroneously
140 considered illegal in places by Perl, contrary to the Unicode Standard,
142 works the same as with the non-legal Unicode code points, because the Unicode
145 =head3 Unicode database files not installed
147 The Unicode database files are no longer installed with Perl. This
201 case-insensitive matching uses Unicode semantics.
263 This synonym is added for symmetry with the Unicode property names
537 regular expression matching under Unicode rules. One example is
550 Unicode says that C<"ss"> is what C<SHARP S> matches under C</i>. So
575 For most Unicode properties, it doesn't make sense to have them match
581 could previously match non-ASCII characters because of the Unicode
588 Details are in L<perlrecharclass/Unicode Properties>.
595 =head3 \p{} implies Unicode semantics
597 Specifying a Unicode property in the pattern indicates
598 that the pattern is meant for matching according to Unicode rules, the way
857 Characters outside the Unicode "XIDStart" set are no longer allowed at the
990 This is because Unicode is using that name for a different character.
991 See L</Unicode Version 6.0 is now supported (mostly)> for more
1018 L<Unicode::Casing>, which provides improved functionality.
1250 The following modules were added by the L<Unicode::Collate>
1253 L<Unicode::Collate::CJK::Big5>
1255 L<Unicode::Collate::CJK::GB2312>
1257 L<Unicode::Collate::CJK::JISX0208>
1259 L<Unicode::Collate::CJK::Korean>
1261 L<Unicode::Collate::CJK::Pinyin>
1263 L<Unicode::Collate::CJK::Stroke>
1416 load its Unicode tables, so as to avoid the "BEGIN not safe after
1442 Unicode constants work once more. They have been broken since Perl 5.10.0
1579 Now, all 66 Unicode non-characters are treated the same way U+FFFF has
2086 L<Unicode::Collate> has been upgraded from version 0.52_01 to 0.73.
2088 L<Unicode::Collate> has been updated to use Unicode 6.0.0.
2090 L<Unicode::Collate::Locale> now supports a plethora of new locales: I<ar, be,
2096 L<Unicode::Collate::CJK::Big5> for C<zh__big5han> which makes
2099 L<Unicode::Collate::CJK::GB2312> for C<zh__gb2312han> which makes
2102 L<Unicode::Collate::CJK::JISX0208> which makes tailoring of 6355 kanji
2105 L<Unicode::Collate::CJK::Korean> which makes tailoring of CJK Unified Ideographs
2108 L<Unicode::Collate::CJK::Pinyin> for C<zh__pinyin> which makes
2111 L<Unicode::Collate::CJK::Stroke> for C<zh__stroke> which makes
2119 L<Unicode::Normalize> has been upgraded from version 1.03 to 1.10.
2123 L<Unicode::UCD> has been upgraded from version 0.27 to 0.32.
2125 A new function, Unicode::UCD::num(), has been added. This function
2128 definition of "safe", see L<Unicode::UCD/num()>.)
2140 It is now updated to Unicode Version 6.0.0 with I<Corrigendum #8>,
2305 conversions on Unicode data, and how to provide scoped changes to alter
2496 Performing an operation requiring Unicode semantics (such as case-folding)
2497 on a Unicode surrogate or a non-Unicode character now triggers this
3281 fundamentally broken model of how the Unicode non-character code points
3283 L<perlunicode/Non-character code points>. See also the Unicode section
3491 Matching a Unicode character against an alternation containing characters
3518 above 255 are treated as Unicode, but code points between 0 and 255
3612 The parser no longer hangs when encountering certain Unicode characters,
3803 =head2 Unicode
3809 What has become known as "the Unicode Bug" is almost completely resolved in
3823 L<Unicode::Casing> has been written to replace this feature without its
3830 L<perlunicode/The "Unicode Bug">.
3836 Handling of Unicode non-character code points has changed.
3838 place only one of the 66 of them was known. The Unicode Standard
3846 Case-insensitive C<"/i"> regular expression matching of Unicode
3873 Also, this matching doesn't fully conform to the current Unicode
3876 writing (April 2010), the Unicode Standard is currently in flux about