package bytes;

# Bit in $^H (the compile-time hints variable) that selects byte semantics.
$bytes::hint_bits = 0x00000008;

# "use bytes": set the hint bit for the enclosing lexical scope.
sub import {
    $^H |= $bytes::hint_bits;
}

# "no bytes": clear the hint bit again.
sub unimport {
    $^H &= ~$bytes::hint_bits;
}

# Load the function implementations on first use, then jump to the
# function that was originally called.
sub AUTOLOAD {
    require "bytes_heavy.pl";
    goto &$AUTOLOAD;
}

# Forward declaration with prototype.
sub length ($);

1;
__END__

=head1 NAME

bytes - Perl pragma to force byte semantics rather than character semantics

=head1 SYNOPSIS

    use bytes;
    no bytes;

=head1 DESCRIPTION

WARNING: The implementation of Unicode support in Perl is incomplete.
See L<perlunicode> for the exact details.

The C<use bytes> pragma disables character semantics for the rest of the
lexical scope in which it appears.  C<no bytes> can be used to reverse
the effect of C<use bytes> within the current lexical scope.

Perl normally assumes character semantics in the presence of character
data (i.e. data that has come from a source that has been marked as
being of a particular character encoding).  When C<use bytes> is in
effect, the encoding is temporarily ignored, and each string is treated
as a series of bytes.

As an example, when Perl sees C<$x = chr(400)>, it encodes the character
in UTF-8 and stores it in C<$x>.  The string is then marked as character
data, so, for instance, C<length $x> returns C<1>.  However, in the scope
of the C<bytes> pragma, C<$x> is treated as a series of bytes - the bytes
that make up the UTF-8 encoding - and C<length $x> returns C<2>:

    $x = chr(400);
    print "Length is ", length $x, "\n";     # "Length is 1"
    printf "Contents are %vd\n", $x;         # "Contents are 400"
    {
        use bytes;
        print "Length is ", length $x, "\n"; # "Length is 2"
        printf "Contents are %vd\n", $x;     # "Contents are 198.144"
    }

For more on the implications and differences between character
semantics and byte semantics, see L<perlunicode>.

=head1 SEE ALSO

L<perlunicode>, L<utf8>

=cut