package PerlIO;

our $VERSION = '1.02';

# Map layer name to package that defines it
our %alias;

# Called for "use PerlIO 'foo'": load the package implementing each named layer.
sub import
{
    my $class = shift;
    while (@_)
    {
        my $layer = shift;
        if (exists $alias{$layer})
        {
            # An explicit alias names the implementing package directly.
            $layer = $alias{$layer};
        }
        else
        {
            # Otherwise assume the layer lives in PerlIO::<layer>.
            $layer = "${class}::$layer";
        }
        eval "require $layer";
        warn $@ if $@;
    }
}

# Flag bit set on a layer whose data is treated as UTF-8 (see :utf8).
sub F_UTF8 () { 0x8000 }

1;
__END__

=head1 NAME

PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space

=head1 SYNOPSIS

  open($fh,"<:crlf", "my.txt"); # portably open a text file for reading

  open($fh,"<","his.jpg");      # portably open a binary file for reading
  binmode($fh);

  Shell:
    PERLIO=perlio perl ....

=head1 DESCRIPTION

When an undefined layer 'foo' is encountered in an C<open> or
C<binmode> layer specification then C code performs the equivalent of:

  use PerlIO 'foo';

The perl code in PerlIO.pm then attempts to locate a layer by doing

  require PerlIO::foo;

Otherwise the C<PerlIO> package is a place holder for additional
PerlIO related functions.

The following layers are currently defined:

=over 4

=item unix

Low level layer which calls C<read>, C<write> and C<lseek> etc.

=item stdio

Layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell> etc.  Note
that as this is "real" stdio it will ignore any layers beneath it and
go straight to the operating system via the C library as usual.

=item perlio

This is a re-implementation of "stdio-like" buffering written as a
PerlIO "layer".  As such it will call whatever layer is below it for
its operations.

=item crlf

A layer which does CRLF to "\n" translation distinguishing "text" and
"binary" files in the manner of MS-DOS and similar operating systems.
(It currently does I<not> mimic MS-DOS in treating Control-Z as an
end-of-file marker.)

=item utf8

Declares that the stream accepts perl's internal encoding of
characters.  (Which really is UTF-8 on ASCII machines, but is
UTF-EBCDIC on EBCDIC machines.)  This allows any character perl can
represent to be read from or written to the stream.  The UTF-X encoding
is chosen to render simple text parts (i.e. non-accented letters,
digits and common punctuation) human readable in the encoded file.

Here is how to write your native data out using UTF-8 (or UTF-EBCDIC)
and then read it back in.

    open(F, ">:utf8", "data.utf") or die "Cannot write data.utf: $!";
    print F $out;
    close(F);

    open(F, "<:utf8", "data.utf") or die "Cannot read data.utf: $!";
    $in = <F>;
    close(F);

=item bytes

This is the inverse of the C<:utf8> layer.  It turns off the flag
on the layer below so that data read from it is considered to
be "octets", i.e. characters in the range 0..255 only.  Likewise
on output perl will warn if a "wide" character is written
to such a stream.

=item raw

The C<:raw> layer is I<defined> as being identical to calling
C<binmode($fh)> - the stream is made suitable for passing binary data,
i.e. each byte is passed as-is.  The stream will still be
buffered.  Unlike in earlier versions of Perl, C<:raw> is I<not>
just the inverse of C<:crlf> - other layers which would affect the
binary nature of the stream are also removed or disabled.

The implementation of C<:raw> is as a pseudo-layer which when "pushed"
pops itself and then any layers which do not declare themselves as suitable
for binary data.  (Undoing C<:utf8> and C<:crlf> is implemented by clearing
flags rather than popping layers, but that is an implementation detail.)

As a consequence of the fact that C<:raw> normally pops layers,
it usually only makes sense to have it as the only or first element in
a layer specification.  When used as the first element it provides
a known base on which to build, e.g.

    open($fh,"<:raw:utf8",...)

will construct a "binary" stream, but then enable UTF-8 translation.

=item pop

A pseudo layer that removes the top-most layer.  Gives perl code
a way to manipulate the layer stack.  Should be considered
experimental.  Note that C<:pop> only works on real layers
and will not undo the effects of pseudo layers like C<:utf8>.
An example of a possible use might be:

    open($fh,...)
    ...
    binmode($fh,":encoding(...)");  # next chunk is encoded
    ...
    binmode($fh,":pop");            # back to un-encoded

A more elegant (and safer) interface is needed.

=back

=head2 Custom Layers

It is possible to write custom layers in addition to the above built-in
ones, both in C/XS and Perl.  Two such layers (and one example written
in Perl using the latter) come with the Perl distribution.

=over 4

=item :encoding

Use C<:encoding(ENCODING)> either in open() or binmode() to install
a layer that transparently does character set and encoding transformations,
for example from Shift-JIS to Unicode.  Note that under C<stdio>
an C<:encoding> also enables C<:utf8>.  See L<PerlIO::encoding>
for more information.

=item :via

Use C<:via(MODULE)> either in open() or binmode() to install a layer
that applies an arbitrary transformation (for example compression /
decompression, encryption / decryption) to the filehandle.
See L<PerlIO::via> for more information.

=back

=head2 Alternatives to raw

To get a binary stream an alternate method is to use:

    open($fh,"whatever");
    binmode($fh);

This has the advantage of being backward compatible with how such things
have had to be coded on some platforms for years.

To get an un-buffered stream specify an unbuffered layer (e.g.
C<:unix>) in the open call:

    open($fh,"<:unix",$path);

=head2 Defaults and how to override them

If the platform is MS-DOS like and normally does CRLF to "\n"
translation for text files then the default layers are:

    unix crlf

(The low level "unix" layer may be replaced by a platform specific low
level layer.)

Otherwise if C<Configure> found out how to do "fast" IO using the system's
stdio, then the default layers are:

    unix stdio

Otherwise the default layers are:

    unix perlio

These defaults may change once perlio has been better tested and tuned.

The default can be overridden by setting the environment variable
PERLIO to a space separated list of layers (C<unix> or the platform low
level layer is always pushed first).

This can be used to see the effect of, or bugs in, the various layers, e.g.

    cd .../perl/t
    PERLIO=stdio  ./perl harness
    PERLIO=perlio ./perl harness

For the possible values of PERLIO see L<perlrun/PERLIO>.

=head2 Querying the layers of filehandles

The following returns the B<names> of the PerlIO layers on a filehandle.

    my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH".

The layers are returned in the order an open() or binmode() call would
use them.  Note that the "default stack" depends on the operating
system and on the Perl version, and both the compile-time and
runtime configurations of Perl.
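The query described above can be tried with a short script.  This is
only a minimal sketch: the scratch file created with C<File::Temp> and
the variable names are assumptions made for illustration, and the exact
list printed depends on the platform and on how Perl was configured
(a UNIX-like build may show something like C<unix perlio utf8>).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not depend on existing paths
# (the file itself is only an illustration).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Reopen it with an explicit :utf8 layer on top of the platform defaults.
open(my $fh, ">:utf8", $path) or die "Cannot open $path: $!";

# Ask for the layer names on the input side of the handle.
my @layers = PerlIO::get_layers($fh);
print "layers: @layers\n";

close $fh;
```

Because C<:utf8> is a flag rather than a real layer, it shows up in this
list as a name following the layer that carries the flag.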

The following table summarizes the default layers on UNIX-like and
DOS-like platforms, depending on the setting of C<$ENV{PERLIO}>:

 PERLIO     UNIX-like                DOS-like

 unset / "" unix perlio / stdio [1]  unix crlf
 stdio      unix perlio / stdio [1]  stdio
 perlio     unix perlio              unix perlio
 mmap       unix mmap                unix mmap

 # [1] "stdio" if Configure found out how to do "fast stdio" (depends
 # on the stdio implementation) and in Perl 5.8, otherwise "unix perlio"

By default the layers from the input side of the filehandle are
returned; to get the output side use the optional C<output> argument:

    my @layers = PerlIO::get_layers($fh, output => 1);

(Usually the layers are identical on either side of a filehandle but
for example with sockets there may be differences, or if you have
been using the C<open> pragma.)

There is no set_layers(), nor does get_layers() return a tied array
mirroring the stack, or anything fancy like that.  This is not
accidental or unintentional.  The PerlIO layer stack is a bit more
complicated than just a stack (see for example the behaviour of C<:raw>).
You are supposed to use open() and binmode() to manipulate the stack.

B<Implementation details follow, please close your eyes.>

The arguments to layers are by default returned in parentheses after
the name of the layer, and certain layers (like C<utf8>) are not real
layers but instead flags on real layers.  To get all of these returned
separately use the optional C<details> argument:

    my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1);

The result will be up to three times the number of layers:
the first element will be a name, the second element the arguments
(unspecified arguments will be C<undef>), the third element the flags,
the fourth element a name again, and so forth.
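The name/arguments/flags triples can be walked three elements at a
time.  The following is a sketch under stated assumptions rather than a
definitive recipe: the scratch file and variable names are made up for
illustration, and testing the flags against C<PerlIO::F_UTF8> (the
constant defined at the top of this module) is one plausible way to
spot the C<:utf8> flag.

```perl
use strict;
use warnings;
use PerlIO;                      # loads PerlIO.pm, which defines F_UTF8
use File::Temp qw(tempfile);

# Scratch file used only for illustration.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open(my $fh, "<:utf8", $path) or die "Cannot open $path: $!";

# With details => 1 each layer contributes a name, its arguments, and its flags.
my @details = PerlIO::get_layers($fh, details => 1);

for (my $i = 0; $i < @details; $i += 3) {
    my ($name, $args, $flags) = @details[$i .. $i + 2];
    $flags = 0 unless defined $flags;
    printf "%-10s args=%-10s flags=0x%x utf8=%s\n",
        $name,
        defined $args ? $args : 'undef',
        $flags,
        ($flags & PerlIO::F_UTF8()) ? 'yes' : 'no';
}

close $fh;
```

The flag values themselves are an implementation detail and may differ
between Perl versions, so treat the printed numbers as diagnostic output
only.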

B<You may open your eyes now.>

=head1 AUTHOR

Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>

=head1 SEE ALSO

L<perlfunc/"binmode">, L<perlfunc/"open">, L<perlunicode>, L<perliol>,
L<Encode>

=cut
