package PerlIO;

our $VERSION = '1.03';

# Map layer name to package that defines it
our %alias;

sub import
{
 my $class = shift;
 while (@_)
  {
   my $layer = shift;
   if (exists $alias{$layer})
    {
     $layer = $alias{$layer}
    }
   else
    {
     $layer = "${class}::$layer";
    }
   eval "require $layer";
   warn $@ if $@;
  }
}

sub F_UTF8 () { 0x8000 }

1;
__END__

=head1 NAME

PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space

=head1 SYNOPSIS

  open($fh,"<:crlf", "my.txt"); # portably open a text file for reading

  open($fh,"<","his.jpg");      # portably open a binary file for reading
  binmode($fh);

  Shell:
    PERLIO=perlio perl ....

=head1 DESCRIPTION

When an undefined layer 'foo' is encountered in an C<open> or
C<binmode> layer specification then C code performs the equivalent of:

  use PerlIO 'foo';

The perl code in PerlIO.pm then attempts to locate a layer by doing

  require PerlIO::foo;

Otherwise the C<PerlIO> package is a place holder for additional
PerlIO related functions.

The following layers are currently defined:

=over 4

=item :unix

Lowest level layer which provides basic PerlIO operations in terms of
UNIX/POSIX numeric file descriptor calls
(open(), read(), write(), lseek(), close()).

=item :stdio

Layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell> etc.  Note
that as this is "real" stdio it will ignore any layers beneath it and
go straight to the operating system via the C library as usual.

=item :perlio

A from-scratch implementation of buffering for PerlIO.  Provides fast
access to the buffer for C<sv_gets>, which implements perl's readline/E<lt>E<gt>,
and in general attempts to minimize data copying.

C<:perlio> will insert a C<:unix> layer below itself to do low level IO.

=item :crlf

A layer that implements DOS/Windows like CRLF line endings.  On read
converts pairs of CR,LF to a single "\n" newline character.  On write
converts each "\n" to a CR,LF pair.  Note that this layer likes to be
one of its kind: it silently ignores attempts to be pushed into the
layer stack more than once.

It currently does I<not> mimic MS-DOS in treating Control-Z
as an end-of-file marker.

(Gory details follow) To be more exact what happens is this: after
pushing itself to the stack, the C<:crlf> layer checks all the layers
below itself to find the first layer that is capable of being a CRLF
layer but is not yet enabled to be a CRLF layer.  If it finds such a
layer, it enables the CRLFness of that other deeper layer, and then
pops itself off the stack.  If not, fine, use the one we just pushed.

The end result is that a C<:crlf> means "please enable the first CRLF
layer you can find, and if you can't find one, here would be a good
spot to place a new one."

Based on the C<:perlio> layer.

=item :mmap

A layer which implements "reading" of files by using C<mmap()> to
make the (whole) file appear in the process's address space, and then
using that as PerlIO's "buffer".
This I<may> be faster in certain
circumstances for large files, and may result in less physical memory
use when multiple processes are reading the same file.

Files which are not C<mmap()>-able revert to behaving like the C<:perlio>
layer.  Writes also behave like the C<:perlio> layer as C<mmap()> for write
needs extra house-keeping (to extend the file) which negates any advantage.

The C<:mmap> layer will not exist if the platform does not support C<mmap()>.

=item :utf8

Declares that the stream accepts perl's internal encoding of
characters.  (Which really is UTF-8 on ASCII machines, but is
UTF-EBCDIC on EBCDIC machines.)  This allows any character perl can
represent to be read from or written to the stream.  The UTF-X encoding
is chosen to render simple text parts (i.e. non-accented letters,
digits and common punctuation) human readable in the encoded file.

Here is how to write your native data out using UTF-8 (or UTF-EBCDIC)
and then read it back in.

    # Write $out through the :utf8 layer
    open(F, ">:utf8", "data.utf") or die "Cannot write data.utf: $!";
    print F $out;
    close(F);

    # Read the first line back through the same layer
    open(F, "<:utf8", "data.utf") or die "Cannot read data.utf: $!";
    $in = <F>;
    close(F);

=item :bytes

This is the inverse of the C<:utf8> layer.
It turns off the C<:utf8> flag
on the layer below so that data read from it is considered to
be "octets" i.e. characters in range 0..255 only.  Likewise
on output perl will warn if a "wide" character is written
to such a stream.

=item :raw

The C<:raw> layer is I<defined> as being identical to calling
C<binmode($fh)> - the stream is made suitable for passing binary data
i.e. each byte is passed as-is.  The stream will still be
buffered.

In Perl 5.6 and some books the C<:raw> layer (previously sometimes also
referred to as a "discipline") is documented as the inverse of the
C<:crlf> layer.  That is no longer the case - other layers which would
alter the binary nature of the stream are also disabled.  If you want UNIX
line endings on a platform that normally does CRLF translation, but still
want UTF-8 or encoding defaults, the appropriate thing to do is to add
C<:perlio> to the PERLIO environment variable.

The implementation of C<:raw> is as a pseudo-layer which when "pushed"
pops itself and then any layers which do not declare themselves as suitable
for binary data.  (Undoing :utf8 and :crlf is implemented by clearing
flags rather than popping layers but that is an implementation detail.)

As a consequence of the fact that C<:raw> normally pops layers
it usually only makes sense to have it as the only or first element in
a layer specification.
When used as the first element it provides
a known base on which to build e.g.

    open($fh,":raw:utf8",...)

will construct a "binary" stream, but then enable UTF-8 translation.

=item :pop

A pseudo layer that removes the top-most layer.  Gives perl code
a way to manipulate the layer stack.  Should be considered
as experimental.  Note that C<:pop> only works on real layers
and will not undo the effects of pseudo layers like C<:utf8>.
An example of a possible use might be:

    open($fh,...)
    ...
    binmode($fh,":encoding(...)");  # next chunk is encoded
    ...
    binmode($fh,":pop");            # back to un-encoded

A more elegant (and safer) interface is needed.

=item :win32

On Win32 platforms this I<experimental> layer uses native "handle" IO
rather than the unix-like numeric file descriptor layer.  Known to be
buggy as of perl 5.8.2.

=back

=head2 Custom Layers

It is possible to write custom layers in addition to the above builtin
ones, both in C/XS and Perl.  Two such layers (and one example written
in Perl using the latter) come with the Perl distribution.

=over 4

=item :encoding

Use C<:encoding(ENCODING)> either in open() or binmode() to install
a layer that transparently does character set and encoding transformations,
for example from Shift-JIS to Unicode.  Note that under C<stdio>
an C<:encoding> also enables C<:utf8>.  See L<PerlIO::encoding>
for more information.

=item :via

Use C<:via(MODULE)> either in open() or binmode() to install a layer
that applies an arbitrary transformation (for example compression /
decompression, encryption / decryption) to the filehandle.
See L<PerlIO::via> for more information.

=back

=head2 Alternatives to raw

To get a binary stream an alternate method is to use:

    open($fh,"whatever");
    binmode($fh);

This has the advantage of being backward compatible with how such things have
had to be coded on some platforms for years.

To get an un-buffered stream specify an unbuffered layer (e.g.
C<:unix>)
in the open call:

    open($fh,"<:unix",$path)

=head2 Defaults and how to override them

If the platform is MS-DOS like and normally does CRLF to "\n"
translation for text files then the default layers are:

     unix crlf

(The low level "unix" layer may be replaced by a platform specific low
level layer.)

Otherwise if C<Configure> found out how to do "fast" IO using the system's
stdio, then the default layers are:

     unix stdio

Otherwise the default layers are:

     unix perlio

These defaults may change once perlio has been better tested and tuned.

The default can be overridden by setting the environment variable
PERLIO to a space separated list of layers (C<unix> or the platform low
level layer is always pushed first).

This can be used to see the effect of/bugs in the various layers e.g.

  cd .../perl/t
  PERLIO=stdio  ./perl harness
  PERLIO=perlio ./perl harness

For the various values of PERLIO see L<perlrun/PERLIO>.

=head2 Querying the layers of filehandles

The following returns the B<names> of the PerlIO layers on a filehandle.

   my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH".

The layers are returned in the order an open() or binmode() call would
use them.  Note that the "default stack" depends on the operating
system and on the Perl version, and both the compile-time and
runtime configurations of Perl.

The following table summarizes the default layers on UNIX-like and
DOS-like platforms, depending on the setting of C<$ENV{PERLIO}>:

 PERLIO     UNIX-like                DOS-like

 unset / "" unix perlio / stdio [1]  unix crlf
 stdio      unix perlio / stdio [1]  stdio
 perlio     unix perlio              unix perlio
 mmap       unix mmap                unix mmap

 # [1] "stdio" if Configure found out how to do "fast stdio" (depends
 # on the stdio implementation) and in Perl 5.8, otherwise "unix perlio"

By default the layers from the input side of the filehandle are
returned; to get the output side use the optional C<output> argument:

   my @layers = PerlIO::get_layers($fh, output => 1);

(Usually the layers are identical on either side of a filehandle but
for example with sockets there may be differences, or if you have
been using the C<open> pragma.)

There is no set_layers(), nor does get_layers() return a tied array
mirroring the stack, or anything fancy like that.
This is not
accidental or unintentional.  The PerlIO layer stack is a bit more
complicated than just a stack (see for example the behaviour of C<:raw>).
You are supposed to use open() and binmode() to manipulate the stack.

B<Implementation details follow, please close your eyes.>

The arguments to layers are by default returned in parentheses after
the name of the layer, and certain layers (like C<utf8>) are not real
layers but instead flags on real layers: to get all of these returned
separately use the optional C<details> argument:

   my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1);

The result will be up to three times the number of layers:
the first element will be a name, the second element the arguments
(unspecified arguments will be C<undef>), the third element the flags,
the fourth element a name again, and so forth.

B<You may open your eyes now.>

=head1 AUTHOR

Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>

=head1 SEE ALSO

L<perlfunc/"binmode">, L<perlfunc/"open">, L<perlunicode>, L<perliol>,
L<Encode>

=cut
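
=head1 EXAMPLES

The flat list returned by C<PerlIO::get_layers($fh, details =E<gt> 1)>, as
described under L</Querying the layers of filehandles>, can be walked three
elements at a time.  What follows is only a minimal sketch and not part of
the module itself.  It assumes a perl built with PerlIO; opening the running
script (C<$0>) is an arbitrary choice made so the example needs no other
file, and the names C<@names>, C<@details> and C<@triples> are hypothetical.
The layers actually reported depend on the platform and on C<PERLIO>, so no
particular output is claimed.

    use strict;
    use warnings;
    use PerlIO ();   # loads PerlIO.pm so that PerlIO::F_UTF8() is defined

    # Open this script itself with an explicit :utf8 layer.
    open(my $fh, "<:utf8", $0) or die "Cannot open $0: $!";

    # Plain form: layer names, in the order open()/binmode() would use them.
    my @names = PerlIO::get_layers($fh);
    print "layers: @names\n";

    # Detailed form: a flat list of (name, arguments, flags) triples.
    my @details = PerlIO::get_layers($fh, details => 1);
    my @triples;
    while (my ($name, $args, $flags) = splice(@details, 0, 3)) {
        $flags = 0 unless defined $flags;
        push @triples, [$name, $args, $flags];
        printf "%-8s args=%-6s flags=0x%08x utf8=%s\n",
            $name,
            defined $args ? $args : 'undef',
            $flags,
            ($flags & PerlIO::F_UTF8()) ? 'yes' : 'no';
    }
    close($fh);

In the detailed form C<utf8> is not expected to appear as a name of its own;
it should show up as the C<F_UTF8> bit in the flags of a real layer, which
is why the sketch tests each triple's flags rather than looking for a name.

=cut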