package PerlIO;

our $VERSION = '1.03';

# Map layer name to package that defines it
our %alias;

# Called as "use PerlIO 'layer', ...": load the package behind each layer
sub import
{
 my $class = shift;
 while (@_)
  {
   my $layer = shift;
   if (exists $alias{$layer})
    {
     $layer = $alias{$layer}
    }
   else
    {
     $layer = "${class}::$layer";
    }
   eval "require $layer";
   warn $@ if $@;
  }
}

# Flag bit reported by get_layers() for a layer carrying UTF-8 data
sub F_UTF8 () { 0x8000 }

1;
__END__

=head1 NAME

PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space

=head1 SYNOPSIS

  open($fh,"<:crlf", "my.txt"); # portably open a text file for reading

  open($fh,"<","his.jpg");      # portably open a binary file for reading
  binmode($fh);

  Shell:
    PERLIO=perlio perl ....

=head1 DESCRIPTION

When an undefined layer 'foo' is encountered in an C<open> or
C<binmode> layer specification, C code performs the equivalent of:

  use PerlIO 'foo';

The perl code in PerlIO.pm then attempts to locate a layer by doing

  require PerlIO::foo;

Otherwise the C<PerlIO> package is a placeholder for additional
PerlIO related functions.

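The on-demand loading described above can be observed from Perl code.
The following is a minimal sketch, not part of the module: it assumes
that the C<:encoding> layer is implemented by C<PerlIO::encoding> (as
the C<require PerlIO::foo> rule implies) and that nothing else has
loaded that module earlier.  The in-memory filehandle and the variable
names are illustrative choices only.

```perl
use strict;
use warnings;

# Nothing has asked for :encoding yet, so the module implementing it
# is normally not loaded at this point (assumption: no earlier loader).
my $loaded_before = exists $INC{'PerlIO/encoding.pm'} ? 1 : 0;

# Naming the layer in open() triggers the on-demand load.
my $data = "caf\xc3\xa9\n";    # UTF-8 bytes, five characters once decoded
open(my $fh, '<:encoding(UTF-8)', \$data) or die "open failed: $!";
my $line = <$fh>;
close($fh);

my $loaded_after = exists $INC{'PerlIO/encoding.pm'} ? 1 : 0;
print "PerlIO::encoding loaded before open: $loaded_before, ",
      "after open: $loaded_after\n";
```

Checking C<%INC> is only a convenient way to see that a C<require>
happened; it says nothing about how the C code performs the lookup.
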
The following layers are currently defined:

=over 4

=item :unix

Lowest level layer which provides basic PerlIO operations in terms of
UNIX/POSIX numeric file descriptor calls
(open(), read(), write(), lseek(), close()).

=item :stdio

Layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell> etc.  Note
that as this is "real" stdio it will ignore any layers beneath it and
go straight to the operating system via the C library as usual.

=item :perlio

A from-scratch implementation of buffering for PerlIO. Provides fast
access to the buffer for C<sv_gets> which implements perl's readline/E<lt>E<gt>
and in general attempts to minimize data copying.

C<:perlio> will insert a C<:unix> layer below itself to do low level IO.

=item :crlf

A layer that implements DOS/Windows like CRLF line endings.  On read
converts pairs of CR,LF to a single "\n" newline character.  On write
converts each "\n" to a CR,LF pair.  Note that at most one instance of
this layer is active in a stack: attempts to push it more than once
are silently ignored.

It currently does I<not> mimic MS-DOS in treating Control-Z
as an end-of-file marker.

(Gory details follow) To be more exact what happens is this: after
pushing itself to the stack, the C<:crlf> layer checks all the layers
below itself to find the first layer that is capable of being a CRLF
layer but is not yet enabled to be a CRLF layer.  If it finds such a
layer, it enables the CRLFness of that other deeper layer, and then
pops itself off the stack.  If not, the layer just pushed is used.

The end result is that a C<:crlf> means "please enable the first CRLF
layer you can find, and if you can't find one, here would be a good
spot to place a new one."

Based on the C<:perlio> layer.

=item :mmap

A layer which implements "reading" of files by using C<mmap()> to
make the (whole) file appear in the process's address space, and then
using that as PerlIO's "buffer". This I<may> be faster in certain
circumstances for large files, and may result in less physical memory
use when multiple processes are reading the same file.

Files which are not C<mmap()>-able revert to behaving like the C<:perlio>
layer. Writes also behave like the C<:perlio> layer, as C<mmap()> for write
needs extra house-keeping (to extend the file) which negates any advantage.

The C<:mmap> layer will not exist if the platform does not support C<mmap()>.

=item :utf8

Declares that the stream accepts perl's internal encoding of
characters.  (Which really is UTF-8 on ASCII machines, but is
UTF-EBCDIC on EBCDIC machines.)  This allows any character perl can
represent to be read from or written to the stream. The UTF-X encoding
is chosen to render simple text parts (i.e.  non-accented letters,
digits and common punctuation) human readable in the encoded file.

Here is how to write your native data out using UTF-8 (or UTF-EBCDIC)
and then read it back in.

    open(F, ">:utf8", "data.utf") or die "Cannot write data.utf: $!";
    print F $out;
    close(F);

    open(F, "<:utf8", "data.utf") or die "Cannot read data.utf: $!";
    $in = <F>;
    close(F);

=item :bytes

This is the inverse of the C<:utf8> layer. It turns off the flag
on the layer below so that data read from it is considered to
be "octets" i.e. characters in range 0..255 only. Likewise
on output perl will warn if a "wide" character is written
to such a stream.

=item :raw

The C<:raw> layer is I<defined> as being identical to calling
C<binmode($fh)> - the stream is made suitable for passing binary data
i.e. each byte is passed as-is. The stream will still be
buffered.

In Perl 5.6 and some books the C<:raw> layer (previously sometimes also
referred to as a "discipline") is documented as the inverse of the
C<:crlf> layer. That is no longer the case - other layers which would
alter the binary nature of the stream are also disabled.  If you want UNIX
line endings on a platform that normally does CRLF translation, but still
want UTF-8 or encoding defaults, the appropriate thing to do is to add
C<:perlio> to the PERLIO environment variable.

The implementation of C<:raw> is as a pseudo-layer which when "pushed"
pops itself and then any layers which do not declare themselves as suitable
for binary data. (Undoing :utf8 and :crlf is implemented by clearing
flags rather than popping layers but that is an implementation detail.)

As a consequence of the fact that C<:raw> normally pops layers,
it usually only makes sense to have it as the only or first element in
a layer specification.  When used as the first element it provides
a known base on which to build e.g.

    open($fh,"<:raw:utf8",...)

will construct a "binary" stream, but then enable UTF-8 translation.

=item :pop

A pseudo layer that removes the top-most layer. Gives perl code
a way to manipulate the layer stack. Should be considered
experimental. Note that C<:pop> only works on real layers
and will not undo the effects of pseudo layers like C<:utf8>.
An example of a possible use might be:

    open($fh,...)
    ...
    binmode($fh,":encoding(...)");  # next chunk is encoded
    ...
    binmode($fh,":pop");            # back to un-encoded

A more elegant (and safer) interface is needed.

=item :win32

On Win32 platforms this I<experimental> layer uses native "handle" IO
rather than the unix-like numeric file descriptor layer. Known to be
buggy as of perl 5.8.2.

=back

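The C<:crlf> translation described in the list above can be checked
with a short round trip.  This is an illustrative sketch only: the
temporary file (created with C<File::Temp>) and the variable names are
assumptions made for the example, not something the module prescribes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the round trip; removed automatically at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "close failed: $!";

# On write, :crlf turns each "\n" into a CR,LF pair.
open(my $out, '>:crlf', $path) or die "open for write failed: $!";
print {$out} "one\ntwo\n";
close($out) or die "close failed: $!";

# Reading through :raw shows the bytes actually stored in the file.
open(my $raw, '<:raw', $path) or die "open raw failed: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# On read, :crlf folds each CR,LF pair back into a single "\n".
open(my $in, '<:crlf', $path) or die "open for read failed: $!";
my @lines = <$in>;
close($in);

printf "stored %d bytes, read back %d lines\n", length($bytes), scalar @lines;
```

Reading the file once through C<:raw> and once through C<:crlf> keeps
the two views of the same data side by side.
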
=head2 Custom Layers

It is possible to write custom layers in addition to the above builtin
ones, both in C/XS and Perl.  Two such layers (and one example written
in Perl using the latter) come with the Perl distribution.

=over 4

=item :encoding

Use C<:encoding(ENCODING)> either in open() or binmode() to install
a layer that transparently does character set and encoding transformations,
for example from Shift-JIS to Unicode.  Note that under C<stdio>
an C<:encoding> also enables C<:utf8>.  See L<PerlIO::encoding>
for more information.

=item :via

Use C<:via(MODULE)> either in open() or binmode() to install a layer
that applies an arbitrary transformation (for example compression /
decompression, encryption / decryption) to the filehandle.
See L<PerlIO::via> for more information.

=back

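As a rough illustration of the C<:via> item above, here is a small
read-only layer written in Perl.  The package name
C<PerlIO::via::Upper> is hypothetical and exists only for this sketch;
the C<PUSHED> and C<FILL> method names are the ones documented in
L<PerlIO::via>, which remains the authoritative reference for the
interface.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical layer package: :via(Upper) resolves to PerlIO::via::Upper.
package PerlIO::via::Upper;

# Called when the layer is pushed; returns the object representing it.
sub PUSHED {
    my ($class, $mode, $fh) = @_;
    my $state = '';
    return bless \$state, $class;
}

# Called to refill the read buffer; $fh is the layer below this one.
# Returning undef signals end of file.
sub FILL {
    my ($self, $fh) = @_;
    my $line = <$fh>;
    return defined($line) ? uc($line) : undef;
}

package main;

# Scratch input file, removed automatically at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello layers\n";
close($tmp) or die "close failed: $!";

# Everything read through the layer arrives upper-cased.
open(my $in, '<:via(Upper)', $path) or die "open failed: $!";
my $text = <$in>;
close($in);

print $text;
```

The sketch defines the layer package in the same file for brevity; a
real layer would normally live in its own module file.
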
=head2 Alternatives to raw

To get a binary stream an alternate method is to use:

    open($fh,"whatever");
    binmode($fh);

This has the advantage of being backward compatible with how such things have
had to be coded on some platforms for years.

To get an unbuffered stream specify an unbuffered layer (e.g. C<:unix>)
in the open call:

    open($fh,"<:unix",$path)

=head2 Defaults and how to override them

If the platform is MS-DOS like and normally does CRLF to "\n"
translation for text files then the default layers are:

  unix crlf

(The low level "unix" layer may be replaced by a platform specific low
level layer.)

Otherwise if C<Configure> found out how to do "fast" IO using the system's
stdio, then the default layers are:

  unix stdio

Otherwise the default layers are:

  unix perlio

These defaults may change once perlio has been better tested and tuned.

The default can be overridden by setting the environment variable
PERLIO to a space separated list of layers (C<unix> or the platform low
level layer is always pushed first).

This can be used to see the effect of/bugs in the various layers e.g.

  cd .../perl/t
  PERLIO=stdio  ./perl harness
  PERLIO=perlio ./perl harness

For the various values of PERLIO see L<perlrun/PERLIO>.

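The effect of the PERLIO variable can also be inspected from a script.
The sketch below is an assumption-laden illustration, not part of the
distribution: it assumes a UNIX-like platform where the list form of a
piped C<open> is available, and the helper name C<layers_under> is
invented for the example.  The layers printed depend on the platform
and on how perl was built.

```perl
use strict;
use warnings;

# Run a child perl under the given PERLIO setting and return the layer
# names it reports for its own STDOUT.  $^X is the running perl binary.
sub layers_under {
    my ($setting) = @_;
    local $ENV{PERLIO} = $setting;    # seen by the child at startup
    open(my $child, '-|', $^X, '-e',
         'print join(" ", PerlIO::get_layers(STDOUT, output => 1))')
        or die "cannot run $^X: $!";
    my $layers = <$child>;
    close($child);
    return defined($layers) ? $layers : '';
}

print "PERLIO=$_ gives: ", layers_under($_), "\n" for qw(perlio stdio);
```

A child process is used because PERLIO is read when perl starts, so
changing C<%ENV> in the running script does not alter its own defaults.
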
=head2 Querying the layers of filehandles

The following returns the B<names> of the PerlIO layers on a filehandle.

   my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH".

The layers are returned in the order an open() or binmode() call would
use them.  Note that the "default stack" depends on the operating
system and on the Perl version, and both the compile-time and
runtime configurations of Perl.

The following table summarizes the default layers on UNIX-like and
DOS-like platforms, depending on the setting of C<$ENV{PERLIO}>:

 PERLIO     UNIX-like                   DOS-like

 unset / "" unix perlio / stdio [1]     unix crlf
 stdio      unix perlio / stdio [1]     stdio
 perlio     unix perlio                 unix perlio
 mmap       unix mmap                   unix mmap

 # [1] "stdio" if Configure found out how to do "fast stdio" (depends
 # on the stdio implementation) and in Perl 5.8, otherwise "unix perlio"

By default the layers from the input side of the filehandle are
returned; to get the output side use the optional C<output> argument:

   my @layers = PerlIO::get_layers($fh, output => 1);

(Usually the layers are identical on either side of a filehandle but
for example with sockets there may be differences, or if you have
been using the C<open> pragma.)

There is no set_layers(), nor does get_layers() return a tied array
mirroring the stack, or anything fancy like that.  This is not
accidental or unintentional.  The PerlIO layer stack is a bit more
complicated than just a stack (see for example the behaviour of C<:raw>).
You are supposed to use open() and binmode() to manipulate the stack.

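As a small sketch of that division of labour, the following changes
the stack with open() and binmode() and uses get_layers() only to look
at the result.  The temporary file and the variable names are
assumptions for the example; the exact layer names printed depend on
the platform and build, as noted above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Empty scratch file, removed automatically at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "close failed: $!";

# open() builds the stack, here with the :utf8 pseudo layer enabled.
open(my $fh, '<:utf8', $path) or die "open failed: $!";
my @with_utf8 = PerlIO::get_layers($fh);

# binmode() without a layer behaves like :raw and turns :utf8 off again.
binmode($fh);
my @after_binmode = PerlIO::get_layers($fh);
close($fh);

print "after open:    @with_utf8\n";
print "after binmode: @after_binmode\n";
```

Each call to get_layers() returns a plain snapshot, so the two lists
can be compared freely without affecting the filehandle.
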
B<Implementation details follow, please close your eyes.>

The arguments to layers are by default returned in parentheses after
the name of the layer, and certain layers (like C<utf8>) are not real
layers but instead flags on real layers: to get all of these returned
separately use the optional C<details> argument:

   my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1);

The result will be up to three times the number of layers:
the first element will be a name, the second element the arguments
(unspecified arguments will be C<undef>), the third element the flags,
the fourth element a name again, and so forth.

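One possible way to consume that flat list is to peel it off three
elements at a time.  This is a sketch under assumptions: the record
layout (a hash with C<name>, C<args> and C<utf8> keys) is invented for
the example, and the flag test relies on the C<F_UTF8> constant defined
in PerlIO.pm.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use PerlIO;    # makes the PerlIO::F_UTF8 constant available

# Empty scratch file, removed automatically at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "close failed: $!";

open(my $fh, '<:utf8', $path) or die "open failed: $!";
my @details = PerlIO::get_layers($fh, details => 1);
close($fh);

# Regroup the flat (name, arguments, flags) list into one record per layer.
my @records;
while (my ($name, $args, $flags) = splice(@details, 0, 3)) {
    push @records, {
        name => $name,
        args => $args,
        utf8 => ((($flags || 0) & PerlIO::F_UTF8()) ? 1 : 0),
    };
}

for my $layer (@records) {
    printf "%-8s args=%s utf8=%d\n",
        $layer->{name},
        (defined($layer->{args}) ? $layer->{args} : 'undef'),
        $layer->{utf8};
}
```

Because C<:utf8> is a flag rather than a real layer, it shows up here
in the C<utf8> field of a real layer instead of as a record of its own.
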
B<You may open your eyes now.>

=head1 AUTHOR

Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>

=head1 SEE ALSO

L<perlfunc/"binmode">, L<perlfunc/"open">, L<perlunicode>, L<perliol>,
L<Encode>

=cut