=head1 NAME

perlopentut - tutorial on opening things in Perl

=head1 DESCRIPTION

Perl has two simple, built-in ways to open files: the shell way for
convenience, and the C way for precision.  The shell way also has 2- and
3-argument forms, which have different semantics for handling the filename.
The choice is yours.

=head1 Open E<agrave> la shell

Perl's C<open> function was designed to mimic the way command-line
redirection in the shell works.  Here are some basic examples
from the shell:

    $ myprogram file1 file2 file3
    $ myprogram    <  inputfile
    $ myprogram    >  outputfile
    $ myprogram    >> outputfile
    $ myprogram    |  otherprogram
    $ otherprogram |  myprogram

And here are some more advanced examples:

    $ otherprogram      | myprogram f1 - f2
    $ otherprogram 2>&1 | myprogram -
    $ myprogram     <&3
    $ myprogram     >&4

Programmers accustomed to constructs like those above can take comfort
in learning that Perl directly supports these familiar constructs using
virtually the same syntax as the shell.

=head2 Simple Opens

The C<open> function takes two arguments: the first is a filehandle,
and the second is a single string comprising both what to open and how
to open it.  C<open> returns true when it works, and when it fails,
returns a false value and sets the special variable C<$!> to reflect
the system error.  If the filehandle was previously opened, it will
be implicitly closed first.

For example:

    open(INFO,      "datafile") || die("can't open datafile: $!");
    open(INFO,   "<  datafile") || die("can't open datafile: $!");
    open(RESULTS,">  runstats") || die("can't open runstats: $!");
    open(LOG,    ">> logfile ") || die("can't open logfile:  $!");

If you prefer the low-punctuation version, you could write that this way:

    open INFO,   "<  datafile"  or die "can't open datafile: $!";
    open RESULTS,">  runstats"  or die "can't open runstats: $!";
    open LOG,    ">> logfile "  or die "can't open logfile:  $!";

A few things to notice.  First, the leading less-than is optional.
If omitted, Perl assumes that you want to open the file for reading.

Note also that the first example uses the C<||> logical operator, and the
second uses C<or>, which has lower precedence.  Using C<||> in the latter
examples would effectively mean

    open INFO, ( "<  datafile"  || die "can't open datafile: $!" );

which is definitely not what you want.

The other important thing to notice is that, just as in the shell,
any white space before or after the filename is ignored.  This is good,
because you wouldn't want these to do different things:

    open INFO,   "<datafile";
    open INFO,   "< datafile";
    open INFO,   "<  datafile";

Ignoring surrounding whitespace also helps when you read a filename
in from a different file, and forget to trim it before opening:

    $filename = <INFO>;         # oops, \n still there
    open(EXTRA, "< $filename") || die "can't open $filename: $!";

This is not a bug, but a feature.  Because C<open> mimics the shell in
its style of using redirection arrows to specify how to open the file, it
also does so with respect to extra white space around the filename
itself.  For accessing files with naughty names, see
L<"Dispelling the Dweomer">.

There is also a 3-argument version of C<open>, which lets you put the
special redirection characters into their own argument:

    open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";

In this case, the filename to open is the actual string in C<$datafile>,
so you don't have to worry about C<$datafile> containing characters
that might influence the open mode, or whitespace at the beginning of
the filename that would be absorbed in the 2-argument version.  Also,
any reduction of unnecessary string interpolation is a good thing.
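
As a minimal sketch of that difference, the following creates and then
re-reads a file whose name ends in a space.  The scratch directory and the
file name are made up for illustration, and File::Temp is assumed to be
available (it ships with Perl 5.8).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical file name with a trailing space, inside a scratch directory.
my $dir  = tempdir(CLEANUP => 1);
my $name = "$dir/data ";

# With the mode in its own argument, the name is used exactly as given.
open(OUT, ">", $name)   || die "can't create '$name': $!";
print OUT "hello\n";
close(OUT)              || die "can't close '$name': $!";

open(IN, "<", $name)    || die "can't open '$name': $!";
my $line = <IN>;
close(IN);

print "read back: $line";
```

With the 2-argument form, the trailing space would have been trimmed and a
differently named file would have been created.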

=head2 Indirect Filehandles

C<open>'s first argument can be a reference to a filehandle.  As of
perl 5.6.0, if the argument is uninitialized, Perl will automatically
create a filehandle and put a reference to it in the first argument,
like so:

    open( my $in, $infile )   or die "Couldn't read $infile: $!";
    while ( <$in> ) {
        # do something with $_
    }
    close $in;

Indirect filehandles make namespace management easier.  Since filehandles
are global to the current package, two subroutines trying to open
C<INFILE> will clash.  With two functions opening indirect filehandles
like C<my $infile>, there's no clash and no need to worry about future
conflicts.

Another convenient behavior is that an indirect filehandle automatically
closes when it goes out of scope or when you undefine it:

    sub firstline {
        open( my $in, shift ) && return scalar <$in>;
        # no close() required
    }

=head2 Pipe Opens

In C, when you want to open a file using the standard I/O library,
you use the C<fopen> function, but when opening a pipe, you use the
C<popen> function.  But in the shell, you just use a different redirection
character.  That's also the case for Perl.  The C<open> call
remains the same--just its argument differs.

If the leading character is a pipe symbol, C<open> starts up a new
command and opens a write-only filehandle leading into that command.
This lets you write into that handle and have what you write show up on
that command's standard input.  For example:

    open(PRINTER, "| lpr -Plp1")    || die "can't run lpr: $!";
    print PRINTER "stuff\n";
    close(PRINTER)                  || die "can't close lpr: $!";

If the trailing character is a pipe, you start up a new command and open a
read-only filehandle leading out of that command.  This lets whatever that
command writes to its standard output show up on your handle for reading.
For example:

    open(NET, "netstat -i -n |")    || die "can't fork netstat: $!";
    while (<NET>) { }               # do something with input
    close(NET)                      || die "can't close netstat: $!";

What happens if you try to open a pipe to or from a non-existent
command?  If possible, Perl will detect the failure and set C<$!> as
usual.  But if the command contains special shell characters, such as
C<E<gt>> or C<*>, called 'metacharacters', Perl does not execute the
command directly.  Instead, Perl runs the shell, which then tries to
run the command.  This means that it's the shell that gets the error
indication.  In such a case, the C<open> call will only indicate
failure if Perl can't even run the shell.  See L<perlfaq8/"How can I
capture STDERR from an external command?"> to see how to cope with
this.  There's also an explanation in L<perlipc>.
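
Here is a rough sketch of how such a failure can surface, assuming a POSIX
shell and a deliberately made-up command name (C<no_such_command_xyz>).
Because the redirection is a shell metacharacter, Perl hands the command to
the shell, so C<open> itself succeeds and the problem only shows up in the
status that C<close> leaves in C<$?>.

```perl
use strict;
use warnings;

# Made-up command name; "2>/dev/null" forces Perl to go through the shell.
my $cmd = "no_such_command_xyz 2>/dev/null";

my $opened = open(CMD, "$cmd |");
die "can't even run the shell: $!" unless $opened;

my @output = <CMD>;             # nothing arrives: the command never ran
my $closed = close(CMD);        # false, because the shell exited non-zero
my $status = $? >> 8;           # the shell's exit code

print "open succeeded, close ", ($closed ? "succeeded" : "failed"),
      ", exit status $status\n";
```

Checking the return value of C<close> on a pipe handle is therefore just as
important as checking C<open>.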

If you would like to open a bidirectional pipe, the IPC::Open2
library will handle this for you.  Check out
L<perlipc/"Bidirectional Communication with Another Process">.

=head2 The Minus File

Again following the lead of the standard shell utilities, Perl's
C<open> function treats a file whose name is a single minus, "-", in a
special way.  If you open minus for reading, it really means to access
the standard input.  If you open minus for writing, it really means to
access the standard output.

If minus can be used as the default input or default output, what happens
if you open a pipe into or out of minus?  What's the default command it
would run?  The same script as you're currently running!  This is actually
a stealth C<fork> hidden inside an C<open> call.  See
L<perlipc/"Safe Pipe Opens"> for details.
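
A minimal sketch of that implicit fork, assuming a platform where C<fork>
is available: opening C<"-|"> returns the child's process ID in the parent
and 0 in the child, and whatever the child prints to its standard output
can be read from the handle in the parent.  The handle name C<KID> is only
illustrative.

```perl
use strict;
use warnings;

my $pid = open(KID, "-|");          # forks; the child's STDOUT feeds KID
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the same script, continuing from the open.
    print "hello from the child\n";
    exit 0;
}

# Parent: read what the child wrote, then reap it with close.
my @lines = <KID>;
close(KID) || die "child exited with status $?";
print "parent read: $lines[0]";
```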

=head2 Mixing Reads and Writes

It is possible to specify both read and write access.  All you do is
add a "+" symbol in front of the redirection.  But as in the shell,
using a less-than on a file never creates a new file; it only opens an
existing one.  On the other hand, using a greater-than always clobbers
(truncates to zero length) an existing file, or creates a brand-new one
if there isn't an old one.  Adding a "+" for read-write doesn't affect
whether it only works on existing files or always clobbers existing ones.

    open(WTMP, "+< /usr/adm/wtmp")
        || die "can't open /usr/adm/wtmp: $!";

    open(SCREEN, "+> lkscreen")
        || die "can't open lkscreen: $!";

    open(LOGFILE, "+>> /var/log/applog")
        || die "can't open /var/log/applog: $!";

The first one won't create a new file, and the second one will always
clobber an old one.  The third one will create a new file if necessary
and not clobber an old one, and it will allow you to read at any point
in the file, but all writes will always go to the end.  In short,
the first case is substantially more common than the second and third
cases, which are almost always wrong.  (If you know C, the plus in
Perl's C<open> is historically derived from the one in C's fopen(3S),
which it ultimately calls.)
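
The first case can be sketched as follows, using a scratch file from
File::Temp in place of a real data file (the handle names and contents are
made up, and the temporary path is assumed to contain no leading or
trailing white space).  Note the C<seek> between reading and writing, which
standard I/O requires when switching direction on an update handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file standing in for an existing data file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "hello world\n";
close($tmp)                 || die "can't close $path: $!";

# "+<" opens for update: the file must exist and is not truncated.
open(UPDATE, "+< $path")    || die "can't open $path: $!";
my $before = <UPDATE>;              # read the current contents
seek(UPDATE, 0, 0)          || die "can't rewind $path: $!";
print UPDATE "HELLO";               # overwrite the first five bytes
close(UPDATE)               || die "can't close $path: $!";

open(CHECK, "< $path")      || die "can't reopen $path: $!";
my $after = <CHECK>;
close(CHECK);
print "before: $before", "after:  $after";
```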

In fact, when it comes to updating a file, unless you're working on
a binary file as in the WTMP case above, you probably don't want to
use this approach for updating.  Instead, Perl's B<-i> flag comes to
the rescue.  The following command takes all the C, C++, or yacc source
or header files and changes all their foo's to bar's, leaving
the old version in the original filename with a ".orig" tacked
on the end:

    $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]

This is a short cut for some renaming games that are really
the best way to update textfiles.  See the second question in
L<perlfaq5> for more details.

=head2 Filters

One of the most common uses for C<open> is one you never
even notice.  When you process the ARGV filehandle using
C<< <ARGV> >>, Perl actually does an implicit open
on each file in @ARGV.  Thus a program called like this:

    $ myprogram file1 file2 file3

can have all its files opened and processed one at a time
using a construct no more complex than:

    while (<>) {
        # do something with $_
    }

If @ARGV is empty when the loop first begins, Perl pretends you've opened
up minus, that is, the standard input.  In fact, $ARGV, the currently
open file during C<< <ARGV> >> processing, is even set to "-"
in these circumstances.

You are welcome to pre-process your @ARGV before starting the loop to
make sure it's to your liking.  One reason to do this might be to remove
command options beginning with a minus.  While you can always roll the
simple ones by hand, the Getopt modules are good for this:

    use Getopt::Std;

    # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
    getopts("vDo:");

    # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
    getopts("vDo:", \%args);

Or the standard Getopt::Long module to permit named arguments:

    use Getopt::Long;
    GetOptions( "verbose"  => \$verbose,        # --verbose
                "Debug"    => \$debug,          # --Debug
                "output=s" => \$output );
            # --output=somestring or --output somestring

Another reason for preprocessing arguments is to make an empty
argument list default to all files:

    @ARGV = glob("*") unless @ARGV;

You could even filter out all but plain, text files.  This is a bit
silent, of course, and you might prefer to mention them on the way.

    @ARGV = grep { -f && -T } @ARGV;

If you're using the B<-n> or B<-p> command-line options, you
should put changes to @ARGV in a C<BEGIN{}> block.

Remember that a normal C<open> has special properties, in that it might
call fopen(3S) or it might call popen(3S), depending on what its
argument looks like; that's why it's sometimes called "magic open".
Here's an example:

    $pwdinfo = `domainname` =~ /^(\(none\))?$/
                    ? '< /etc/passwd'
                    : 'ypcat passwd |';

    open(PWD, $pwdinfo)
                or die "can't open $pwdinfo: $!";

This sort of thing also comes into play in filter processing.  Because
C<< <ARGV> >> processing employs the normal, shell-style Perl C<open>,
it respects all the special things we've already seen:

    $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

That program will read from the file F<f1>, the process F<cmd1>, standard
input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command,
and finally the F<f3> file.
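
The same effect can be sketched from inside a script by filling in @ARGV
by hand.  This example assumes an C<echo> program is on the path, and the
scratch file and its contents are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file to play the part of an ordinary file argument.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "line from a file\n";
close($tmp) || die "can't close $path: $!";

# Pretend the program was invoked with a file name and a command.
@ARGV = ($path, "echo line from a command |");

my @seen;
while (<>) {            # magic open: a plain file first, then a pipe
    chomp;
    push @seen, $_;
}
print "$_\n" for @seen;
```

The loop body never needs to know which of its inputs was a file and which
was a process.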

Yes, this also means that if you have files named "-" (and so on) in
your directory, they won't be processed as literal files by C<open>.
You'll need to pass them as "./-", much as you would for the I<rm> program,
or you could use C<sysopen> as described below.

One of the more interesting applications is to change files of a certain
name into pipes.  For example, to autoprocess gzipped or compressed
files by decompressing them with I<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_  } @ARGV;

Or, if you have the I<GET> program installed from LWP,
you can fetch URLs before processing them:

    @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;

It's not for nothing that this is called magic C<< <ARGV> >>.
Pretty nifty, eh?

=head1 Open E<agrave> la C

If you want the convenience of the shell, then Perl's C<open> is
definitely the way to go.  On the other hand, if you want finer precision
than C's simplistic fopen(3S) provides, you should look to Perl's
C<sysopen>, which is a direct hook into the open(2) system call.
That does mean it's a bit more involved, but that's the price of
precision.

C<sysopen> takes 3 (or 4) arguments.

    sysopen HANDLE, PATH, FLAGS, [MASK]

The HANDLE argument is a filehandle just as with C<open>.  The PATH is
a literal path, one that doesn't pay attention to any greater-thans or
less-thans or pipes or minuses, nor ignore white space.  If it's there,
it's part of the path.  The FLAGS argument contains one or more values
derived from the Fcntl module that have been or'd together using the
bitwise "|" operator.  The final argument, the MASK, is optional; if
present, it is combined with the user's current umask for the creation
mode of the file.  You should usually omit this.

Although the traditional values of read-only, write-only, and read-write
are 0, 1, and 2 respectively, this is known not to hold true on some
systems.  Instead, it's best to load in the appropriate constants first
from the Fcntl module, which supplies the following standard flags:

    O_RDONLY            Read only
    O_WRONLY            Write only
    O_RDWR              Read and write
    O_CREAT             Create the file if it doesn't exist
    O_EXCL              Fail if the file already exists
    O_APPEND            Append to the file
    O_TRUNC             Truncate the file
    O_NONBLOCK          Non-blocking access

Less common flags that are sometimes available on some operating
systems include C<O_BINARY>, C<O_TEXT>, C<O_SHLOCK>, C<O_EXLOCK>,
C<O_DEFER>, C<O_SYNC>, C<O_ASYNC>, C<O_DSYNC>, C<O_RSYNC>,
C<O_NOCTTY>, C<O_NDELAY> and C<O_LARGEFILE>.  Consult your open(2)
manpage or its local equivalent for details.  (Note: starting from
Perl release 5.6 the C<O_LARGEFILE> flag, if available, is automatically
added to the sysopen() flags because large files are the default.)

Here's how to use C<sysopen> to emulate the simple C<open> calls we had
before.  We'll omit the C<|| die $!> checks for clarity, but make sure
you always check the return values in real code.  These aren't quite
the same, since C<open> will trim leading and trailing white space,
but you'll get the idea.

To open a file for reading:

    open(FH, "< $path");
    sysopen(FH, $path, O_RDONLY);

To open a file for writing, creating a new file if needed or else truncating
an old file:

    open(FH, "> $path");
    sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);

To open a file for appending, creating one if necessary:

    open(FH, ">> $path");
    sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);

To open a file for update, where the file must already exist:

    open(FH, "+< $path");
    sysopen(FH, $path, O_RDWR);

And here are things you can do with C<sysopen> that you cannot do with
a regular C<open>.  As you'll see, it's just a matter of controlling the
flags in the third argument.

To open a file for writing, creating a new file which must not previously
exist:

    sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);

To open a file for appending, where that file must already exist:

    sysopen(FH, $path, O_WRONLY | O_APPEND);

To open a file for update, creating a new file if necessary:

    sysopen(FH, $path, O_RDWR | O_CREAT);

To open a file for update, where that file must not already exist:

    sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);

To open a file without blocking, creating one if necessary:

    sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);
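
As a self-contained sketch of the C<O_EXCL> case, with the error checks
left in, the following tries to create the same file twice.  The file name
F<app.lock> and the scratch directory are hypothetical; the point is only
that the second C<sysopen> fails because the file already exists.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/app.lock";         # hypothetical lock-file name

# First attempt: the file does not exist yet, so it is created.
my $first = sysopen(my $lock1, $path, O_WRONLY | O_EXCL | O_CREAT);
die "can't create $path: $!" unless $first;

# Second attempt: O_EXCL makes the call fail because the file now exists.
my $second = sysopen(my $lock2, $path, O_WRONLY | O_EXCL | O_CREAT);
my $error  = $second ? "" : "$!";

close($lock1);
print "first: ", ($first ? "ok" : "failed"),
      "; second: ", ($second ? "ok" : "failed ($error)"), "\n";
```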

=head2 Permissions E<agrave> la mode

If you omit the MASK argument to C<sysopen>, Perl uses the octal value
0666.  The normal MASK to use for executables and directories should
be 0777, and for anything else, 0666.

Why so permissive?  Well, it isn't really.  The MASK will be modified
by your process's current C<umask>.  A umask is a number representing
I<disabled> permissions bits; that is, bits that will not be turned on
in the created files' permissions field.

For example, if your C<umask> were 027, then the 020 part would
disable the group from writing, and the 007 part would disable others
from reading, writing, or executing.  Under these conditions, passing
C<sysopen> 0666 would create a file with mode 0640, since C<0666 & ~027>
is 0640.
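
That arithmetic can be checked with a short sketch.  It assumes a
POSIX-style filesystem that honours permission bits, and the file name
F<report.txt> is made up; the script sets the example umask, creates a
file, and restores the previous umask afterwards.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/report.txt";       # hypothetical file name

my $expected = 0666 & ~027;         # the arithmetic from the text

my $old_umask = umask(027);         # set the example umask, remember the old one
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666)
    || die "can't create $path: $!";
close($fh);
umask($old_umask);                  # restore the caller's umask

my $mode = (stat($path))[2] & 07777;    # keep only the permission bits
printf "expected %04o, got %04o\n", $expected, $mode;
```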

You should seldom use the MASK argument to C<sysopen()>.  That takes
away the user's freedom to choose what permission new files will have.
Denying choice is almost always a bad thing.  One exception would be for
cases where sensitive or private data is being stored, such as with mail
folders, cookie files, and internal temporary files.

=head1 Obscure Open Tricks

=head2 Re-Opening Files (dups)

Sometimes you already have a filehandle open, and want to make another
handle that's a duplicate of the first one.  In the shell, we place an
ampersand in front of a file descriptor number when doing redirections.
For example, C<< 2>&1 >> makes descriptor 2 (that's STDERR in Perl)
be redirected into descriptor 1 (which is usually Perl's STDOUT).
The same is essentially true in Perl: a filename that begins with an
ampersand is treated instead as a file descriptor if a number, or as a
filehandle if a string.

    open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
    open(MHCONTEXT, "<&4")     || die "couldn't dup fd4: $!";

That means that if a function is expecting a filename, but you don't
want to give it a filename because you already have the file open, you
can just pass the filehandle with a leading ampersand.  It's best to
use a fully qualified handle though, just in case the function happens
to be in a different package:

    somefunction("&main::LOGFILE");

This way if somefunction() is planning on opening its argument, it can
just use the already opened handle.  This differs from passing a handle,
because with a handle, you don't open the file.  Here you have something
you can pass to open.
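
One common use of dups, sketched here under the assumption that STDOUT is
an ordinary file descriptor, is to save STDOUT, point it at a file for a
while, and then put it back.  The scratch file comes from File::Temp and
the handle names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

open(SAVEOUT, ">&STDOUT")   || die "couldn't dup STDOUT: $!";
open(STDOUT, "> $path")     || die "can't redirect STDOUT to $path: $!";
print "this line goes to the file\n";
close(STDOUT)               || die "can't close redirected STDOUT: $!";
open(STDOUT, ">&SAVEOUT")   || die "couldn't restore STDOUT: $!";
close(SAVEOUT);

# Read back what was captured while STDOUT was redirected.
open(CAPTURED, "< $path")   || die "can't open $path: $!";
my $captured = <CAPTURED>;
close(CAPTURED);
print "captured: $captured";
```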

If you have one of those tricky, newfangled I/O objects that the C++
folks are raving about, then this doesn't work because those aren't
proper filehandles in the native Perl sense.  You'll have to use fileno()
to pull out the proper descriptor number, assuming you can:

    use IO::Socket;
    $handle = IO::Socket::INET->new("www.perl.com:80");
    $fd = $handle->fileno;
    somefunction("&$fd");  # not an indirect function call

It can be easier (and certainly will be faster) just to use real
filehandles though:

    use IO::Socket;
    local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
    die "can't connect" unless defined(fileno(REMOTE));
    somefunction("&main::REMOTE");

If the filehandle or descriptor number is preceded not just with a simple
"&" but rather with a "&=" combination, then Perl will not create a
completely new descriptor opened to the same place using the dup(2)
system call.  Instead, it will just make something of an alias to the
existing one using the fdopen(3S) library call.  This is slightly more
parsimonious of system resources, although this is less a concern
these days.  Here's an example of that:

    $fd = $ENV{"MHCONTEXTFD"};
    open(MHCONTEXT, "<&=$fd")   or die "couldn't fdopen $fd: $!";
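
The difference between the two forms shows up in the descriptor numbers.
In this small sketch, which uses a scratch file from File::Temp and made-up
variable names, the "&" form yields a new descriptor while the "&=" form
reports the same one as the original handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($orig, $path) = tempfile(UNLINK => 1);
my $fd = fileno($orig);

open(my $dup,   ">&$fd")    || die "couldn't dup fd $fd: $!";
open(my $alias, ">&=$fd")   || die "couldn't fdopen fd $fd: $!";

my $dup_fd   = fileno($dup);        # a new descriptor from dup(2)
my $alias_fd = fileno($alias);      # the same descriptor as the original

print "original $fd, dup $dup_fd, alias $alias_fd\n";
```

Because the alias shares the descriptor, closing it also closes the
descriptor underneath the original handle.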

If you're using magic C<< <ARGV> >>, you could even pass in as a
command line argument in @ARGV something like C<"<&=$MHCONTEXTFD">,
but we've never seen anyone actually do this.

=head2 Dispelling the Dweomer

Perl is more of a DWIMmer language than something like Java--where DWIM
is an acronym for "do what I mean".  But this principle sometimes leads
to more hidden magic than one knows what to do with.  In this way, Perl
is also filled with I<dweomer>, an obscure word meaning an enchantment.
Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.

If magic C<open> is a bit too magical for you, you don't have to turn
to C<sysopen>.  To open a file with arbitrary weird characters in
it, it's necessary to protect any leading and trailing whitespace.
Leading whitespace is protected by inserting a C<"./"> in front of a
filename that starts with whitespace.  Trailing whitespace is protected
by appending an ASCII NUL byte (C<"\0">) at the end of the string.

    $file =~ s#^(\s)#./$1#;
    open(FH, "< $file\0")   || die "can't open $file: $!";

This assumes, of course, that your system considers dot the current
working directory, slash the directory separator, and disallows ASCII
NULs within a valid filename.  Most systems follow these conventions,
including all POSIX systems as well as proprietary Microsoft systems.
The only vaguely popular system that doesn't work this way is the
proprietary Macintosh system, which uses a colon where the rest of us
use a slash.  Maybe C<sysopen> isn't such a bad idea after all.
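If your Perl is 5.6 or later, the three-argument form of C<open> is
another way to keep the magic out: the mode travels in its own argument,
so the filename is taken literally.  Here is a minimal sketch; the odd
filename and the scratch directory are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/ > odd name ";   # leading and trailing space, plus a ">"

# With the mode in its own argument, nothing in the name is interpreted.
open(my $out, ">", $file) or die "can't create '$file': $!";
print {$out} "hello\n";
close($out)               or die "can't close '$file': $!";

open(my $in, "<", $file)  or die "can't open '$file': $!";
my $line = <$in>;
close($in);

print "read back: $line";
```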

If you want to use C<< <ARGV> >> processing in a totally boring
and non-magical way, you could do this first:

    #   "Sam sat on the ground and put his head in his hands.
    #   'I wish I had never come here, and I don't want to see
    #   no more magic,' he said, and fell silent."
    for (@ARGV) {
        s#^([^./])#./$1#;
        $_ .= "\0";
    }
    while (<>) {
        # now process $_
    }

But be warned that users will not appreciate being unable to use "-"
to mean standard input, per the standard convention.

=head2 Paths as Opens

You've probably noticed how Perl's C<warn> and C<die> functions can
produce messages like:

    Some warning at scriptname line 29, <FH> line 7.

That's because you opened a filehandle FH, and had read in seven records
from it.  But what was the name of the file, rather than the handle?

If you aren't running with C<strict refs>, or if you've turned them off
temporarily, then all you have to do is this:

    open($path, "< $path") || die "can't open $path: $!";
    while (<$path>) {
        # whatever
    }

Since you're using the pathname of the file as its handle,
you'll get warnings more like

    Some warning at scriptname line 29, </etc/motd> line 7.

=head2 Single Argument Open

Remember how we said that Perl's open took two arguments?  That was a
passive prevarication.  You see, it can also take just one argument.
If and only if the variable is a global variable, not a lexical, you
can pass C<open> just one argument, the filehandle, and it will
get the path from the global scalar variable of the same name.

    $FILE = "/etc/motd";
    open FILE or die "can't open $FILE: $!";
    while (<FILE>) {
        # whatever
    }

Why is this here?  Someone has to cater to the hysterical porpoises.
It's something that's been in Perl since the very beginning, if not
before.

=head2 Playing with STDIN and STDOUT

One clever move with STDOUT is to explicitly close it when you're done
with the program.

    END { close(STDOUT) || die "can't close stdout: $!" }

If you don't do this, and your program fills up the disk partition due
to a command line redirection, it won't report the error or exit with a
failure status.
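The same check is worth making on any handle you write to, because
buffered output can fail as late as the C<close>.  Here is a small
sketch of the idea.  The name C<write_checked> is invented for this
example, and the F</dev/full> device used to provoke a failure is
assumed to exist only on Linux-like systems, so the demonstration is
skipped elsewhere.

```perl
use strict;
use warnings;

# Write $data to $path and report whether the data really made it out.
# Returns an empty string on success, or a description of the failure.
sub write_checked {
    my ($path, $data) = @_;
    open(my $fh, ">", $path) or return "open failed: $!";
    print {$fh} $data        or return "print failed: $!";
    close($fh)               or return "close failed: $!";
    return "";
}

# /dev/full accepts opens but reports "no space left" on every write.
if (-c "/dev/full") {
    my $err = write_checked("/dev/full", "some data\n");
    print "writing to /dev/full: $err\n";
}
```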

You don't have to accept the STDIN and STDOUT you were given.  You are
welcome to reopen them if you'd like.

    open(STDIN, "< datafile")
        || die "can't open datafile: $!";

    open(STDOUT, "> output")
        || die "can't open output: $!";

And then these can be accessed directly or passed on to subprocesses.
This makes it look as though the program were initially invoked
with those redirections from the command line.
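If you only want the redirection for part of the program, you can dup
the original STDOUT first and put it back afterwards.  What follows is
a sketch under a few assumptions: the scratch file comes from
File::Temp, the child is just another copy of the running perl
(C<$^X>), and the three-argument mode strings are used instead of the
shell-style ones.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $logfile) = tempfile(UNLINK => 1);
close($tmp);

# Keep a duplicate of the original STDOUT so it can be put back later.
open(my $saved, ">&", \*STDOUT)  or die "can't dup STDOUT: $!";

open(STDOUT, ">", $logfile)      or die "can't redirect STDOUT: $!";
print "this line goes to the file\n";
system($^X, "-e", 'print "so does child output\n"');

# Closing flushes the file; reopening from the duplicate restores things.
close(STDOUT)                    or die "can't close STDOUT: $!";
open(STDOUT, ">&", $saved)       or die "can't restore STDOUT: $!";
print "back on the original STDOUT\n";
```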

It's probably more interesting to connect these to pipes.  For example:

    $pager = $ENV{PAGER} || "(less || more)";
    open(STDOUT, "| $pager")
        || die "can't fork a pager: $!";

This makes it appear as though your program were called with its stdout
already piped into your pager.  You can also use this kind of thing
in conjunction with an implicit fork to yourself.  You might do this
if you would rather handle the post processing in your own program,
just in a different process:

    head(100);
    while (<>) {
        print;
    }

    sub head {
        my $lines = shift || 20;
        return if $pid = open(STDOUT, "|-");       # return if parent
        die "cannot fork: $!" unless defined $pid;
        while (<STDIN>) {
            last if --$lines < 0;
            print;
        }
        exit;
    }

This technique can be applied to repeatedly push as many filters on your
output stream as you wish.

=head1 Other I/O Issues

These topics aren't really arguments related to C<open> or C<sysopen>,
but they do affect what you do with your open files.

=head2 Opening Non-File Files

When is a file not a file?  Well, you could say when it exists but
isn't a plain file.   We'll check whether it's a symbolic link first,
just in case.

    if (-l $file || ! -f _) {
        print "$file is not a plain file\n";
    }

What other kinds of files are there than, well, files?  Directories,
symbolic links, named pipes, Unix-domain sockets, and block and character
devices.  Those are all files, too--just not I<plain> files.  This isn't
the same issue as being a text file. Not all text files are plain files.
Not all plain files are text files.  That's why there are separate C<-f>
and C<-T> file tests.
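Each of those kinds has its own file test operator.  As a rough
sketch, a small classifier might look like this; the name C<file_kind>
and the wording of the labels are made up for the example.

```perl
use strict;
use warnings;

# Return a short description of what kind of file $path is.
sub file_kind {
    my ($path) = @_;
    return "symbolic link"    if -l $path;      # lstat: the link itself
    return "nonexistent"      unless -e $path;  # stat: follows links
    return "directory"        if -d _;          # "_" reuses that stat
    return "named pipe"       if -p _;
    return "socket"           if -S _;
    return "block device"     if -b _;
    return "character device" if -c _;
    return "plain file"       if -f _;
    return "something else";
}

print "$_: ", file_kind($_), "\n" for @ARGV;
```

Run it with a few pathnames on the command line to see one label per
argument.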

To open a directory, you should use the C<opendir> function, then
process it with C<readdir>, carefully restoring the directory
name if necessary:

    opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
    while (defined($file = readdir(DIR))) {
        # do something with "$dirname/$file"
    }
    closedir(DIR);
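Here is the same loop fleshed out a little, as a sketch: it uses a
lexical directory handle, skips the "." and ".." entries, and puts the
directory name back on each result.  The name C<dir_entries> is
invented for the example.

```perl
use strict;
use warnings;

# Return the full paths of everything in $dirname except "." and "..".
sub dir_entries {
    my ($dirname) = @_;
    opendir(my $dh, $dirname) or die "can't opendir $dirname: $!";
    my @paths;
    while (defined(my $file = readdir($dh))) {
        next if $file eq "." || $file eq "..";
        push @paths, "$dirname/$file";      # readdir returns bare names
    }
    closedir($dh);
    my @sorted = sort @paths;
    return @sorted;
}
```

Something like C<print "$_\n" for dir_entries("/etc")> would then list
one full path per line.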

If you want to process directories recursively, it's better to use the
File::Find module.  For example, this prints out all files recursively
and adds a slash to their names if the file is a directory.

    @ARGV = qw(.) unless @ARGV;
    use File::Find;
    find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;

This finds all bogus symbolic links beneath a particular directory:

    find sub { print "$File::Find::name\n" if -l && !-e }, $dir;

As you see, with symbolic links, you can just pretend that it is
what it points to.  Or, if you want to know I<what> it points to, then
C<readlink> is called for:

    if (-l $file) {
        if (defined($whither = readlink($file))) {
            print "$file points to $whither\n";
        } else {
            print "$file points nowhere: $!\n";
        }
    }
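The two ideas combine naturally.  As a sketch, this walks a tree with
File::Find and records, for every dangling symbolic link, the target it
claims to point at.  The name C<dangling_links> is invented for the
example, and a system with symbolic links is assumed.

```perl
use strict;
use warnings;
use File::Find;

# Map each dangling symbolic link under $dir to the target it names.
sub dangling_links {
    my ($dir) = @_;
    my %target_of;
    find(sub {
        # -l looks at the link itself; -e follows it to the target.
        return unless -l $_ && !-e $_;
        $target_of{$File::Find::name} = readlink($_);
    }, $dir);
    return %target_of;
}
```

Calling C<my %broken = dangling_links($dir)> gives you a hash keyed by
link path, ready to print or to clean up.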

=head2 Opening Named Pipes

Named pipes are a different matter.  You pretend they're regular files,
but their opens will normally block until there is both a reader and
a writer.  You can read more about them in L<perlipc/"Named Pipes">.
Unix-domain sockets are rather different beasts as well; they're
described in L<perlipc/"Unix-Domain TCP Clients and Servers">.

When it comes to opening devices, it can be easy and it can be tricky.
We'll assume that if you're opening up a block device, you know what
you're doing.  The character devices are more interesting.  These are
typically used for modems, mice, and some kinds of printers.  This is
described in L<perlfaq8/"How do I read and write the serial port?">.
It's often enough to open them carefully:

    sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
        # (O_NOCTTY no longer needed on POSIX systems)
        or die "can't open /dev/ttyS1: $!";
    open(TTYOUT, "+>&TTYIN")
        or die "can't dup TTYIN: $!";

    $ofh = select(TTYOUT); $| = 1; select($ofh);

    print TTYOUT "+++at\015";
    $answer = <TTYIN>;

With descriptors that you haven't opened using C<sysopen>, such as
sockets, you can set them to be non-blocking using C<fcntl>:

    use Fcntl;
    my $old_flags = fcntl($handle, F_GETFL, 0)
        or die "can't get flags: $!";
    fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
        or die "can't set non blocking: $!";
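To watch the flag take effect, here is a minimal sketch that uses an
anonymous pipe in place of a socket or tty; that substitution is made
only so the example runs anywhere with Unix-style pipes.  Reading from
the empty pipe fails immediately instead of hanging.

```perl
use strict;
use warnings;
use Fcntl;          # F_GETFL, F_SETFL, O_NONBLOCK
use Errno;          # enables lookups such as $!{EAGAIN}

pipe(my $reader, my $writer) or die "can't create pipe: $!";

my $flags = fcntl($reader, F_GETFL, 0)
    or die "can't get flags: $!";
fcntl($reader, F_SETFL, $flags | O_NONBLOCK)
    or die "can't set non blocking: $!";

# Nothing has been written yet, so a blocking read would hang here.
my $got = sysread($reader, my $buf, 1024);
my $would_block = !defined($got) && ($!{EAGAIN} || $!{EWOULDBLOCK});
print "read would have blocked\n" if $would_block;
```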

Rather than losing yourself in a morass of twisting, turning C<ioctl>s,
all dissimilar, if you're going to manipulate ttys, it's best to
make calls out to the stty(1) program if you have it, or else use the
portable POSIX interface.  To figure this all out, you'll need to read the
termios(3) manpage, which describes the POSIX interface to tty devices,
and then L<POSIX>, which describes Perl's interface to POSIX.  There are
also some high-level modules on CPAN that can help you with these games.
Check out Term::ReadKey and Term::ReadLine.

=head2 Opening Sockets

What else can you open?  To open a connection using sockets, you won't use
one of Perl's two open functions.  See
L<perlipc/"Sockets: Client/Server Communication"> for that.  Here's an
example.  Once you have it, you can use FH as a bidirectional filehandle.

    use IO::Socket;
    local *FH = IO::Socket::INET->new("www.perl.com:80");

For opening up a URL, the LWP modules from CPAN are just what
the doctor ordered.  There's no filehandle interface, but
it's still easy to get the contents of a document:

    use LWP::Simple;
    $doc = get('http://www.linpro.no/lwp/');

=head2 Binary Files

On certain legacy systems with what could charitably be called terminally
convoluted (some would say broken) I/O models, a file isn't a file--at
least, not with respect to the C standard I/O library.  On these old
systems whose libraries (but not kernels) distinguish between text and
binary streams, to get files to behave properly you'll have to bend over
backwards to avoid nasty problems.  On such infelicitous systems, sockets
and pipes are already opened in binary mode, and there is currently no
way to turn that off.  With files, you have more options.

One option is to use the C<binmode> function on the appropriate
handles before doing regular I/O on them:

    binmode(STDIN);
    binmode(STDOUT);
    while (<STDIN>) { print }

Passing C<sysopen> a non-standard flag option will also open the file in
binary mode on those systems that support it.  This is the equivalent of
opening the file normally, then calling C<binmode> on the handle.

    sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
        || die "can't open records.data: $!";

Now you can use C<read> and C<print> on that handle without worrying
about the non-standard system I/O library breaking your data.  It's not
a pretty picture, but then, legacy systems seldom are.  CP/M will be
with us until the end of days, and after.

On systems with exotic I/O systems, it turns out that, astonishingly
enough, even unbuffered I/O using C<sysread> and C<syswrite> might do
sneaky data mutilation behind your back.

    while (sysread(WHENCE, $buf, 1024)) {
        syswrite(WHITHER, $buf, length($buf));
    }

Depending on the vicissitudes of your runtime system, even these calls
may need C<binmode> or C<O_BINARY> first.  Systems known to be free of
such difficulties include Unix, the Mac OS, Plan 9, and Inferno.
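If you aren't sure which camp your system falls into, a round trip
through a scratch file will tell you.  This sketch writes a handful of
bytes that text-mode libraries are prone to mangle and checks that they
come back unchanged; the byte string and the File::Temp scratch file
are choices made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $bytes = "line one\r\nline two\n\x1a\x00trailer";

my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);                          # no newline translation on output
print {$out} $bytes;
close($out)                or die "can't close $path: $!";

open(my $in, "<", $path)   or die "can't open $path: $!";
binmode($in);                           # nor on input
my $copy = do { local $/; <$in> };      # slurp the whole file
close($in);

print $copy eq $bytes ? "bytes survived intact\n" : "bytes were altered\n";
```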

=head2 File Locking

In a multitasking environment, you may need to be careful not to collide
with other processes that want to do I/O on the same files you
are working on.  You'll often need shared or exclusive locks
on files for reading and writing respectively.  You might just
pretend that only exclusive locks exist.

Never use the existence of a file C<-e $file> as a locking indication,
because there is a race condition between the test for the existence of
the file and its creation.  It's possible for another process to create
a file in the slice of time between your existence check and your attempt
to create the file.  Atomicity is critical.

Perl's most portable locking interface is via the C<flock> function,
whose simplicity is emulated on systems that don't directly support it
such as SysV or Windows.  The underlying semantics may affect how
it all works, so you should learn how C<flock> is implemented on your
system's port of Perl.

File locking I<does not> lock out another process that would like to
do I/O.  A file lock only locks out others trying to get a lock, not
processes trying to do I/O.  Because locks are advisory, if one process
uses locking and another doesn't, all bets are off.

By default, the C<flock> call will block until a lock is granted.
A request for a shared lock will be granted as soon as there is no
exclusive locker.  A request for an exclusive lock will be granted as
soon as there is no locker of any kind.  Locks are on file descriptors,
not file names.  You can't lock a file until you open it, and you can't
hold on to a lock once the file has been closed.

Here's how to get a blocking shared lock on a file, typically used
for reading:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    flock(FH, LOCK_SH)      or die "can't lock filename: $!";
    # now read from FH

You can get a non-blocking lock by using C<LOCK_NB>.

    flock(FH, LOCK_SH | LOCK_NB)
        or die "can't lock filename: $!";

This can be useful for producing more user-friendly behaviour by warning
if you're going to be blocking:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    unless (flock(FH, LOCK_SH | LOCK_NB)) {
        $| = 1;
        print "Waiting for lock...";
        flock(FH, LOCK_SH)  or die "can't lock filename: $!";
        print "got it.\n";
    }
    # now read from FH
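You can watch C<LOCK_NB> refuse a lock without starting a second
program.  In this sketch a second, independent open of the same file
stands in for the other process.  That stand-in only behaves this way
where Perl's C<flock> sits on top of the flock(2) system call, as it
typically does on Linux and the BSDs; where C<flock> is emulated with
fcntl(2) locks, two handles in one process will not conflict.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my ($holder, $path) = tempfile(UNLINK => 1);
flock($holder, LOCK_EX)      or die "can't lock $path: $!";

# A second, independent open of the same file plays the other process.
open(my $other, "<", $path)  or die "can't open $path: $!";
my $busy = !flock($other, LOCK_SH | LOCK_NB);
print "lock is busy: $!\n" if $busy;

# Once the first lock is released, the same request succeeds.
flock($holder, LOCK_UN)      or die "can't unlock $path: $!";
my $granted = flock($other, LOCK_SH | LOCK_NB);
print "lock granted\n" if $granted;
```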

To get an exclusive lock, typically used for writing, you have to be
careful.  We C<sysopen> the file so it can be locked before it gets
emptied.  You can get a nonblocking version using C<LOCK_EX | LOCK_NB>.

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "filename", O_WRONLY | O_CREAT)
        or die "can't open filename: $!";
    flock(FH, LOCK_EX)
        or die "can't lock filename: $!";
    truncate(FH, 0)
        or die "can't truncate filename: $!";
    # now write to FH

Finally, due to the uncounted millions who cannot be dissuaded from
wasting cycles on useless vanity devices called hit counters, here's
how to increment a number in a file safely:

    use Fcntl qw(:DEFAULT :flock);

    sysopen(FH, "numfile", O_RDWR | O_CREAT)
        or die "can't open numfile: $!";
    # autoflush FH
    $ofh = select(FH); $| = 1; select ($ofh);
    flock(FH, LOCK_EX)
        or die "can't write-lock numfile: $!";

    $num = <FH> || 0;
    seek(FH, 0, 0)
        or die "can't rewind numfile: $!";
    print FH $num+1, "\n"
        or die "can't write numfile: $!";

    truncate(FH, tell(FH))
        or die "can't truncate numfile: $!";
    close(FH)
        or die "can't close numfile: $!";
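If you need the counter in more than one place, the same steps can be
wrapped in a subroutine.  The following is a sketch rather than a
drop-in replacement: the name C<increment_counter> is invented, it uses
a lexical filehandle, and it leaves out the autoflush step on the
assumption that C<truncate> and C<close> flush the pending output
themselves.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);

# Add one to the number stored in $file and return the new value.
sub increment_counter {
    my ($file) = @_;
    sysopen(my $fh, $file, O_RDWR | O_CREAT)
        or die "can't open $file: $!";
    flock($fh, LOCK_EX)         or die "can't write-lock $file: $!";

    my $num = <$fh> || 0;       # an empty or new file counts as 0
    chomp $num;
    seek($fh, 0, SEEK_SET)      or die "can't rewind $file: $!";
    print {$fh} $num + 1, "\n"  or die "can't write $file: $!";
    truncate($fh, tell($fh))    or die "can't truncate $file: $!";
    close($fh)                  or die "can't close $file: $!";
    return $num + 1;            # the lock went away with the close
}
```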

=head2 IO Layers

In Perl 5.8.0 a new I/O framework called "PerlIO" was introduced.
This is a new "plumbing" for all the I/O happening in Perl; for the
most part everything will work just as it did, but PerlIO also brought
in some new features such as the ability to think of I/O as "layers".
One I/O layer may in addition to just moving the data also do
transformations on the data.  Such transformations may include
compression and decompression, encryption and decryption, and transforming
between various character encodings.

Full discussion about the features of PerlIO is out of scope for this
tutorial, but here is how to recognize the layers being used:

=over 4

=item *

The three-(or more)-argument form of C<open> is being used and the
second argument contains something else in addition to the usual
C<< '<' >>, C<< '>' >>, C<< '>>' >>, C<< '|' >> and their variants,
for example:

    open(my $fh, "<:utf8", $fn);

=item *

The two-argument form of C<binmode> is being used, for example

    binmode($fh, ":encoding(utf16)");

=back

For more detailed discussion about PerlIO see L<PerlIO>;
for more detailed discussion about Unicode and I/O see L<perluniintro>.
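To get a feel for what a layer does, here is a minimal sketch that
writes a short string through an C<:encoding(UTF-8)> layer and then
reads the file back twice, once as raw bytes and once decoded.  The
sample string and the File::Temp scratch file are invented for the
example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $text = "caf\x{e9} \x{263a}";      # 6 characters, two of them non-ASCII

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

# The :encoding layer turns characters into UTF-8 bytes on the way out.
open(my $out, ">:encoding(UTF-8)", $path) or die "can't write $path: $!";
print {$out} $text;
close($out)                               or die "can't close $path: $!";

# Read the file twice: once as raw bytes, once decoded to characters.
open(my $raw, "<:raw", $path)             or die "can't read $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

open(my $in, "<:encoding(UTF-8)", $path)  or die "can't read $path: $!";
my $chars = do { local $/; <$in> };
close($in);

printf "%d characters stored as %d bytes\n", length($chars), length($bytes);
```

The two lengths differ because each non-ASCII character takes more than
one byte in UTF-8.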

=head1 SEE ALSO

The C<open> and C<sysopen> functions in perlfunc(1);
the system open(2), dup(2), fopen(3), and fdopen(3) manpages;
the POSIX documentation.

=head1 AUTHOR and COPYRIGHT

Copyright 1998 Tom Christiansen.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit.  A simple comment in the code giving credit would be
courteous but is not required.

=head1 HISTORY

First release: Sat Jan  9 08:09:11 MST 1999