=head1 NAME

perlopentut - tutorial on opening things in Perl

=head1 DESCRIPTION

Perl has two simple, built-in ways to open files: the shell way for
convenience, and the C way for precision. The shell way also has 2- and
3-argument forms, which have different semantics for handling the filename.
The choice is yours.

=head1 Open E<agrave> la shell

Perl's C<open> function was designed to mimic the way command-line
redirection in the shell works. Here are some basic examples
from the shell:

    $ myprogram file1 file2 file3
    $ myprogram < inputfile
    $ myprogram > outputfile
    $ myprogram >> outputfile
    $ myprogram | otherprogram
    $ otherprogram | myprogram

And here are some more advanced examples:

    $ otherprogram | myprogram f1 - f2
    $ otherprogram 2>&1 | myprogram -
    $ myprogram <&3
    $ myprogram >&4

Programmers accustomed to constructs like those above can take comfort
in learning that Perl directly supports these familiar constructs using
virtually the same syntax as the shell.

=head2 Simple Opens

The C<open> function takes two arguments: the first is a filehandle,
and the second is a single string comprising both what to open and how
to open it. C<open> returns true when it works, and when it fails,
returns a false value and sets the special variable C<$!> to reflect
the system error. If the filehandle was previously opened, it will
be implicitly closed first.

For example:

    open(INFO, "datafile") || die("can't open datafile: $!");
    open(INFO, "< datafile") || die("can't open datafile: $!");
    open(RESULTS,"> runstats") || die("can't open runstats: $!");
    open(LOG, ">> logfile ") || die("can't open logfile: $!");

If you prefer the low-punctuation version, you could write that this way:

    open INFO, "< datafile" or die "can't open datafile: $!";
    open RESULTS,"> runstats" or die "can't open runstats: $!";
    open LOG, ">> logfile " or die "can't open logfile: $!";

A few things to notice. First, the leading less-than is optional.
If omitted, Perl assumes that you want to open the file for reading.

Note also that the first example uses the C<||> logical operator, and the
second uses C<or>, which has lower precedence. Using C<||> in the latter
examples would effectively mean

    open INFO, ( "< datafile" || die "can't open datafile: $!" );

which is definitely not what you want: the string is always true, so
the C<die> can never run and a failed C<open> goes unnoticed.

The other important thing to notice is that, just as in the shell,
any white space before or after the filename is ignored. This is good,
because you wouldn't want these to do different things:

    open INFO, "<datafile"
    open INFO, "< datafile"
    open INFO, "<  datafile"

Ignoring surrounding whitespace also helps for when you read a filename
in from a different file, and forget to trim it before opening:

    $filename = <INFO>;         # oops, \n still there
    open(EXTRA, "< $filename") || die "can't open $filename: $!";

This is not a bug, but a feature. Because C<open> mimics the shell in
its style of using redirection arrows to specify how to open the file, it
does so with respect to extra white space around the filename as well.
For accessing files with naughty names, see
L<"Dispelling the Dweomer">.
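
The whitespace rule can be checked with a short script. What follows is
only a sketch: it uses File::Temp to make a scratch directory, and the
name F<datafile> is chosen for illustration. It assumes the temporary
directory's own path contains no whitespace or redirection characters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory and file; both names are illustrative only.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/datafile";

open(my $out, "> $path") || die "can't create $path: $!";
print $out "first line\n";
close($out)              || die "can't close $path: $!";

# Simulate a name read from another file: the newline is still attached.
my $filename = "$path\n";

# The 2-argument form ignores the surrounding whitespace, newline included.
open(my $in, "<   $filename") || die "can't open $filename: $!";
my $line = <$in>;
close($in);

print "read: $line";
```

The second C<open> succeeds even though the string it receives has extra
blanks after the arrow and a newline after the name.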

There is also a 3-argument version of C<open>, which lets you put the
special redirection characters into their own argument:

    open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";

In this case, the filename to open is the actual string in C<$datafile>,
so you don't have to worry about C<$datafile> containing characters
that might influence the open mode, or whitespace at the beginning or
end of the filename that would be stripped in the 2-argument version.
Also, any reduction of unnecessary string interpolation is a good thing.

=head2 Indirect Filehandles

C<open>'s first argument can be a reference to a filehandle. As of
perl 5.6.0, if the argument is uninitialized, Perl will automatically
create a filehandle and put a reference to it in the first argument,
like so:

    open( my $in, $infile ) or die "Couldn't read $infile: $!";
    while ( <$in> ) {
        # do something with $_
    }
    close $in;

Indirect filehandles make namespace management easier. Since bareword
filehandles are global to the current package, two subroutines trying to
open C<INFILE> will clash. With two functions opening indirect filehandles
like C<my $infile>, there's no clash and no need to worry about future
conflicts.
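
To make the namespace point concrete, here is a small sketch in which two
subroutines both call their handle C<$fh>. The subroutine names
C<count_lines> and C<first_line> and the scratch file are hypothetical,
invented for this illustration. The 3-argument form is used so that the
path is taken literally.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Both subroutines call their handle $fh; the names never collide
# because each one is a lexical visible only inside its own sub.
sub count_lines {
    my ($path) = @_;
    open( my $fh, "<", $path ) or die "Couldn't read $path: $!";
    my $count = 0;
    $count++ while <$fh>;
    close $fh;
    return $count;
}

sub first_line {
    my ($path) = @_;
    open( my $fh, "<", $path ) or die "Couldn't read $path: $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}

# Scratch file for the demonstration (illustrative only).
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
print $tmp "alpha\nbeta\ngamma\n";
close $tmp or die "Couldn't close $path: $!";

my $count = count_lines($path);
my $first = first_line($path);
print "$count lines, starting with $first";
```

Had both subroutines used a bareword such as C<INFILE>, a call from one
into the other would have silently reopened the shared handle.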

Another convenient behavior is that an indirect filehandle automatically
closes when it goes out of scope or when you undefine it:

    sub firstline {
        open( my $in, shift ) && return scalar <$in>;
        # no close() required
    }

=head2 Pipe Opens

In C, when you want to open a file using the standard I/O library,
you use the C<fopen> function, but when opening a pipe, you use the
C<popen> function. But in the shell, you just use a different redirection
character. That's also the case for Perl. The C<open> call
remains the same--just its argument differs.

If the leading character is a pipe symbol, C<open> starts up a new
command and opens a write-only filehandle leading into that command.
This lets you write into that handle and have what you write show up on
that command's standard input. For example:

    open(PRINTER, "| lpr -Plp1") || die "can't run lpr: $!";
    print PRINTER "stuff\n";
    close(PRINTER) || die "can't close lpr: $!";

If the trailing character is a pipe, you start up a new command and open a
read-only filehandle leading out of that command. This lets whatever that
command writes to its standard output show up on your handle for reading.
For example:

    open(NET, "netstat -i -n |") || die "can't fork netstat: $!";
    while (<NET>) { }            # do something with input
    close(NET) || die "can't close netstat: $!";

What happens if you try to open a pipe to or from a non-existent
command? If possible, Perl will detect the failure and set C<$!> as
usual. But if the command contains special shell characters, such as
C<E<gt>> or C<*>, called 'metacharacters', Perl does not execute the
command directly. Instead, Perl runs the shell, which then tries to
run the command. This means that it's the shell that gets the error
indication. In such a case, the C<open> call will only indicate
failure if Perl can't even run the shell. See L<perlfaq8/"How can I
capture STDERR from an external command?"> to see how to cope with
this. There's also an explanation in L<perlipc>.

If you would like to open a bidirectional pipe, the IPC::Open2
library will handle this for you. Check out
L<perlipc/"Bidirectional Communication with Another Process">.

=head2 The Minus File

Again following the lead of the standard shell utilities, Perl's
C<open> function treats a file whose name is a single minus, "-", in a
special way. If you open minus for reading, it really means to access
the standard input. If you open minus for writing, it really means to
access the standard output.
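
A program can use this convention to let its caller pick between a real
file and standard output. The sketch below assumes a hypothetical helper
named C<open_report> and a scratch file called F<report>; neither comes
from the examples above. It relies on the 2-argument form, since that is
where the minus is special.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: open a report destination for writing.
# The 2-argument form is used on purpose so that "-" keeps its
# special meaning of standard output.
sub open_report {
    my ($name) = @_;
    open( my $fh, ">$name" ) || die "can't write to $name: $!";
    return $fh;
}

# A real file name creates (or clobbers) that file...
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/report";
my $file = open_report($path);
print $file "saved to disk\n";
close($file) || die "can't close $path: $!";

# ...while a lone minus writes to standard output instead.
my $stdout = open_report("-");
print $stdout "sent to standard output\n";
```

Because the name is spliced into the mode string, a helper like this
inherits all the other magic of the 2-argument form too, so it suits
names typed by a trusted user rather than arbitrary data.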

If minus can be used as the default input or default output, what happens
if you open a pipe into or out of minus? What's the default command it
would run? The same script as you're currently running! This is actually
a stealth C<fork> hidden inside an C<open> call. See
L<perlipc/"Safe Pipe Opens"> for details.

=head2 Mixing Reads and Writes

It is possible to specify both read and write access. All you do is
add a "+" symbol in front of the redirection. But as in the shell,
using a less-than on a file never creates a new file; it only opens an
existing one. On the other hand, using a greater-than always clobbers
(truncates to zero length) an existing file, or creates a brand-new one
if there isn't an old one. Adding a "+" for read-write doesn't affect
whether it only works on existing files or always clobbers existing ones.

    open(WTMP, "+< /usr/adm/wtmp")
        || die "can't open /usr/adm/wtmp: $!";

    open(SCREEN, "+> lkscreen")
        || die "can't open lkscreen: $!";

    open(LOGFILE, "+>> /var/log/applog")
        || die "can't open /var/log/applog: $!";

The first one won't create a new file, and the second one will always
clobber an old one.
The third one will create a new file if necessary
and not clobber an old one, and it will allow you to read at any point
in the file, but all writes will always go to the end. In short,
the first case is substantially more common than the second and third
cases, which are almost always wrong. (If you know C, the plus in
Perl's C<open> is historically derived from the one in C's fopen(3S),
which it ultimately calls.)

In fact, when it comes to updating a file, unless you're working on
a binary file as in the WTMP case above, you probably don't want to
use this approach. Instead, Perl's B<-i> flag comes to
the rescue. The following command takes all the C, C++, or yacc source
or header files and changes all their foo's to bar's, leaving
the old version in the original filename with a ".orig" tacked
on the end:

    $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]

This is a shortcut for some renaming games that are really
the best way to update text files. See the second question in
L<perlfaq5> for more details.

=head2 Filters

One of the most common uses for C<open> is one you never
even notice. When you process the ARGV filehandle using
C<< <ARGV> >>, Perl actually does an implicit open
on each file in @ARGV.
Thus a program called like this:

    $ myprogram file1 file2 file3

can have all its files opened and processed one at a time
using a construct no more complex than:

    while (<>) {
        # do something with $_
    }

If @ARGV is empty when the loop first begins, Perl pretends you've opened
up minus, that is, the standard input. In fact, $ARGV, the currently
open file during C<< <ARGV> >> processing, is even set to "-"
in these circumstances.

You are welcome to pre-process your @ARGV before starting the loop to
make sure it's to your liking. One reason to do this might be to remove
command options beginning with a minus.
While you can always roll the
simple ones by hand, the Getopt modules are good for this:

    use Getopt::Std;

    # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
    getopts("vDo:");

    # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
    getopts("vDo:", \%args);

Or the standard Getopt::Long module to permit named arguments:

    use Getopt::Long;
    GetOptions( "verbose"  => \$verbose,    # --verbose
                "Debug"    => \$debug,      # --Debug
                "output=s" => \$output );
            # --output=somestring or --output somestring

Another reason for preprocessing arguments is to make an empty
argument list default to all files:

    @ARGV = glob("*") unless @ARGV;

You could even filter out all but plain text files. This is a bit
silent, of course, and you might prefer to mention them on the way.

    @ARGV = grep { -f && -T } @ARGV;

If you're using the B<-n> or B<-p> command-line options, you
should put changes to @ARGV in a C<BEGIN{}> block.

Remember that a normal C<open> has special properties, in that it might
call fopen(3S) or it might call popen(3S), depending on what its
argument looks like; that's why it's sometimes called "magic open".
Here's an example:

    $pwdinfo = `domainname` =~ /^(\(none\))?$/
                    ? '< /etc/passwd'
                    : 'ypcat passwd |';

    open(PWD, $pwdinfo)
                or die "can't open $pwdinfo: $!";

This sort of thing also comes into play in filter processing. Because
C<< <ARGV> >> processing employs the normal, shell-style Perl C<open>,
it respects all the special things we've already seen:

    $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

That program will read from the file F<f1>, the process F<cmd1>, standard
input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command,
and finally the F<f3> file.

Yes, this also means that if you have files named "-" (and so on) in
your directory, they won't be processed as literal files by C<open>.
You'll need to pass them as "./-", much as you would for the I<rm> program,
or you could use C<sysopen> as described below.

One of the more interesting applications is to change files of a certain
name into pipes. For example, to autoprocess gzipped or compressed
files by decompressing them with I<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV;

Or, if you have the I<GET> program installed from LWP,
you can fetch URLs before processing them:

    @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;

It's not for nothing that this is called magic C<< <ARGV> >>.
Pretty nifty, eh?

=head1 Open E<agrave> la C

If you want the convenience of the shell, then Perl's C<open> is
definitely the way to go. On the other hand, if you want finer precision
than C's simplistic fopen(3S) provides, you should look to Perl's
C<sysopen>, which is a direct hook into the open(2) system call.
That does mean it's a bit more involved, but that's the price of
precision.

C<sysopen> takes 3 (or 4) arguments.

    sysopen HANDLE, PATH, FLAGS, [MASK]

The HANDLE argument is a filehandle just as with C<open>. The PATH is
a literal path, one that doesn't pay attention to any greater-thans or
less-thans or pipes or minuses, nor ignore white space. If it's there,
it's part of the path. The FLAGS argument contains one or more values
derived from the Fcntl module that have been or'd together using the
bitwise "|" operator. The final argument, the MASK, is optional; if
present, it is combined with the user's current umask for the creation
mode of the file. You should usually omit this.

Although the traditional values of read-only, write-only, and read-write
are 0, 1, and 2 respectively, this is known not to hold true on some
systems.
Instead, it's best to load in the appropriate constants first
from the Fcntl module, which supplies the following standard flags:

    O_RDONLY            Read only
    O_WRONLY            Write only
    O_RDWR              Read and write
    O_CREAT             Create the file if it doesn't exist
    O_EXCL              Fail if the file already exists
    O_APPEND            Append to the file
    O_TRUNC             Truncate the file
    O_NONBLOCK          Non-blocking access

Less common flags that are sometimes available on some operating
systems include C<O_BINARY>, C<O_TEXT>, C<O_SHLOCK>, C<O_EXLOCK>,
C<O_DEFER>, C<O_SYNC>, C<O_ASYNC>, C<O_DSYNC>, C<O_RSYNC>,
C<O_NOCTTY>, C<O_NDELAY> and C<O_LARGEFILE>. Consult your open(2)
manpage or its local equivalent for details. (Note: starting from
Perl release 5.6 the C<O_LARGEFILE> flag, if available, is automatically
added to the sysopen() flags because large files are the default.)

Here's how to use C<sysopen> to emulate the simple C<open> calls we had
before. We'll omit the C<|| die $!> checks for clarity, but make sure
you always check the return values in real code. These aren't quite
the same, since C<open> will trim leading and trailing white space,
but you'll get the idea.

To open a file for reading:

    open(FH, "< $path");
    sysopen(FH, $path, O_RDONLY);

To open a file for writing, creating a new file if needed or else truncating
an old file:

    open(FH, "> $path");
    sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);

To open a file for appending, creating one if necessary:

    open(FH, ">> $path");
    sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);

To open a file for update, where the file must already exist:

    open(FH, "+< $path");
    sysopen(FH, $path, O_RDWR);

And here are things you can do with C<sysopen> that you cannot do with
a regular C<open>. As you'll see, it's just a matter of controlling the
flags in the third argument.

To open a file for writing, creating a new file which must not previously
exist:

    sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);

To open a file for appending, where that file must already exist:

    sysopen(FH, $path, O_WRONLY | O_APPEND);

To open a file for update, creating a new file if necessary:

    sysopen(FH, $path, O_RDWR | O_CREAT);

To open a file for update, where that file must not already exist:

    sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);

To open a file without blocking, creating one if necessary:

    sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);

=head2 Permissions E<agrave> la mode

If you omit the MASK argument to C<sysopen>, Perl uses the octal value
0666. The normal MASK to use for executables and directories should
be 0777, and for anything else, 0666.

Why so permissive? Well, it isn't really. The MASK will be modified
by your process's current C<umask>. A umask is a number representing
I<disabled> permissions bits; that is, bits that will not be turned on
in the created files' permissions field.
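
The exclusive-create flags and the umask can be watched working together.
This is only a sketch for a Unix-like system: the file name
F<private.log>, the scratch directory, and the umask value of 027 are
arbitrary choices made for the demonstration.

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_WRONLY, O_CREAT, O_EXCL, ...
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/private.log";  # illustrative name

my $old_umask = umask(027);     # deny group write and all access for others

# Create the file, insisting that it must not exist yet.
sysopen( my $fh, $path, O_WRONLY | O_EXCL | O_CREAT, 0666 )
    || die "can't create $path: $!";
close($fh) || die "can't close $path: $!";

# A second exclusive create must fail, since the file is now there.
my $second = sysopen( my $again, $path, O_WRONLY | O_EXCL | O_CREAT );

# Permission bits actually granted: 0666 & ~027.
my $mode = ( stat $path )[2] & 07777;
umask($old_umask);

printf "mode %04o, second create %s\n",
    $mode, $second ? "succeeded" : "failed";
```

On an ordinary Unix filesystem the requested 0666 should come out as
0640, and the second C<sysopen> should fail because C<O_EXCL> refuses
to reuse an existing file.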

For example, if your C<umask> were 027, then the 020 part would
disable the group from writing, and the 007 part would disable others
from reading, writing, or executing. Under these conditions, passing
C<sysopen> 0666 would create a file with mode 0640, since C<0666 & ~027>
is 0640.

You should seldom use the MASK argument to C<sysopen()>. That takes
away the user's freedom to choose what permission new files will have.
Denying choice is almost always a bad thing. One exception would be for
cases where sensitive or private data is being stored, such as with mail
folders, cookie files, and internal temporary files.

=head1 Obscure Open Tricks

=head2 Re-Opening Files (dups)

Sometimes you already have a filehandle open, and want to make another
handle that's a duplicate of the first one. In the shell, we place an
ampersand in front of a file descriptor number when doing redirections.
For example, C<< 2>&1 >> makes descriptor 2 (that's STDERR in Perl)
be redirected into descriptor 1 (which is usually Perl's STDOUT).
The same is essentially true in Perl: a filename that begins with an
ampersand is treated instead as a file descriptor if a number, or as a
filehandle if a string.

    open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
    open(MHCONTEXT, "<&4") || die "couldn't dup fd4: $!";

That means that if a function is expecting a filename, but you don't
want to give it a filename because you already have the file open, you
can just pass the filehandle with a leading ampersand. It's best to
use a fully qualified handle though, just in case the function happens
to be in a different package:

    somefunction("&main::LOGFILE");

This way if somefunction() is planning on opening its argument, it can
just use the already opened handle. This differs from passing a handle,
because with a handle, you don't open the file. Here you have something
you can pass to open.

If you have one of those tricky, newfangled I/O objects that the C++
folks are raving about, then this doesn't work because those aren't
proper filehandles in the native Perl sense.
You'll have to use fileno()
to pull out the proper descriptor number, assuming you can:

    use IO::Socket;
    $handle = IO::Socket::INET->new("www.perl.com:80");
    $fd = $handle->fileno;
    somefunction("&$fd");  # not an indirect function call

It can be easier (and certainly will be faster) just to use real
filehandles though:

    use IO::Socket;
    local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
    die "can't connect" unless defined(fileno(REMOTE));
    somefunction("&main::REMOTE");

If the filehandle or descriptor number is preceded not just with a simple
"&" but rather with a "&=" combination, then Perl will not create a
completely new descriptor opened to the same place using the dup(2)
system call. Instead, it will just make something of an alias to the
existing one using the fdopen(3S) library call. This is slightly more
parsimonious of system resources, although this is less a concern
these days. Here's an example of that:

    $fd = $ENV{"MHCONTEXTFD"};
    open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!";

If you're using magic C<< <ARGV> >>, you could even pass in as a
command line argument in @ARGV something like C<"<&=$MHCONTEXTFD">,
but we've never seen anyone actually do this.
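
The difference between "&" and "&=" shows up in the descriptor numbers.
In this sketch the scratch file from File::Temp and the variable names
C<$orig>, C<$dup>, and C<$alias> are assumptions made for illustration;
the open strings follow the descriptor-number forms shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ( $orig, $path ) = tempfile( UNLINK => 1 );
my $fd = fileno($orig);

# "&" plus a number: a new descriptor via dup(2).
open( my $dup,   ">&$fd" )  || die "couldn't dup fd $fd: $!";

# "&=" plus a number: reuse the same descriptor, fdopen(3S)-style.
open( my $alias, ">&=$fd" ) || die "couldn't fdopen fd $fd: $!";

printf "original %d, dup %d, alias %d\n",
    $fd, fileno($dup), fileno($alias);
```

The dup gets a number of its own, while the alias reports the original
one. One consequence worth keeping in mind is that closing the alias
also closes the descriptor out from under the original handle.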

=head2 Dispelling the Dweomer

Perl is more of a DWIMmer language than something like Java--where DWIM
is an acronym for "do what I mean". But this principle sometimes leads
to more hidden magic than one knows what to do with. In this way, Perl
is also filled with I<dweomer>, an obscure word meaning an enchantment.
Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.

If magic C<open> is a bit too magical for you, you don't have to turn
to C<sysopen>. To open a file with arbitrary weird characters in
it, it's necessary to protect any leading and trailing whitespace.
Leading whitespace is protected by inserting a C<"./"> in front of a
filename that starts with whitespace. Trailing whitespace is protected
by appending an ASCII NUL byte (C<"\0">) at the end of the string.

    $file =~ s#^(\s)#./$1#;
    open(FH, "< $file\0") || die "can't open $file: $!";

This assumes, of course, that your system considers dot the current
working directory, slash the directory separator, and disallows ASCII
NULs within a valid filename. Most systems follow these conventions,
including all POSIX systems as well as proprietary Microsoft systems.
The only vaguely popular system that doesn't work this way is the
proprietary Macintosh system, which uses a colon where the rest of us
use a slash. Maybe C<sysopen> isn't such a bad idea after all.
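
For comparison, here is a sketch of the non-magical routes, C<sysopen>
and the 3-argument C<open> mentioned earlier, applied to a name with
blanks at both ends. The file name S<C<" report ">> and the scratch
directory are invented for the demonstration, which assumes a filesystem
that permits such names.

```perl
use strict;
use warnings;
use Fcntl;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/ report ";     # leading and trailing blanks are part of the name

# sysopen takes the path literally, blanks and all.
sysopen( my $out, $path, O_WRONLY | O_CREAT | O_EXCL )
    || die "can't create '$path': $!";
print $out "odd name, ordinary contents\n";
close($out) || die "can't close '$path': $!";

# The 2-argument open trims the trailing blank and looks for another file.
my $magic = open( my $fh2, "< $path" ) ? "opened" : "failed: $!";

# The 3-argument open leaves the name alone, much as sysopen does.
my $plain = open( my $fh3, "<", $path ) ? "opened" : "failed: $!";

print "2-argument open $magic\n";
print "3-argument open $plain\n";
```

Only the 2-argument call should fail here, because it is the only one
that edits the name before handing it to the system.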

If you want to use C<< <ARGV> >> processing in a totally boring
and non-magical way, you could do this first:

    #   "Sam sat on the ground and put his head in his hands.
    #   'I wish I had never come here, and I don't want to see
    #   no more magic,' he said, and fell silent."
    for (@ARGV) {
        s#^([^./])#./$1#;
        $_ .= "\0";
    }
    while (<>) {
        # now process $_
    }

But be warned that users will not appreciate being unable to use "-"
to mean standard input, per the standard convention.

=head2 Paths as Opens

You've probably noticed how Perl's C<warn> and C<die> functions can
produce messages like:

    Some warning at scriptname line 29, <FH> line 7.

That's because you opened a filehandle FH, and had read in seven records
from it.  But what was the name of the file, rather than the handle?

If you aren't running with C<strict refs>, or if you've turned them off
temporarily, then all you have to do is this:

    open($path, "< $path") || die "can't open $path: $!";
    while (<$path>) {
        # whatever
    }

Since you're using the pathname of the file as its handle,
you'll get warnings more like

    Some warning at scriptname line 29, </etc/motd> line 7.

=head2 Single Argument Open

Remember how we said that Perl's open took two arguments?  That was a
passive prevarication.  You see, it can also take just one argument.
If and only if the variable is a global variable, not a lexical, you
can pass C<open> just one argument, the filehandle, and it will
get the path from the global scalar variable of the same name.

    $FILE = "/etc/motd";
    open FILE or die "can't open $FILE: $!";
    while (<FILE>) {
        # whatever
    }

Why is this here?  Someone has to cater to the hysterical porpoises.
It's something that's been in Perl since the very beginning, if not
before.

=head2 Playing with STDIN and STDOUT

One clever move with STDOUT is to explicitly close it when you're done
with the program.

    END { close(STDOUT) || die "can't close stdout: $!" }

If you don't do this, and your program fills up the disk partition due
to a command line redirection, it won't report the error or exit with a
failure status.

You don't have to accept the STDIN and STDOUT you were given.  You are
welcome to reopen them if you'd like.

    open(STDIN, "< datafile")
        || die "can't open datafile: $!";

    open(STDOUT, "> output")
        || die "can't open output: $!";

And then these can be accessed directly or passed on to subprocesses.
This makes it look as though the program were initially invoked
with those redirections from the command line.

It's probably more interesting to connect these to pipes.  For example:

    $pager = $ENV{PAGER} || "(less || more)";
    open(STDOUT, "| $pager")
        || die "can't fork a pager: $!";

This makes it appear as though your program were called with its stdout
already piped into your pager.  You can also use this kind of thing
in conjunction with an implicit fork to yourself.
You might do this
if you would rather handle the post processing in your own program,
just in a different process:

    head(100);
    while (<>) {
        print;
    }

    sub head {
        my $lines = shift || 20;
        return if $pid = open(STDOUT, "|-");       # return if parent
        die "cannot fork: $!" unless defined $pid;
        while (<STDIN>) {
            last if --$lines < 0;
            print;
        }
        exit;
    }

This technique can be applied to repeatedly push as many filters on your
output stream as you wish.

=head1 Other I/O Issues

These topics aren't really arguments related to C<open> or C<sysopen>,
but they do affect what you do with your open files.

=head2 Opening Non-File Files

When is a file not a file?  Well, you could say when it exists but
isn't a plain file.   We'll check whether it's a symbolic link first,
just in case.

    if (-l $file || ! -f _) {
        print "$file is not a plain file\n";
    }

What other kinds of files are there than, well, files?  Directories,
symbolic links, named pipes, Unix-domain sockets, and block and character
devices.  Those are all files, too--just not I<plain> files.
This isn't
the same issue as being a text file.  Not all text files are plain files.
Not all plain files are text files.  That's why there are separate C<-f>
and C<-T> file tests.

To open a directory, you should use the C<opendir> function, then
process it with C<readdir>, carefully restoring the directory
name if necessary:

    opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
    while (defined($file = readdir(DIR))) {
        # do something with "$dirname/$file"
    }
    closedir(DIR);

If you want to process directories recursively, it's better to use the
File::Find module.  For example, this prints out all files recursively
and adds a slash to their names if the file is a directory.

    @ARGV = qw(.) unless @ARGV;
    use File::Find;
    find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;

This finds all bogus symbolic links beneath a particular directory:

    find sub { print "$File::Find::name\n" if -l && !-e }, $dir;

As you see, with symbolic links, you can just pretend that it is
what it points to.
Or, if you want to know I<what> it points to, then
C<readlink> is called for:

    if (-l $file) {
        if (defined($whither = readlink($file))) {
            print "$file points to $whither\n";
        } else {
            print "$file points nowhere: $!\n";
        }
    }

=head2 Opening Named Pipes

Named pipes are a different matter.  You pretend they're regular files,
but their opens will normally block until there is both a reader and
a writer.  You can read more about them in L<perlipc/"Named Pipes">.
Unix-domain sockets are rather different beasts as well; they're
described in L<perlipc/"Unix-Domain TCP Clients and Servers">.

When it comes to opening devices, it can be easy and it can be tricky.
We'll assume that if you're opening up a block device, you know what
you're doing.  The character devices are more interesting.  These are
typically used for modems, mice, and some kinds of printers.
This is
described in L<perlfaq8/"How do I read and write the serial port?">.
It's often enough to open them carefully:

    use Fcntl;
    sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
        # (O_NOCTTY no longer needed on POSIX systems)
        or die "can't open /dev/ttyS1: $!";
    open(TTYOUT, "+>&TTYIN")
        or die "can't dup TTYIN: $!";

    $ofh = select(TTYOUT); $| = 1; select($ofh);

    print TTYOUT "+++at\015";
    $answer = <TTYIN>;

With descriptors that you haven't opened using C<sysopen>, such as
sockets, you can set them to be non-blocking using C<fcntl>:

    use Fcntl;
    my $old_flags = fcntl($handle, F_GETFL, 0)
        or die "can't get flags: $!";
    fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
        or die "can't set non blocking: $!";

Rather than losing yourself in a morass of twisting, turning C<ioctl>s,
all dissimilar, if you're going to manipulate ttys, it's best to
make calls out to the stty(1) program if you have it, or else use the
portable POSIX interface.  To figure this all out, you'll need to read the
termios(3) manpage, which describes the POSIX interface to tty devices,
and then L<POSIX>, which describes Perl's interface to POSIX.  There are
also some high-level modules on CPAN that can help you with these games.
Check out Term::ReadKey and Term::ReadLine.
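
If you have no modem handy, the C<fcntl> recipe from this section can
be tried out on an ordinary pipe.  This is only a sketch under the
assumption of a Unix-like system, and the handle names are made up.
Reading from an empty pipe would normally block.  With C<O_NONBLOCK>
set, C<sysread> should instead return C<undef> right away and set
C<$!> to C<EAGAIN> (or C<EWOULDBLOCK>).

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

pipe(my $reader, my $writer) or die "can't create pipe: $!";

# Fetch the current flags, then add O_NONBLOCK to them.
my $old_flags = fcntl($reader, F_GETFL, 0);
defined $old_flags or die "can't get flags: $!";
fcntl($reader, F_SETFL, $old_flags | O_NONBLOCK)
    or die "can't set non blocking: $!";

# Nothing has been written yet, so this read has nothing to return.
my $got = sysread($reader, my $buf, 1024);
my $would_block = !defined($got) && ($!{EAGAIN} || $!{EWOULDBLOCK});
print "read would have blocked\n" if $would_block;

# Once there is data, the same call succeeds.
syswrite($writer, "ok\n") or die "can't write to pipe: $!";
my $count = sysread($reader, $buf, 1024);
defined $count or die "can't read from pipe: $!";
```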

=head2 Opening Sockets

What else can you open?  To open a connection using sockets, you won't use
one of Perl's two open functions.  See
L<perlipc/"Sockets: Client/Server Communication"> for that.  Here's an
example.  Once you have it, you can use FH as a bidirectional filehandle.

    use IO::Socket;
    local *FH = IO::Socket::INET->new("www.perl.com:80");

For opening up a URL, the LWP modules from CPAN are just what
the doctor ordered.  There's no filehandle interface, but
it's still easy to get the contents of a document:

    use LWP::Simple;
    $doc = get('http://www.linpro.no/lwp/');

=head2 Binary Files

On certain legacy systems with what could charitably be called terminally
convoluted (some would say broken) I/O models, a file isn't a file--at
least, not with respect to the C standard I/O library.  On these old
systems whose libraries (but not kernels) distinguish between text and
binary streams, to get files to behave properly you'll have to bend over
backwards to avoid nasty problems.  On such infelicitous systems, sockets
and pipes are already opened in binary mode, and there is currently no
way to turn that off.  With files, you have more options.

Another option is to use the C<binmode> function on the appropriate
handles before doing regular I/O on them:

    binmode(STDIN);
    binmode(STDOUT);
    while (<STDIN>) { print }

Passing C<sysopen> a non-standard flag option will also open the file in
binary mode on those systems that support it.  This is the equivalent of
opening the file normally, then calling C<binmode> on the handle.

    sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
        || die "can't open records.data: $!";

Now you can use C<read> and C<print> on that handle without worrying
about the non-standard system I/O library breaking your data.  It's not
a pretty picture, but then, legacy systems seldom are.  CP/M will be
with us until the end of days, and after.

On systems with exotic I/O systems, it turns out that, astonishingly
enough, even unbuffered I/O using C<sysread> and C<syswrite> might do
sneaky data mutilation behind your back.

    while (sysread(WHENCE, $buf, 1024)) {
        syswrite(WHITHER, $buf, length($buf));
    }

Depending on the vicissitudes of your runtime system, even these calls
may need C<binmode> or C<O_BINARY> first.  Systems known to be free of
such difficulties include Unix, the Mac OS, Plan 9, and Inferno.
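
If you'd like to convince yourself that C<binmode> leaves your bytes
alone, a round trip through a scratch file is a cheap experiment.  The
sketch below is an illustration only: the temporary file comes from the
core File::Temp module, and the variable names are invented.  It writes
every byte value from 0 to 255, carriage returns and control-Z included,
and reads them back for comparison.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $path) = tempfile(UNLINK => 1);
binmode($out) or die "can't binmode $path: $!";

# Every possible byte value, including the usual troublemakers.
my $data = join '', map { chr } 0 .. 255;
print {$out} $data  or die "can't write $path: $!";
close($out)         or die "can't close $path: $!";

open(my $in, "<", $path) or die "can't open $path: $!";
binmode($in)             or die "can't binmode $path: $!";
my $n = read($in, my $copy, 1024);
defined $n or die "can't read $path: $!";
close($in);

print $copy eq $data ? "bytes survived intact\n" : "bytes were mangled\n";
```

On Unix the C<binmode> calls make no difference, but they do no harm
either, and they keep the program portable to the systems that need
them.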

=head2 File Locking

In a multitasking environment, you may need to be careful not to collide
with other processes who want to do I/O on the same files as you
are working on.  You'll often need shared or exclusive locks
on files for reading and writing respectively.  You might just
pretend that only exclusive locks exist.

Never use the existence of a file C<-e $file> as a locking indication,
because there is a race condition between the test for the existence of
the file and its creation.  It's possible for another process to create
a file in the slice of time between your existence check and your attempt
to create the file.  Atomicity is critical.

Perl's most portable locking interface is via the C<flock> function,
whose simplicity is emulated on systems that don't directly support it
such as SysV or Windows.  The underlying semantics may affect how
it all works, so you should learn how C<flock> is implemented on your
system's port of Perl.

File locking I<does not> lock out another process that would like to
do I/O.  A file lock only locks out others trying to get a lock, not
processes trying to do I/O.  Because locks are advisory, if one process
uses locking and another doesn't, all bets are off.

By default, the C<flock> call will block until a lock is granted.
A request for a shared lock will be granted as soon as there is no
exclusive locker.  A request for an exclusive lock will be granted as
soon as there is no locker of any kind.  Locks are on file descriptors,
not file names.  You can't lock a file until you open it, and you can't
hold on to a lock once the file has been closed.

Here's how to get a blocking shared lock on a file, typically used
for reading:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    flock(FH, LOCK_SH)      or die "can't lock filename: $!";
    # now read from FH

You can get a non-blocking lock by using C<LOCK_NB>.

    flock(FH, LOCK_SH | LOCK_NB)
        or die "can't lock filename: $!";

This can be useful for producing more user-friendly behaviour by warning
if you're going to be blocking:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    unless (flock(FH, LOCK_SH | LOCK_NB)) {
        $| = 1;
        print "Waiting for lock...";
        flock(FH, LOCK_SH)  or die "can't lock filename: $!";
        print "got it.\n"
    }
    # now read from FH

To get an exclusive lock, typically used for writing, you have to be
careful.
We C<sysopen> the file so it can be locked before it gets
emptied.  You can get a nonblocking version using C<LOCK_EX | LOCK_NB>.

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "filename", O_WRONLY | O_CREAT)
        or die "can't open filename: $!";
    flock(FH, LOCK_EX)
        or die "can't lock filename: $!";
    truncate(FH, 0)
        or die "can't truncate filename: $!";
    # now write to FH

Finally, due to the uncounted millions who cannot be dissuaded from
wasting cycles on useless vanity devices called hit counters, here's
how to increment a number in a file safely:

    use Fcntl qw(:DEFAULT :flock);

    sysopen(FH, "numfile", O_RDWR | O_CREAT)
        or die "can't open numfile: $!";
    # autoflush FH
    $ofh = select(FH); $| = 1; select ($ofh);
    flock(FH, LOCK_EX)
        or die "can't write-lock numfile: $!";

    $num = <FH> || 0;
    seek(FH, 0, 0)
        or die "can't rewind numfile : $!";
    print FH $num+1, "\n"
        or die "can't write numfile: $!";

    truncate(FH, tell(FH))
        or die "can't truncate numfile: $!";
    close(FH)
        or die "can't close numfile: $!";

=head2 IO Layers

In Perl 5.8.0 a new I/O framework called
"PerlIO" was introduced.
This is a new "plumbing" for all the I/O happening in Perl; for the
most part everything will work just as it did, but PerlIO also brought
in some new features such as the ability to think of I/O as "layers".
One I/O layer may in addition to just moving the data also do
transformations on the data.  Such transformations may include
compression and decompression, encryption and decryption, and transforming
between various character encodings.

Full discussion about the features of PerlIO is out of scope for this
tutorial, but here is how to recognize the layers being used:

=over 4

=item *

The three-(or more)-argument form of C<open> is being used and the
second argument contains something else in addition to the usual
C<< '<' >>, C<< '>' >>, C<< '>>' >>, C<< '|' >> and their variants,
for example:

    open(my $fh, "<:utf8", $fn);

=item *

The two-argument form of C<binmode> is being used, for example

    binmode($fh, ":encoding(utf16)");

=back

For more detailed discussion about PerlIO see L<PerlIO>;
for more detailed discussion about Unicode and I/O see L<perluniintro>.
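
To watch a layer earn its keep, you can write a short string through an
encoding layer and compare what Perl sees with what lands on disk.
This is a rough sketch rather than a tour of PerlIO: the scratch file
comes from the core File::Temp module, and the sample string is made up.
The string has four characters, but one of them is outside ASCII, so
its UTF-8 form takes five bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "can't close $path: $!";

my $text = "caf\x{e9}";          # four characters, the last one non-ASCII

# The layer encodes characters to UTF-8 bytes on the way out...
open(my $out, ">:encoding(UTF-8)", $path) or die "can't open $path: $!";
print {$out} $text  or die "can't write $path: $!";
close($out)         or die "can't close $path: $!";

my $bytes = -s $path;            # size on disk, in bytes

# ...and decodes bytes back into characters on the way in.
open(my $in, "<:encoding(UTF-8)", $path) or die "can't open $path: $!";
my $back = <$in>;
close($in);

printf "%d characters, %d bytes on disk\n", length($back), $bytes;
```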

=head1 SEE ALSO

The C<open> and C<sysopen> functions in perlfunc(1);
the system open(2), dup(2), fopen(3), and fdopen(3) manpages;
the POSIX documentation.

=head1 AUTHOR and COPYRIGHT

Copyright 1998 Tom Christiansen.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit.  A simple comment in the code giving credit would be
courteous but is not required.

=head1 HISTORY

First release: Sat Jan  9 08:09:11 MST 1999