1=head1 NAME 2 3perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores) 4 5=head1 DESCRIPTION 6 7The basic IPC facilities of Perl are built out of the good old Unix 8signals, named pipes, pipe opens, the Berkeley socket routines, and SysV 9IPC calls. Each is used in slightly different situations. 10 11=head1 Signals 12 13Perl uses a simple signal handling model: the %SIG hash contains names 14or references of user-installed signal handlers. These handlers will 15be called with an argument which is the name of the signal that 16triggered it. A signal may be generated intentionally from a 17particular keyboard sequence like control-C or control-Z, sent to you 18from another process, or triggered automatically by the kernel when 19special events transpire, like a child process exiting, your process 20running out of stack space, or hitting file size limit. 21 22For example, to trap an interrupt signal, set up a handler like this: 23 24 sub catch_zap { 25 my $signame = shift; 26 $shucks++; 27 die "Somebody sent me a SIG$signame"; 28 } 29 $SIG{INT} = 'catch_zap'; # could fail in modules 30 $SIG{INT} = \&catch_zap; # best strategy 31 32Prior to Perl 5.7.3 it was necessary to do as little as you possibly 33could in your handler; notice how all we do is set a global variable 34and then raise an exception. That's because on most systems, 35libraries are not re-entrant; particularly, memory allocation and I/O 36routines are not. That meant that doing nearly I<anything> in your 37handler could in theory trigger a memory fault and subsequent core 38dump - see L</Deferred Signals (Safe Signals)> below. 39 40The names of the signals are the ones listed out by C<kill -l> on your 41system, or you can retrieve them from the Config module. Set up an 42@signame list indexed by number to get the name and a %signo table 43indexed by name to get the number: 44 45 use Config; 46 defined $Config{sig_name} || die "No sigs?"; 47 foreach $name (split(' ', $Config{sig_name})) { 48 $signo{$name} = $i; 49 $signame[$i] = $name; 50 $i++; 51 } 52 53So to check whether signal 17 and SIGALRM were the same, do just this: 54 55 print "signal #17 = $signame[17]\n"; 56 if ($signo{ALRM}) { 57 print "SIGALRM is $signo{ALRM}\n"; 58 } 59 60You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as 61the handler, in which case Perl will try to discard the signal or do the 62default thing. 63 64On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal 65has special behavior with respect to a value of C<'IGNORE'>. 66Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of 67not creating zombie processes when the parent process fails to C<wait()> 68on its child processes (i.e. child processes are automatically reaped). 69Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns 70C<-1> on such platforms. 71 72Some signals can be neither trapped nor ignored, such as 73the KILL and STOP (but not the TSTP) signals. One strategy for 74temporarily ignoring signals is to use a local() statement, which will be 75automatically restored once your block is exited. (Remember that local() 76values are "inherited" by functions called from within that block.) 77 78 sub precious { 79 local $SIG{INT} = 'IGNORE'; 80 &more_functions; 81 } 82 sub more_functions { 83 # interrupts still ignored, for now... 84 } 85 86Sending a signal to a negative process ID means that you send the signal 87to the entire Unix process-group. This code sends a hang-up signal to all 88processes in the current process group (and sets $SIG{HUP} to IGNORE so 89it doesn't kill itself): 90 91 { 92 local $SIG{HUP} = 'IGNORE'; 93 kill HUP => -$$; 94 # snazzy writing of: kill('HUP', -$$) 95 } 96 97Another interesting signal to send is signal number zero. This doesn't 98actually affect a child process, but instead checks whether it's alive 99or has changed its UID. 100 101 unless (kill 0 => $kid_pid) { 102 warn "something wicked happened to $kid_pid"; 103 } 104 105When directed at a process whose UID is not identical to that 106of the sending process, signal number zero may fail because 107you lack permission to send the signal, even though the process is alive. 108You may be able to determine the cause of failure using C<%!>. 109 110 unless (kill 0 => $pid or $!{EPERM}) { 111 warn "$pid looks dead"; 112 } 113 114You might also want to employ anonymous functions for simple signal 115handlers: 116 117 $SIG{INT} = sub { die "\nOutta here!\n" }; 118 119But that will be problematic for the more complicated handlers that need 120to reinstall themselves. Because Perl's signal mechanism is currently 121based on the signal(3) function from the C library, you may sometimes be so 122misfortunate as to run on systems where that function is "broken", that 123is, it behaves in the old unreliable SysV way rather than the newer, more 124reasonable BSD and POSIX fashion. So you'll see defensive people writing 125signal handlers like this: 126 127 sub REAPER { 128 $waitedpid = wait; 129 # loathe sysV: it makes us not only reinstate 130 # the handler, but place it after the wait 131 $SIG{CHLD} = \&REAPER; 132 } 133 $SIG{CHLD} = \&REAPER; 134 # now do something that forks... 135 136or better still: 137 138 use POSIX ":sys_wait_h"; 139 sub REAPER { 140 my $child; 141 # If a second child dies while in the signal handler caused by the 142 # first death, we won't get another signal. So must loop here else 143 # we will leave the unreaped child as a zombie. And the next time 144 # two children die we get another zombie. And so on. 145 while (($child = waitpid(-1,WNOHANG)) > 0) { 146 $Kid_Status{$child} = $?; 147 } 148 $SIG{CHLD} = \&REAPER; # still loathe sysV 149 } 150 $SIG{CHLD} = \&REAPER; 151 # do something that forks... 152 153Signal handling is also used for timeouts in Unix, While safely 154protected within an C<eval{}> block, you set a signal handler to trap 155alarm signals and then schedule to have one delivered to you in some 156number of seconds. Then try your blocking operation, clearing the alarm 157when it's done but not before you've exited your C<eval{}> block. If it 158goes off, you'll use die() to jump out of the block, much as you might 159using longjmp() or throw() in other languages. 160 161Here's an example: 162 163 eval { 164 local $SIG{ALRM} = sub { die "alarm clock restart" }; 165 alarm 10; 166 flock(FH, 2); # blocking write lock 167 alarm 0; 168 }; 169 if ($@ and $@ !~ /alarm clock restart/) { die } 170 171If the operation being timed out is system() or qx(), this technique 172is liable to generate zombies. If this matters to you, you'll 173need to do your own fork() and exec(), and kill the errant child process. 174 175For more complex signal handling, you might see the standard POSIX 176module. Lamentably, this is almost entirely undocumented, but 177the F<t/lib/posix.t> file from the Perl source distribution has some 178examples in it. 179 180=head2 Handling the SIGHUP Signal in Daemons 181 182A process that usually starts when the system boots and shuts down 183when the system is shut down is called a daemon (Disk And Execution 184MONitor). If a daemon process has a configuration file which is 185modified after the process has been started, there should be a way to 186tell that process to re-read its configuration file, without stopping 187the process. Many daemons provide this mechanism using the C<SIGHUP> 188signal handler. When you want to tell the daemon to re-read the file 189you simply send it the C<SIGHUP> signal. 190 191Not all platforms automatically reinstall their (native) signal 192handlers after a signal delivery. This means that the handler works 193only the first time the signal is sent. The solution to this problem 194is to use C<POSIX> signal handlers if available, their behaviour 195is well-defined. 196 197The following example implements a simple daemon, which restarts 198itself every time the C<SIGHUP> signal is received. The actual code is 199located in the subroutine C<code()>, which simply prints some debug 200info to show that it works and should be replaced with the real code. 201 202 #!/usr/bin/perl -w 203 204 use POSIX (); 205 use FindBin (); 206 use File::Basename (); 207 use File::Spec::Functions; 208 209 $|=1; 210 211 # make the daemon cross-platform, so exec always calls the script 212 # itself with the right path, no matter how the script was invoked. 213 my $script = File::Basename::basename($0); 214 my $SELF = catfile $FindBin::Bin, $script; 215 216 # POSIX unmasks the sigprocmask properly 217 my $sigset = POSIX::SigSet->new(); 218 my $action = POSIX::SigAction->new('sigHUP_handler', 219 $sigset, 220 &POSIX::SA_NODEFER); 221 POSIX::sigaction(&POSIX::SIGHUP, $action); 222 223 sub sigHUP_handler { 224 print "got SIGHUP\n"; 225 exec($SELF, @ARGV) or die "Couldn't restart: $!\n"; 226 } 227 228 code(); 229 230 sub code { 231 print "PID: $$\n"; 232 print "ARGV: @ARGV\n"; 233 my $c = 0; 234 while (++$c) { 235 sleep 2; 236 print "$c\n"; 237 } 238 } 239 __END__ 240 241 242=head1 Named Pipes 243 244A named pipe (often referred to as a FIFO) is an old Unix IPC 245mechanism for processes communicating on the same machine. It works 246just like a regular, connected anonymous pipes, except that the 247processes rendezvous using a filename and don't have to be related. 248 249To create a named pipe, use the Unix command mknod(1) or on some 250systems, mkfifo(1). These may not be in your normal path. 251 252 # system return val is backwards, so && not || 253 # 254 $ENV{PATH} .= ":/etc:/usr/etc"; 255 if ( system('mknod', $path, 'p') 256 && system('mkfifo', $path) ) 257 { 258 die "mk{nod,fifo} $path failed"; 259 } 260 261 262A fifo is convenient when you want to connect a process to an unrelated 263one. When you open a fifo, the program will block until there's something 264on the other end. 265 266For example, let's say you'd like to have your F<.signature> file be a 267named pipe that has a Perl program on the other end. Now every time any 268program (like a mailer, news reader, finger program, etc.) tries to read 269from that file, the reading program will block and your program will 270supply the new signature. We'll use the pipe-checking file test B<-p> 271to find out whether anyone (or anything) has accidentally removed our fifo. 272 273 chdir; # go home 274 $FIFO = '.signature'; 275 $ENV{PATH} .= ":/etc:/usr/games"; 276 277 while (1) { 278 unless (-p $FIFO) { 279 unlink $FIFO; 280 system('mknod', $FIFO, 'p') 281 && die "can't mknod $FIFO: $!"; 282 } 283 284 # next line blocks until there's a reader 285 open (FIFO, "> $FIFO") || die "can't write $FIFO: $!"; 286 print FIFO "John Smith (smith\@host.org)\n", `fortune -s`; 287 close FIFO; 288 sleep 2; # to avoid dup signals 289 } 290 291=head2 Deferred Signals (Safe Signals) 292 293In Perls before Perl 5.7.3 by installing Perl code to deal with 294signals, you were exposing yourself to danger from two things. First, 295few system library functions are re-entrant. If the signal interrupts 296while Perl is executing one function (like malloc(3) or printf(3)), 297and your signal handler then calls the same function again, you could 298get unpredictable behavior--often, a core dump. Second, Perl isn't 299itself re-entrant at the lowest levels. If the signal interrupts Perl 300while Perl is changing its own internal data structures, similarly 301unpredictable behaviour may result. 302 303There were two things you could do, knowing this: be paranoid or be 304pragmatic. The paranoid approach was to do as little as possible in your 305signal handler. Set an existing integer variable that already has a 306value, and return. This doesn't help you if you're in a slow system call, 307which will just restart. That means you have to C<die> to longjump(3) out 308of the handler. Even this is a little cavalier for the true paranoiac, 309who avoids C<die> in a handler because the system I<is> out to get you. 310The pragmatic approach was to say ``I know the risks, but prefer the 311convenience'', and to do anything you wanted in your signal handler, 312and be prepared to clean up core dumps now and again. 313 314In Perl 5.7.3 and later to avoid these problems signals are 315"deferred"-- that is when the signal is delivered to the process by 316the system (to the C code that implements Perl) a flag is set, and the 317handler returns immediately. Then at strategic "safe" points in the 318Perl interpreter (e.g. when it is about to execute a new opcode) the 319flags are checked and the Perl level handler from %SIG is 320executed. The "deferred" scheme allows much more flexibility in the 321coding of signal handler as we know Perl interpreter is in a safe 322state, and that we are not in a system library function when the 323handler is called. However the implementation does differ from 324previous Perls in the following ways: 325 326=over 4 327 328=item Long running opcodes 329 330As Perl interpreter only looks at the signal flags when it about to 331execute a new opcode if a signal arrives during a long running opcode 332(e.g. a regular expression operation on a very large string) then 333signal will not be seen until operation completes. 334 335=item Interrupting IO 336 337When a signal is delivered (e.g. INT control-C) the operating system 338breaks into IO operations like C<read> (used to implement Perls 339E<lt>E<gt> operator). On older Perls the handler was called 340immediately (and as C<read> is not "unsafe" this worked well). With 341the "deferred" scheme the handler is not called immediately, and if 342Perl is using system's C<stdio> library that library may re-start the 343C<read> without returning to Perl and giving it a chance to call the 344%SIG handler. If this happens on your system the solution is to use 345C<:perlio> layer to do IO - at least on those handles which you want 346to be able to break into with signals. (The C<:perlio> layer checks 347the signal flags and calls %SIG handlers before resuming IO operation.) 348 349Note that the default in Perl 5.7.3 and later is to automatically use 350the C<:perlio> layer. 351 352Note that some networking library functions like gethostbyname() are 353known to have their own implementations of timeouts which may conflict 354with your timeouts. If you are having problems with such functions, 355you can try using the POSIX sigaction() function, which bypasses the 356Perl safe signals (note that this means subjecting yourself to 357possible memory corruption, as described above). Instead of setting 358C<$SIG{ALRM}> try something like the following: 359 360 use POSIX; 361 sigaction SIGALRM, new POSIX::SigAction sub { die "alarm\n" } 362 or die "Error setting SIGALRM handler: $!\n"; 363 364=item Restartable system calls 365 366On systems that supported it, older versions of Perl used the 367SA_RESTART flag when installing %SIG handlers. This meant that 368restartable system calls would continue rather than returning when 369a signal arrived. In order to deliver deferred signals promptly, 370Perl 5.7.3 and later do I<not> use SA_RESTART. Consequently, 371restartable system calls can fail (with $! set to C<EINTR>) in places 372where they previously would have succeeded. 373 374Note that the default C<:perlio> layer will retry C<read>, C<write> 375and C<close> as described above and that interrupted C<wait> and 376C<waitpid> calls will always be retried. 377 378=item Signals as "faults" 379 380Certain signals e.g. SEGV, ILL, BUS are generated as a result of 381virtual memory or other "faults". These are normally fatal and there 382is little a Perl-level handler can do with them. (In particular the 383old signal scheme was particularly unsafe in such cases.) However if 384a %SIG handler is set the new scheme simply sets a flag and returns as 385described above. This may cause the operating system to try the 386offending machine instruction again and - as nothing has changed - it 387will generate the signal again. The result of this is a rather odd 388"loop". In future Perl's signal mechanism may be changed to avoid this 389- perhaps by simply disallowing %SIG handlers on signals of that 390type. Until then the work-round is not to set a %SIG handler on those 391signals. (Which signals they are is operating system dependant.) 392 393=item Signals triggered by operating system state 394 395On some operating systems certain signal handlers are supposed to "do 396something" before returning. One example can be CHLD or CLD which 397indicates a child process has completed. On some operating systems the 398signal handler is expected to C<wait> for the completed child 399process. On such systems the deferred signal scheme will not work for 400those signals (it does not do the C<wait>). Again the failure will 401look like a loop as the operating system will re-issue the signal as 402there are un-waited-for completed child processes. 403 404=back 405 406If you want the old signal behaviour back regardless of possible 407memory corruption, set the environment variable C<PERL_SIGNALS> to 408C<"unsafe"> (a new feature since Perl 5.8.1). 409 410=head1 Using open() for IPC 411 412Perl's basic open() statement can also be used for unidirectional 413interprocess communication by either appending or prepending a pipe 414symbol to the second argument to open(). Here's how to start 415something up in a child process you intend to write to: 416 417 open(SPOOLER, "| cat -v | lpr -h 2>/dev/null") 418 || die "can't fork: $!"; 419 local $SIG{PIPE} = sub { die "spooler pipe broke" }; 420 print SPOOLER "stuff\n"; 421 close SPOOLER || die "bad spool: $! $?"; 422 423And here's how to start up a child process you intend to read from: 424 425 open(STATUS, "netstat -an 2>&1 |") 426 || die "can't fork: $!"; 427 while (<STATUS>) { 428 next if /^(tcp|udp)/; 429 print; 430 } 431 close STATUS || die "bad netstat: $! $?"; 432 433If one can be sure that a particular program is a Perl script that is 434expecting filenames in @ARGV, the clever programmer can write something 435like this: 436 437 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile 438 439and irrespective of which shell it's called from, the Perl program will 440read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile> 441in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3> 442file. Pretty nifty, eh? 443 444You might notice that you could use backticks for much the 445same effect as opening a pipe for reading: 446 447 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`; 448 die "bad netstat" if $?; 449 450While this is true on the surface, it's much more efficient to process the 451file one line or record at a time because then you don't have to read the 452whole thing into memory at once. It also gives you finer control of the 453whole process, letting you to kill off the child process early if you'd 454like. 455 456Be careful to check both the open() and the close() return values. If 457you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise, 458think of what happens when you start up a pipe to a command that doesn't 459exist: the open() will in all likelihood succeed (it only reflects the 460fork()'s success), but then your output will fail--spectacularly. Perl 461can't know whether the command worked because your command is actually 462running in a separate process whose exec() might have failed. Therefore, 463while readers of bogus commands return just a quick end of file, writers 464to bogus command will trigger a signal they'd better be prepared to 465handle. Consider: 466 467 open(FH, "|bogus") or die "can't fork: $!"; 468 print FH "bang\n" or die "can't write: $!"; 469 close FH or die "can't close: $!"; 470 471That won't blow up until the close, and it will blow up with a SIGPIPE. 472To catch it, you could use this: 473 474 $SIG{PIPE} = 'IGNORE'; 475 open(FH, "|bogus") or die "can't fork: $!"; 476 print FH "bang\n" or die "can't write: $!"; 477 close FH or die "can't close: status=$?"; 478 479=head2 Filehandles 480 481Both the main process and any child processes it forks share the same 482STDIN, STDOUT, and STDERR filehandles. If both processes try to access 483them at once, strange things can happen. You may also want to close 484or reopen the filehandles for the child. You can get around this by 485opening your pipe with open(), but on some systems this means that the 486child process cannot outlive the parent. 487 488=head2 Background Processes 489 490You can run a command in the background with: 491 492 system("cmd &"); 493 494The command's STDOUT and STDERR (and possibly STDIN, depending on your 495shell) will be the same as the parent's. You won't need to catch 496SIGCHLD because of the double-fork taking place (see below for more 497details). 498 499=head2 Complete Dissociation of Child from Parent 500 501In some cases (starting server processes, for instance) you'll want to 502completely dissociate the child process from the parent. This is 503often called daemonization. A well behaved daemon will also chdir() 504to the root directory (so it doesn't prevent unmounting the filesystem 505containing the directory from which it was launched) and redirect its 506standard file descriptors from and to F</dev/null> (so that random 507output doesn't wind up on the user's terminal). 508 509 use POSIX 'setsid'; 510 511 sub daemonize { 512 chdir '/' or die "Can't chdir to /: $!"; 513 open STDIN, '/dev/null' or die "Can't read /dev/null: $!"; 514 open STDOUT, '>/dev/null' 515 or die "Can't write to /dev/null: $!"; 516 defined(my $pid = fork) or die "Can't fork: $!"; 517 exit if $pid; 518 setsid or die "Can't start a new session: $!"; 519 open STDERR, '>&STDOUT' or die "Can't dup stdout: $!"; 520 } 521 522The fork() has to come before the setsid() to ensure that you aren't a 523process group leader (the setsid() will fail if you are). If your 524system doesn't have the setsid() function, open F</dev/tty> and use the 525C<TIOCNOTTY> ioctl() on it instead. See L<tty(4)> for details. 526 527Non-Unix users should check their Your_OS::Process module for other 528solutions. 529 530=head2 Safe Pipe Opens 531 532Another interesting approach to IPC is making your single program go 533multiprocess and communicate between (or even amongst) yourselves. The 534open() function will accept a file argument of either C<"-|"> or C<"|-"> 535to do a very interesting thing: it forks a child connected to the 536filehandle you've opened. The child is running the same program as the 537parent. This is useful for safely opening a file when running under an 538assumed UID or GID, for example. If you open a pipe I<to> minus, you can 539write to the filehandle you opened and your kid will find it in his 540STDIN. If you open a pipe I<from> minus, you can read from the filehandle 541you opened whatever your kid writes to his STDOUT. 542 543 use English '-no_match_vars'; 544 my $sleep_count = 0; 545 546 do { 547 $pid = open(KID_TO_WRITE, "|-"); 548 unless (defined $pid) { 549 warn "cannot fork: $!"; 550 die "bailing out" if $sleep_count++ > 6; 551 sleep 10; 552 } 553 } until defined $pid; 554 555 if ($pid) { # parent 556 print KID_TO_WRITE @some_data; 557 close(KID_TO_WRITE) || warn "kid exited $?"; 558 } else { # child 559 ($EUID, $EGID) = ($UID, $GID); # suid progs only 560 open (FILE, "> /safe/file") 561 || die "can't open /safe/file: $!"; 562 while (<STDIN>) { 563 print FILE; # child's STDIN is parent's KID 564 } 565 exit; # don't forget this 566 } 567 568Another common use for this construct is when you need to execute 569something without the shell's interference. With system(), it's 570straightforward, but you can't use a pipe open or backticks safely. 571That's because there's no way to stop the shell from getting its hands on 572your arguments. Instead, use lower-level control to call exec() directly. 573 574Here's a safe backtick or pipe open for read: 575 576 # add error processing as above 577 $pid = open(KID_TO_READ, "-|"); 578 579 if ($pid) { # parent 580 while (<KID_TO_READ>) { 581 # do something interesting 582 } 583 close(KID_TO_READ) || warn "kid exited $?"; 584 585 } else { # child 586 ($EUID, $EGID) = ($UID, $GID); # suid only 587 exec($program, @options, @args) 588 || die "can't exec program: $!"; 589 # NOTREACHED 590 } 591 592 593And here's a safe pipe open for writing: 594 595 # add error processing as above 596 $pid = open(KID_TO_WRITE, "|-"); 597 $SIG{PIPE} = sub { die "whoops, $program pipe broke" }; 598 599 if ($pid) { # parent 600 for (@data) { 601 print KID_TO_WRITE; 602 } 603 close(KID_TO_WRITE) || warn "kid exited $?"; 604 605 } else { # child 606 ($EUID, $EGID) = ($UID, $GID); 607 exec($program, @options, @args) 608 || die "can't exec program: $!"; 609 # NOTREACHED 610 } 611 612Since Perl 5.8.0, you can also use the list form of C<open> for pipes : 613the syntax 614 615 open KID_PS, "-|", "ps", "aux" or die $!; 616 617forks the ps(1) command (without spawning a shell, as there are more than 618three arguments to open()), and reads its standard output via the 619C<KID_PS> filehandle. The corresponding syntax to read from command 620pipes (with C<"|-"> in place of C<"-|">) is also implemented. 621 622Note that these operations are full Unix forks, which means they may not be 623correctly implemented on alien systems. Additionally, these are not true 624multithreading. If you'd like to learn more about threading, see the 625F<modules> file mentioned below in the SEE ALSO section. 626 627=head2 Bidirectional Communication with Another Process 628 629While this works reasonably well for unidirectional communication, what 630about bidirectional communication? The obvious thing you'd like to do 631doesn't actually work: 632 633 open(PROG_FOR_READING_AND_WRITING, "| some program |") 634 635and if you forget to use the C<use warnings> pragma or the B<-w> flag, 636then you'll miss out entirely on the diagnostic message: 637 638 Can't do bidirectional pipe at -e line 1. 639 640If you really want to, you can use the standard open2() library function 641to catch both ends. There's also an open3() for tridirectional I/O so you 642can also catch your child's STDERR, but doing so would then require an 643awkward select() loop and wouldn't allow you to use normal Perl input 644operations. 645 646If you look at its source, you'll see that open2() uses low-level 647primitives like Unix pipe() and exec() calls to create all the connections. 648While it might have been slightly more efficient by using socketpair(), it 649would have then been even less portable than it already is. The open2() 650and open3() functions are unlikely to work anywhere except on a Unix 651system or some other one purporting to be POSIX compliant. 652 653Here's an example of using open2(): 654 655 use FileHandle; 656 use IPC::Open2; 657 $pid = open2(*Reader, *Writer, "cat -u -n" ); 658 print Writer "stuff\n"; 659 $got = <Reader>; 660 661The problem with this is that Unix buffering is really going to 662ruin your day. Even though your C<Writer> filehandle is auto-flushed, 663and the process on the other end will get your data in a timely manner, 664you can't usually do anything to force it to give it back to you 665in a similarly quick fashion. In this case, we could, because we 666gave I<cat> a B<-u> flag to make it unbuffered. But very few Unix 667commands are designed to operate over pipes, so this seldom works 668unless you yourself wrote the program on the other end of the 669double-ended pipe. 670 671A solution to this is the nonstandard F<Comm.pl> library. It uses 672pseudo-ttys to make your program behave more reasonably: 673 674 require 'Comm.pl'; 675 $ph = open_proc('cat -n'); 676 for (1..10) { 677 print $ph "a line\n"; 678 print "got back ", scalar <$ph>; 679 } 680 681This way you don't have to have control over the source code of the 682program you're using. The F<Comm> library also has expect() 683and interact() functions. Find the library (and we hope its 684successor F<IPC::Chat>) at your nearest CPAN archive as detailed 685in the SEE ALSO section below. 686 687The newer Expect.pm module from CPAN also addresses this kind of thing. 688This module requires two other modules from CPAN: IO::Pty and IO::Stty. 689It sets up a pseudo-terminal to interact with programs that insist on 690using talking to the terminal device driver. If your system is 691amongst those supported, this may be your best bet. 692 693=head2 Bidirectional Communication with Yourself 694 695If you want, you may make low-level pipe() and fork() 696to stitch this together by hand. This example only 697talks to itself, but you could reopen the appropriate 698handles to STDIN and STDOUT and call other processes. 699 700 #!/usr/bin/perl -w 701 # pipe1 - bidirectional communication using two pipe pairs 702 # designed for the socketpair-challenged 703 use IO::Handle; # thousands of lines just for autoflush :-( 704 pipe(PARENT_RDR, CHILD_WTR); # XXX: failure? 705 pipe(CHILD_RDR, PARENT_WTR); # XXX: failure? 706 CHILD_WTR->autoflush(1); 707 PARENT_WTR->autoflush(1); 708 709 if ($pid = fork) { 710 close PARENT_RDR; close PARENT_WTR; 711 print CHILD_WTR "Parent Pid $$ is sending this\n"; 712 chomp($line = <CHILD_RDR>); 713 print "Parent Pid $$ just read this: `$line'\n"; 714 close CHILD_RDR; close CHILD_WTR; 715 waitpid($pid,0); 716 } else { 717 die "cannot fork: $!" unless defined $pid; 718 close CHILD_RDR; close CHILD_WTR; 719 chomp($line = <PARENT_RDR>); 720 print "Child Pid $$ just read this: `$line'\n"; 721 print PARENT_WTR "Child Pid $$ is sending this\n"; 722 close PARENT_RDR; close PARENT_WTR; 723 exit; 724 } 725 726But you don't actually have to make two pipe calls. If you 727have the socketpair() system call, it will do this all for you. 728 729 #!/usr/bin/perl -w 730 # pipe2 - bidirectional communication using socketpair 731 # "the best ones always go both ways" 732 733 use Socket; 734 use IO::Handle; # thousands of lines just for autoflush :-( 735 # We say AF_UNIX because although *_LOCAL is the 736 # POSIX 1003.1g form of the constant, many machines 737 # still don't have it. 738 socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC) 739 or die "socketpair: $!"; 740 741 CHILD->autoflush(1); 742 PARENT->autoflush(1); 743 744 if ($pid = fork) { 745 close PARENT; 746 print CHILD "Parent Pid $$ is sending this\n"; 747 chomp($line = <CHILD>); 748 print "Parent Pid $$ just read this: `$line'\n"; 749 close CHILD; 750 waitpid($pid,0); 751 } else { 752 die "cannot fork: $!" unless defined $pid; 753 close CHILD; 754 chomp($line = <PARENT>); 755 print "Child Pid $$ just read this: `$line'\n"; 756 print PARENT "Child Pid $$ is sending this\n"; 757 close PARENT; 758 exit; 759 } 760 761=head1 Sockets: Client/Server Communication 762 763While not limited to Unix-derived operating systems (e.g., WinSock on PCs 764provides socket support, as do some VMS libraries), you may not have 765sockets on your system, in which case this section probably isn't going to do 766you much good. With sockets, you can do both virtual circuits (i.e., TCP 767streams) and datagrams (i.e., UDP packets). You may be able to do even more 768depending on your system. 769 770The Perl function calls for dealing with sockets have the same names as 771the corresponding system calls in C, but their arguments tend to differ 772for two reasons: first, Perl filehandles work differently than C file 773descriptors. Second, Perl already knows the length of its strings, so you 774don't need to pass that information. 775 776One of the major problems with old socket code in Perl was that it used 777hard-coded values for some of the constants, which severely hurt 778portability. If you ever see code that does anything like explicitly 779setting C<$AF_INET = 2>, you know you're in for big trouble: An 780immeasurably superior approach is to use the C<Socket> module, which more 781reliably grants access to various constants and functions you'll need. 782 783If you're not writing a server/client for an existing protocol like 784NNTP or SMTP, you should give some thought to how your server will 785know when the client has finished talking, and vice-versa. Most 786protocols are based on one-line messages and responses (so one party 787knows the other has finished when a "\n" is received) or multi-line 788messages and responses that end with a period on an empty line 789("\n.\n" terminates a message/response). 790 791=head2 Internet Line Terminators 792 793The Internet line terminator is "\015\012". Under ASCII variants of 794Unix, that could usually be written as "\r\n", but under other systems, 795"\r\n" might at times be "\015\015\012", "\012\012\015", or something 796completely different. The standards specify writing "\015\012" to be 797conformant (be strict in what you provide), but they also recommend 798accepting a lone "\012" on input (but be lenient in what you require). 799We haven't always been very good about that in the code in this manpage, 800but unless you're on a Mac, you'll probably be ok. 801 802=head2 Internet TCP Clients and Servers 803 804Use Internet-domain sockets when you want to do client-server 805communication that might extend to machines outside of your own system. 806 807Here's a sample TCP client using Internet-domain sockets: 808 809 #!/usr/bin/perl -w 810 use strict; 811 use Socket; 812 my ($remote,$port, $iaddr, $paddr, $proto, $line); 813 814 $remote = shift || 'localhost'; 815 $port = shift || 2345; # random port 816 if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') } 817 die "No port" unless $port; 818 $iaddr = inet_aton($remote) || die "no host: $remote"; 819 $paddr = sockaddr_in($port, $iaddr); 820 821 $proto = getprotobyname('tcp'); 822 socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 823 connect(SOCK, $paddr) || die "connect: $!"; 824 while (defined($line = <SOCK>)) { 825 print $line; 826 } 827 828 close (SOCK) || die "close: $!"; 829 exit; 830 831And here's a corresponding server to go along with it. We'll 832leave the address as INADDR_ANY so that the kernel can choose 833the appropriate interface on multihomed hosts. If you want sit 834on a particular interface (like the external side of a gateway 835or firewall machine), you should fill this in with your real address 836instead. 837 838 #!/usr/bin/perl -Tw 839 use strict; 840 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' } 841 use Socket; 842 use Carp; 843 my $EOL = "\015\012"; 844 845 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" } 846 847 my $port = shift || 2345; 848 my $proto = getprotobyname('tcp'); 849 850 ($port) = $port =~ /^(\d+)$/ or die "invalid port"; 851 852 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 853 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, 854 pack("l", 1)) || die "setsockopt: $!"; 855 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; 856 listen(Server,SOMAXCONN) || die "listen: $!"; 857 858 logmsg "server started on port $port"; 859 860 my $paddr; 861 862 $SIG{CHLD} = \&REAPER; 863 864 for ( ; $paddr = accept(Client,Server); close Client) { 865 my($port,$iaddr) = sockaddr_in($paddr); 866 my $name = gethostbyaddr($iaddr,AF_INET); 867 868 logmsg "connection from $name [", 869 inet_ntoa($iaddr), "] 870 at port $port"; 871 872 print Client "Hello there, $name, it's now ", 873 scalar localtime, $EOL; 874 } 875 876And here's a multithreaded version. It's multithreaded in that 877like most typical servers, it spawns (forks) a slave server to 878handle the client request so that the master server can quickly 879go back to service a new client. 880 881 #!/usr/bin/perl -Tw 882 use strict; 883 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' } 884 use Socket; 885 use Carp; 886 my $EOL = "\015\012"; 887 888 sub spawn; # forward declaration 889 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" } 890 891 my $port = shift || 2345; 892 my $proto = getprotobyname('tcp'); 893 894 ($port) = $port =~ /^(\d+)$/ or die "invalid port"; 895 896 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 897 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, 898 pack("l", 1)) || die "setsockopt: $!"; 899 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; 900 listen(Server,SOMAXCONN) || die "listen: $!"; 901 902 logmsg "server started on port $port"; 903 904 my $waitedpid = 0; 905 my $paddr; 906 907 use POSIX ":sys_wait_h"; 908 sub REAPER { 909 my $child; 910 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) { 911 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : ''); 912 } 913 $SIG{CHLD} = \&REAPER; # loathe sysV 914 } 915 916 $SIG{CHLD} = \&REAPER; 917 918 for ( $waitedpid = 0; 919 ($paddr = accept(Client,Server)) || $waitedpid; 920 $waitedpid = 0, close Client) 921 { 922 next if $waitedpid and not $paddr; 923 my($port,$iaddr) = sockaddr_in($paddr); 924 my $name = gethostbyaddr($iaddr,AF_INET); 925 926 logmsg "connection from $name [", 927 inet_ntoa($iaddr), "] 928 at port $port"; 929 930 spawn sub { 931 $|=1; 932 print "Hello there, $name, it's now ", scalar localtime, $EOL; 933 exec '/usr/games/fortune' # XXX: `wrong' line terminators 934 or confess "can't exec fortune: $!"; 935 }; 936 937 } 938 939 sub spawn { 940 my $coderef = shift; 941 942 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') { 943 confess "usage: spawn CODEREF"; 944 } 945 946 my $pid; 947 if (!defined($pid = fork)) { 948 logmsg "cannot fork: $!"; 949 return; 950 } elsif ($pid) { 951 logmsg "begat $pid"; 952 return; # I'm the parent 953 } 954 # else I'm the child -- go spawn 955 956 open(STDIN, "<&Client") || die "can't dup client to stdin"; 957 open(STDOUT, ">&Client") || die "can't dup client to stdout"; 958 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr"; 959 exit &$coderef(); 960 } 961 962This server takes the trouble to clone off a child version via fork() for 963each incoming request. That way it can handle many requests at once, 964which you might not always want. Even if you don't fork(), the listen() 965will allow that many pending connections. Forking servers have to be 966particularly careful about cleaning up their dead children (called 967"zombies" in Unix parlance), because otherwise you'll quickly fill up your 968process table. 969 970We suggest that you use the B<-T> flag to use taint checking (see L<perlsec>) 971even if we aren't running setuid or setgid. This is always a good idea 972for servers and other programs run on behalf of someone else (like CGI 973scripts), because it lessens the chances that people from the outside will 974be able to compromise your system. 975 976Let's look at another TCP client. This one connects to the TCP "time" 977service on a number of different machines and shows how far their clocks 978differ from the system on which it's being run: 979 980 #!/usr/bin/perl -w 981 use strict; 982 use Socket; 983 984 my $SECS_of_70_YEARS = 2208988800; 985 sub ctime { scalar localtime(shift) } 986 987 my $iaddr = gethostbyname('localhost'); 988 my $proto = getprotobyname('tcp'); 989 my $port = getservbyname('time', 'tcp'); 990 my $paddr = sockaddr_in(0, $iaddr); 991 my($host); 992 993 $| = 1; 994 printf "%-24s %8s %s\n", "localhost", 0, ctime(time()); 995 996 foreach $host (@ARGV) { 997 printf "%-24s ", $host; 998 my $hisiaddr = inet_aton($host) || die "unknown host"; 999 my $hispaddr = sockaddr_in($port, $hisiaddr); 1000 socket(SOCKET, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 1001 connect(SOCKET, $hispaddr) || die "bind: $!"; 1002 my $rtime = ' '; 1003 read(SOCKET, $rtime, 4); 1004 close(SOCKET); 1005 my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ; 1006 printf "%8d %s\n", $histime - time, ctime($histime); 1007 } 1008 1009=head2 Unix-Domain TCP Clients and Servers 1010 1011That's fine for Internet-domain clients and servers, but what about local 1012communications? While you can use the same setup, sometimes you don't 1013want to. Unix-domain sockets are local to the current host, and are often 1014used internally to implement pipes. Unlike Internet domain sockets, Unix 1015domain sockets can show up in the file system with an ls(1) listing. 1016 1017 % ls -l /dev/log 1018 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log 1019 1020You can test for these with Perl's B<-S> file test: 1021 1022 unless ( -S '/dev/log' ) { 1023 die "something's wicked with the log system"; 1024 } 1025 1026Here's a sample Unix-domain client: 1027 1028 #!/usr/bin/perl -w 1029 use Socket; 1030 use strict; 1031 my ($rendezvous, $line); 1032 1033 $rendezvous = shift || 'catsock'; 1034 socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!"; 1035 connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!"; 1036 while (defined($line = <SOCK>)) { 1037 print $line; 1038 } 1039 exit; 1040 1041And here's a corresponding server. You don't have to worry about silly 1042network terminators here because Unix domain sockets are guaranteed 1043to be on the localhost, and thus everything works right. 1044 1045 #!/usr/bin/perl -Tw 1046 use strict; 1047 use Socket; 1048 use Carp; 1049 1050 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' } 1051 sub spawn; # forward declaration 1052 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" } 1053 1054 my $NAME = 'catsock'; 1055 my $uaddr = sockaddr_un($NAME); 1056 my $proto = getprotobyname('tcp'); 1057 1058 socket(Server,PF_UNIX,SOCK_STREAM,0) || die "socket: $!"; 1059 unlink($NAME); 1060 bind (Server, $uaddr) || die "bind: $!"; 1061 listen(Server,SOMAXCONN) || die "listen: $!"; 1062 1063 logmsg "server started on $NAME"; 1064 1065 my $waitedpid; 1066 1067 use POSIX ":sys_wait_h"; 1068 sub REAPER { 1069 my $child; 1070 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) { 1071 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : ''); 1072 } 1073 $SIG{CHLD} = \&REAPER; # loathe sysV 1074 } 1075 1076 $SIG{CHLD} = \&REAPER; 1077 1078 1079 for ( $waitedpid = 0; 1080 accept(Client,Server) || $waitedpid; 1081 $waitedpid = 0, close Client) 1082 { 1083 next if $waitedpid; 1084 logmsg "connection on $NAME"; 1085 spawn sub { 1086 print "Hello there, it's now ", scalar localtime, "\n"; 1087 exec '/usr/games/fortune' or die "can't exec fortune: $!"; 1088 }; 1089 } 1090 1091 sub spawn { 1092 my $coderef = shift; 1093 1094 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') { 1095 confess "usage: spawn CODEREF"; 1096 } 1097 1098 my $pid; 1099 if (!defined($pid = fork)) { 1100 logmsg "cannot fork: $!"; 1101 return; 1102 } elsif ($pid) { 1103 logmsg "begat $pid"; 1104 return; # I'm the parent 1105 } 1106 # else I'm the child -- go spawn 1107 1108 open(STDIN, "<&Client") || die "can't dup client to stdin"; 1109 open(STDOUT, ">&Client") || die "can't dup client to stdout"; 1110 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr"; 1111 exit &$coderef(); 1112 } 1113 1114As you see, it's remarkably similar to the Internet domain TCP server, so 1115much so, in fact, that we've omitted several duplicate functions--spawn(), 1116logmsg(), ctime(), and REAPER()--which are exactly the same as in the 1117other server. 1118 1119So why would you ever want to use a Unix domain socket instead of a 1120simpler named pipe? Because a named pipe doesn't give you sessions. You 1121can't tell one process's data from another's. With socket programming, 1122you get a separate session for each client: that's why accept() takes two 1123arguments. 1124 1125For example, let's say that you have a long running database server daemon 1126that you want folks from the World Wide Web to be able to access, but only 1127if they go through a CGI interface. You'd have a small, simple CGI 1128program that does whatever checks and logging you feel like, and then acts 1129as a Unix-domain client and connects to your private server. 1130 1131=head1 TCP Clients with IO::Socket 1132 1133For those preferring a higher-level interface to socket programming, the 1134IO::Socket module provides an object-oriented approach. IO::Socket is 1135included as part of the standard Perl distribution as of the 5.004 1136release. If you're running an earlier version of Perl, just fetch 1137IO::Socket from CPAN, where you'll also find modules providing easy 1138interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and 1139NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just 1140to name a few. 1141 1142=head2 A Simple Client 1143 1144Here's a client that creates a TCP connection to the "daytime" 1145service at port 13 of the host name "localhost" and prints out everything 1146that the server there cares to provide. 1147 1148 #!/usr/bin/perl -w 1149 use IO::Socket; 1150 $remote = IO::Socket::INET->new( 1151 Proto => "tcp", 1152 PeerAddr => "localhost", 1153 PeerPort => "daytime(13)", 1154 ) 1155 or die "cannot connect to daytime port at localhost"; 1156 while ( <$remote> ) { print } 1157 1158When you run this program, you should get something back that 1159looks like this: 1160 1161 Wed May 14 08:40:46 MDT 1997 1162 1163Here are what those parameters to the C<new> constructor mean: 1164 1165=over 4 1166 1167=item C<Proto> 1168 1169This is which protocol to use. In this case, the socket handle returned 1170will be connected to a TCP socket, because we want a stream-oriented 1171connection, that is, one that acts pretty much like a plain old file. 1172Not all sockets are this of this type. For example, the UDP protocol 1173can be used to make a datagram socket, used for message-passing. 1174 1175=item C<PeerAddr> 1176 1177This is the name or Internet address of the remote host the server is 1178running on. We could have specified a longer name like C<"www.perl.com">, 1179or an address like C<"204.148.40.9">. For demonstration purposes, we've 1180used the special hostname C<"localhost">, which should always mean the 1181current machine you're running on. The corresponding Internet address 1182for localhost is C<"127.1">, if you'd rather use that. 1183 1184=item C<PeerPort> 1185 1186This is the service name or port number we'd like to connect to. 1187We could have gotten away with using just C<"daytime"> on systems with a 1188well-configured system services file,[FOOTNOTE: The system services file 1189is in I</etc/services> under Unix] but just in case, we've specified the 1190port number (13) in parentheses. Using just the number would also have 1191worked, but constant numbers make careful programmers nervous. 1192 1193=back 1194 1195Notice how the return value from the C<new> constructor is used as 1196a filehandle in the C<while> loop? That's what's called an indirect 1197filehandle, a scalar variable containing a filehandle. You can use 1198it the same way you would a normal filehandle. For example, you 1199can read one line from it this way: 1200 1201 $line = <$handle>; 1202 1203all remaining lines from is this way: 1204 1205 @lines = <$handle>; 1206 1207and send a line of data to it this way: 1208 1209 print $handle "some data\n"; 1210 1211=head2 A Webget Client 1212 1213Here's a simple client that takes a remote host to fetch a document 1214from, and then a list of documents to get from that host. This is a 1215more interesting client than the previous one because it first sends 1216something to the server before fetching the server's response. 1217 1218 #!/usr/bin/perl -w 1219 use IO::Socket; 1220 unless (@ARGV > 1) { die "usage: $0 host document ..." } 1221 $host = shift(@ARGV); 1222 $EOL = "\015\012"; 1223 $BLANK = $EOL x 2; 1224 foreach $document ( @ARGV ) { 1225 $remote = IO::Socket::INET->new( Proto => "tcp", 1226 PeerAddr => $host, 1227 PeerPort => "http(80)", 1228 ); 1229 unless ($remote) { die "cannot connect to http daemon on $host" } 1230 $remote->autoflush(1); 1231 print $remote "GET $document HTTP/1.0" . $BLANK; 1232 while ( <$remote> ) { print } 1233 close $remote; 1234 } 1235 1236The web server handing the "http" service, which is assumed to be at 1237its standard port, number 80. If the web server you're trying to 1238connect to is at a different port (like 1080 or 8080), you should specify 1239as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush> 1240method is used on the socket because otherwise the system would buffer 1241up the output we sent it. (If you're on a Mac, you'll also need to 1242change every C<"\n"> in your code that sends data over the network to 1243be a C<"\015\012"> instead.) 1244 1245Connecting to the server is only the first part of the process: once you 1246have the connection, you have to use the server's language. Each server 1247on the network has its own little command language that it expects as 1248input. The string that we send to the server starting with "GET" is in 1249HTTP syntax. In this case, we simply request each specified document. 1250Yes, we really are making a new connection for each document, even though 1251it's the same host. That's the way you always used to have to speak HTTP. 1252Recent versions of web browsers may request that the remote server leave 1253the connection open a little while, but the server doesn't have to honor 1254such a request. 1255 1256Here's an example of running that program, which we'll call I<webget>: 1257 1258 % webget www.perl.com /guanaco.html 1259 HTTP/1.1 404 File Not Found 1260 Date: Thu, 08 May 1997 18:02:32 GMT 1261 Server: Apache/1.2b6 1262 Connection: close 1263 Content-type: text/html 1264 1265 <HEAD><TITLE>404 File Not Found</TITLE></HEAD> 1266 <BODY><H1>File Not Found</H1> 1267 The requested URL /guanaco.html was not found on this server.<P> 1268 </BODY> 1269 1270Ok, so that's not very interesting, because it didn't find that 1271particular document. But a long response wouldn't have fit on this page. 1272 1273For a more fully-featured version of this program, you should look to 1274the I<lwp-request> program included with the LWP modules from CPAN. 1275 1276=head2 Interactive Client with IO::Socket 1277 1278Well, that's all fine if you want to send one command and get one answer, 1279but what about setting up something fully interactive, somewhat like 1280the way I<telnet> works? That way you can type a line, get the answer, 1281type a line, get the answer, etc. 1282 1283This client is more complicated than the two we've done so far, but if 1284you're on a system that supports the powerful C<fork> call, the solution 1285isn't that rough. Once you've made the connection to whatever service 1286you'd like to chat with, call C<fork> to clone your process. Each of 1287these two identical process has a very simple job to do: the parent 1288copies everything from the socket to standard output, while the child 1289simultaneously copies everything from standard input to the socket. 1290To accomplish the same thing using just one process would be I<much> 1291harder, because it's easier to code two processes to do one thing than it 1292is to code one process to do two things. (This keep-it-simple principle 1293a cornerstones of the Unix philosophy, and good software engineering as 1294well, which is probably why it's spread to other systems.) 1295 1296Here's the code: 1297 1298 #!/usr/bin/perl -w 1299 use strict; 1300 use IO::Socket; 1301 my ($host, $port, $kidpid, $handle, $line); 1302 1303 unless (@ARGV == 2) { die "usage: $0 host port" } 1304 ($host, $port) = @ARGV; 1305 1306 # create a tcp connection to the specified host and port 1307 $handle = IO::Socket::INET->new(Proto => "tcp", 1308 PeerAddr => $host, 1309 PeerPort => $port) 1310 or die "can't connect to port $port on $host: $!"; 1311 1312 $handle->autoflush(1); # so output gets there right away 1313 print STDERR "[Connected to $host:$port]\n"; 1314 1315 # split the program into two processes, identical twins 1316 die "can't fork: $!" unless defined($kidpid = fork()); 1317 1318 # the if{} block runs only in the parent process 1319 if ($kidpid) { 1320 # copy the socket to standard output 1321 while (defined ($line = <$handle>)) { 1322 print STDOUT $line; 1323 } 1324 kill("TERM", $kidpid); # send SIGTERM to child 1325 } 1326 # the else{} block runs only in the child process 1327 else { 1328 # copy standard input to the socket 1329 while (defined ($line = <STDIN>)) { 1330 print $handle $line; 1331 } 1332 } 1333 1334The C<kill> function in the parent's C<if> block is there to send a 1335signal to our child process (current running in the C<else> block) 1336as soon as the remote server has closed its end of the connection. 1337 1338If the remote server sends data a byte at time, and you need that 1339data immediately without waiting for a newline (which might not happen), 1340you may wish to replace the C<while> loop in the parent with the 1341following: 1342 1343 my $byte; 1344 while (sysread($handle, $byte, 1) == 1) { 1345 print STDOUT $byte; 1346 } 1347 1348Making a system call for each byte you want to read is not very efficient 1349(to put it mildly) but is the simplest to explain and works reasonably 1350well. 1351 1352=head1 TCP Servers with IO::Socket 1353 1354As always, setting up a server is little bit more involved than running a client. 1355The model is that the server creates a special kind of socket that 1356does nothing but listen on a particular port for incoming connections. 1357It does this by calling the C<< IO::Socket::INET->new() >> method with 1358slightly different arguments than the client did. 1359 1360=over 4 1361 1362=item Proto 1363 1364This is which protocol to use. Like our clients, we'll 1365still specify C<"tcp"> here. 1366 1367=item LocalPort 1368 1369We specify a local 1370port in the C<LocalPort> argument, which we didn't do for the client. 1371This is service name or port number for which you want to be the 1372server. (Under Unix, ports under 1024 are restricted to the 1373superuser.) In our sample, we'll use port 9000, but you can use 1374any port that's not currently in use on your system. If you try 1375to use one already in used, you'll get an "Address already in use" 1376message. Under Unix, the C<netstat -a> command will show 1377which services current have servers. 1378 1379=item Listen 1380 1381The C<Listen> parameter is set to the maximum number of 1382pending connections we can accept until we turn away incoming clients. 1383Think of it as a call-waiting queue for your telephone. 1384The low-level Socket module has a special symbol for the system maximum, which 1385is SOMAXCONN. 1386 1387=item Reuse 1388 1389The C<Reuse> parameter is needed so that we restart our server 1390manually without waiting a few minutes to allow system buffers to 1391clear out. 1392 1393=back 1394 1395Once the generic server socket has been created using the parameters 1396listed above, the server then waits for a new client to connect 1397to it. The server blocks in the C<accept> method, which eventually accepts a 1398bidirectional connection from the remote client. (Make sure to autoflush 1399this handle to circumvent buffering.) 1400 1401To add to user-friendliness, our server prompts the user for commands. 1402Most servers don't do this. Because of the prompt without a newline, 1403you'll have to use the C<sysread> variant of the interactive client above. 1404 1405This server accepts one of five different commands, sending output 1406back to the client. Note that unlike most network servers, this one 1407only handles one incoming client at a time. Multithreaded servers are 1408covered in Chapter 6 of the Camel. 1409 1410Here's the code. We'll 1411 1412 #!/usr/bin/perl -w 1413 use IO::Socket; 1414 use Net::hostent; # for OO version of gethostbyaddr 1415 1416 $PORT = 9000; # pick something not in use 1417 1418 $server = IO::Socket::INET->new( Proto => 'tcp', 1419 LocalPort => $PORT, 1420 Listen => SOMAXCONN, 1421 Reuse => 1); 1422 1423 die "can't setup server" unless $server; 1424 print "[Server $0 accepting clients]\n"; 1425 1426 while ($client = $server->accept()) { 1427 $client->autoflush(1); 1428 print $client "Welcome to $0; type help for command list.\n"; 1429 $hostinfo = gethostbyaddr($client->peeraddr); 1430 printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost; 1431 print $client "Command? "; 1432 while ( <$client>) { 1433 next unless /\S/; # blank line 1434 if (/quit|exit/i) { last; } 1435 elsif (/date|time/i) { printf $client "%s\n", scalar localtime; } 1436 elsif (/who/i ) { print $client `who 2>&1`; } 1437 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1`; } 1438 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1`; } 1439 else { 1440 print $client "Commands: quit date who cookie motd\n"; 1441 } 1442 } continue { 1443 print $client "Command? "; 1444 } 1445 close $client; 1446 } 1447 1448=head1 UDP: Message Passing 1449 1450Another kind of client-server setup is one that uses not connections, but 1451messages. UDP communications involve much lower overhead but also provide 1452less reliability, as there are no promises that messages will arrive at 1453all, let alone in order and unmangled. Still, UDP offers some advantages 1454over TCP, including being able to "broadcast" or "multicast" to a whole 1455bunch of destination hosts at once (usually on your local subnet). If you 1456find yourself overly concerned about reliability and start building checks 1457into your message system, then you probably should use just TCP to start 1458with. 1459 1460Note that UDP datagrams are I<not> a bytestream and should not be treated 1461as such. This makes using I/O mechanisms with internal buffering 1462like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(), 1463or better send(), like in the example below. 1464 1465Here's a UDP program similar to the sample Internet TCP client given 1466earlier. However, instead of checking one host at a time, the UDP version 1467will check many of them asynchronously by simulating a multicast and then 1468using select() to do a timed-out wait for I/O. To do something similar 1469with TCP, you'd have to use a different socket handle for each host. 1470 1471 #!/usr/bin/perl -w 1472 use strict; 1473 use Socket; 1474 use Sys::Hostname; 1475 1476 my ( $count, $hisiaddr, $hispaddr, $histime, 1477 $host, $iaddr, $paddr, $port, $proto, 1478 $rin, $rout, $rtime, $SECS_of_70_YEARS); 1479 1480 $SECS_of_70_YEARS = 2208988800; 1481 1482 $iaddr = gethostbyname(hostname()); 1483 $proto = getprotobyname('udp'); 1484 $port = getservbyname('time', 'udp'); 1485 $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick 1486 1487 socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!"; 1488 bind(SOCKET, $paddr) || die "bind: $!"; 1489 1490 $| = 1; 1491 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime time; 1492 $count = 0; 1493 for $host (@ARGV) { 1494 $count++; 1495 $hisiaddr = inet_aton($host) || die "unknown host"; 1496 $hispaddr = sockaddr_in($port, $hisiaddr); 1497 defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!"; 1498 } 1499 1500 $rin = ''; 1501 vec($rin, fileno(SOCKET), 1) = 1; 1502 1503 # timeout after 10.0 seconds 1504 while ($count && select($rout = $rin, undef, undef, 10.0)) { 1505 $rtime = ''; 1506 ($hispaddr = recv(SOCKET, $rtime, 4, 0)) || die "recv: $!"; 1507 ($port, $hisiaddr) = sockaddr_in($hispaddr); 1508 $host = gethostbyaddr($hisiaddr, AF_INET); 1509 $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ; 1510 printf "%-12s ", $host; 1511 printf "%8d %s\n", $histime - time, scalar localtime($histime); 1512 $count--; 1513 } 1514 1515Note that this example does not include any retries and may consequently 1516fail to contact a reachable host. The most prominent reason for this 1517is congestion of the queues on the sending host if the number of 1518list of hosts to contact is sufficiently large. 1519 1520=head1 SysV IPC 1521 1522While System V IPC isn't so widely used as sockets, it still has some 1523interesting uses. You can't, however, effectively use SysV IPC or 1524Berkeley mmap() to have shared memory so as to share a variable amongst 1525several processes. That's because Perl would reallocate your string when 1526you weren't wanting it to. 1527 1528Here's a small example showing shared memory usage. 1529 1530 use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU); 1531 1532 $size = 2000; 1533 $id = shmget(IPC_PRIVATE, $size, S_IRWXU) || die "$!"; 1534 print "shm key $id\n"; 1535 1536 $message = "Message #1"; 1537 shmwrite($id, $message, 0, 60) || die "$!"; 1538 print "wrote: '$message'\n"; 1539 shmread($id, $buff, 0, 60) || die "$!"; 1540 print "read : '$buff'\n"; 1541 1542 # the buffer of shmread is zero-character end-padded. 1543 substr($buff, index($buff, "\0")) = ''; 1544 print "un" unless $buff eq $message; 1545 print "swell\n"; 1546 1547 print "deleting shm $id\n"; 1548 shmctl($id, IPC_RMID, 0) || die "$!"; 1549 1550Here's an example of a semaphore: 1551 1552 use IPC::SysV qw(IPC_CREAT); 1553 1554 $IPC_KEY = 1234; 1555 $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT ) || die "$!"; 1556 print "shm key $id\n"; 1557 1558Put this code in a separate file to be run in more than one process. 1559Call the file F<take>: 1560 1561 # create a semaphore 1562 1563 $IPC_KEY = 1234; 1564 $id = semget($IPC_KEY, 0 , 0 ); 1565 die if !defined($id); 1566 1567 $semnum = 0; 1568 $semflag = 0; 1569 1570 # 'take' semaphore 1571 # wait for semaphore to be zero 1572 $semop = 0; 1573 $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag); 1574 1575 # Increment the semaphore count 1576 $semop = 1; 1577 $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag); 1578 $opstring = $opstring1 . $opstring2; 1579 1580 semop($id,$opstring) || die "$!"; 1581 1582Put this code in a separate file to be run in more than one process. 1583Call this file F<give>: 1584 1585 # 'give' the semaphore 1586 # run this in the original process and you will see 1587 # that the second process continues 1588 1589 $IPC_KEY = 1234; 1590 $id = semget($IPC_KEY, 0, 0); 1591 die if !defined($id); 1592 1593 $semnum = 0; 1594 $semflag = 0; 1595 1596 # Decrement the semaphore count 1597 $semop = -1; 1598 $opstring = pack("s!s!s!", $semnum, $semop, $semflag); 1599 1600 semop($id,$opstring) || die "$!"; 1601 1602The SysV IPC code above was written long ago, and it's definitely 1603clunky looking. For a more modern look, see the IPC::SysV module 1604which is included with Perl starting from Perl 5.005. 1605 1606A small example demonstrating SysV message queues: 1607 1608 use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU); 1609 1610 my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU); 1611 1612 my $sent = "message"; 1613 my $type_sent = 1234; 1614 my $rcvd; 1615 my $type_rcvd; 1616 1617 if (defined $id) { 1618 if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) { 1619 if (msgrcv($id, $rcvd, 60, 0, 0)) { 1620 ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd); 1621 if ($rcvd eq $sent) { 1622 print "okay\n"; 1623 } else { 1624 print "not okay\n"; 1625 } 1626 } else { 1627 die "# msgrcv failed\n"; 1628 } 1629 } else { 1630 die "# msgsnd failed\n"; 1631 } 1632 msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n"; 1633 } else { 1634 die "# msgget failed\n"; 1635 } 1636 1637=head1 NOTES 1638 1639Most of these routines quietly but politely return C<undef> when they 1640fail instead of causing your program to die right then and there due to 1641an uncaught exception. (Actually, some of the new I<Socket> conversion 1642functions croak() on bad arguments.) It is therefore essential to 1643check return values from these functions. Always begin your socket 1644programs this way for optimal success, and don't forget to add B<-T> 1645taint checking flag to the #! line for servers: 1646 1647 #!/usr/bin/perl -Tw 1648 use strict; 1649 use sigtrap; 1650 use Socket; 1651 1652=head1 BUGS 1653 1654All these routines create system-specific portability problems. As noted 1655elsewhere, Perl is at the mercy of your C libraries for much of its system 1656behaviour. It's probably safest to assume broken SysV semantics for 1657signals and to stick with simple TCP and UDP socket operations; e.g., don't 1658try to pass open file descriptors over a local UDP datagram socket if you 1659want your code to stand a chance of being portable. 1660 1661As mentioned in the signals section, because few vendors provide C 1662libraries that are safely re-entrant, the prudent programmer will do 1663little else within a handler beyond setting a numeric variable that 1664already exists; or, if locked into a slow (restarting) system call, 1665using die() to raise an exception and longjmp(3) out. In fact, even 1666these may in some cases cause a core dump. It's probably best to avoid 1667signals except where they are absolutely inevitable. This 1668will be addressed in a future release of Perl. 1669 1670=head1 AUTHOR 1671 1672Tom Christiansen, with occasional vestiges of Larry Wall's original 1673version and suggestions from the Perl Porters. 1674 1675=head1 SEE ALSO 1676 1677There's a lot more to networking than this, but this should get you 1678started. 1679 1680For intrepid programmers, the indispensable textbook is I<Unix 1681Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens 1682(published by Prentice-Hall). Note that most books on networking 1683address the subject from the perspective of a C programmer; translation 1684to Perl is left as an exercise for the reader. 1685 1686The IO::Socket(3) manpage describes the object library, and the Socket(3) 1687manpage describes the low-level interface to sockets. Besides the obvious 1688functions in L<perlfunc>, you should also check out the F<modules> file 1689at your nearest CPAN site. (See L<perlmodlib> or best yet, the F<Perl 1690FAQ> for a description of what CPAN is and where to get it.) 1691 1692Section 5 of the F<modules> file is devoted to "Networking, Device Control 1693(modems), and Interprocess Communication", and contains numerous unbundled 1694modules numerous networking modules, Chat and Expect operations, CGI 1695programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet, 1696Threads, and ToolTalk--just to name a few. 1697