=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs.  Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags.  Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs.  The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set.  You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

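A script can confirm at run time whether taint checks are active.  The
sketch below assumes Perl 5.8.0 or later, where the C<${^TAINT}> variable
is available; the helper name C<taint_mode_status> is made up for
illustration and is not part of Perl.

```perl
use strict;
use warnings;

# ${^TAINT} is 1 under -T, -1 under -t (taint warnings only), 0 otherwise.
# taint_mode_status() is a hypothetical helper, not part of Perl.
sub taint_mode_status {
    return ${^TAINT} > 0 ? "enforced"
         : ${^TAINT} < 0 ? "warnings only"
         :                 "off";
}

print "Taint checking: ", taint_mode_status(), "\n";
```

Running the file with C<perl -T> should report "enforced"; a program that
must never run without taint checks could C<die> when the status is "off".
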
While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps.  Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these.  Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident.  All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (readdir(),
readlink(), the variable of shmread(), the messages returned by
msgrcv(), the password, gcos and shell fields returned by the
getpwxxx() calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness.  This requires extra care
unless you want external data to affect your control flow.  Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=back

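One way to limit symbolic values, sketched here under the assumption that
the set of permitted actions is known in advance, is to look the name up in
a fixed table of code references instead of calling it symbolically.  The
action names and the C<run_action> helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical handlers; the names are illustrative only.
my %dispatch = (
    start  => sub { return "starting @_" },
    stop   => sub { return "stopping @_" },
    status => sub { return "status ok" },
);

# Outside data can only select among the handlers listed above;
# any other name is rejected before anything is called.
sub run_action {
    my ($name, @args) = @_;
    my $handler = $dispatch{$name}
        or die "Unknown action '$name'\n";
    return $handler->(@args);
}

print run_action('start', 'web'), "\n";
```

A request for a name such as C<POSIX::system> is not in the table, so
C<run_action> dies instead of reaching code outside the program.
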
For efficiency reasons, Perl takes a conservative view of
whether data is tainted.  If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are never tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In Perl releases older than 5.6.0 the <*.c> and glob('*.c') would
    # have used an external program to do the filename expansion; but in
    # either case the result is tainted since the list of filenames comes
    # from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would
thus trigger an "Insecure dependency" message, you can use the
tainted() function of the Scalar::Util module, available from CPAN
and included in Perl starting with release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted.  It
would be inefficient for every operator to test every argument for
taintedness.  Instead, Perl takes the slightly more efficient and
conservative approach: if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.

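For comparison, here is a minimal sketch of the Scalar::Util route.
C<tainted()> is the module's own function; the C<taint_report> helper and
the variable names are invented for this example.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# taint_report() is a hypothetical helper: it labels each named value
# as "tainted" or "clean".  Without -T every value reports "clean".
sub taint_report {
    my (%values) = @_;
    my @lines;
    for my $name (sort keys %values) {
        my $state = tainted($values{$name}) ? "tainted" : "clean";
        push @lines, "$name: $state";
    }
    return @lines;
}

my $literal  = 'abc';
my $from_env = defined $ENV{PATH} ? $ENV{PATH} : '';
print "$_\n" for taint_report(literal => $literal, env_path => $from_env);
```

A string literal is never tainted, so C<literal> always reports "clean";
the environment value is expected to report "tainted" only when the
program runs under C<-T>.
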
But testing for taintedness gets you only so far.  Sometimes you just have
to clear your data's taintedness.  Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., you
knew what you were doing when you wrote the pattern.  That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism.  It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters.  That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor do dot, dash, or the at sign mean anything special
to the shell.  Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that.  The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Apart from the hash-key route mentioned above, laundering data with a
regular expression is the I<only> mechanism for untainting dirty data,
unless you use the strategy detailed below to fork a child of lesser
privilege.

The example does not untaint $data if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program.  If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block.  See L<perllocale/SECURITY> for further discussion and examples.

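A minimal sketch of that arrangement follows.  It assumes a program that
uses C<use locale> at file scope; the C<launder_word> helper is a
hypothetical name, and the pattern is the one from the example above.

```perl
use strict;
use warnings;
use locale;

# launder_word() is a hypothetical helper.  It returns a copy of $data
# captured through $1, or undef if $data has unexpected characters.
sub launder_word {
    my ($data) = @_;
    no locale;    # \w must not depend on locale data inside this block
    return $data =~ /^([-\@\w.]+)$/ ? $1 : undef;
}

my $clean = launder_word('user@example.com');
print defined $clean ? "accepted: $clean\n" : "rejected\n";
```

Keeping C<no locale> inside the subroutine limits its effect to the one
match that launders data, so the rest of the program stays locale-aware.
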
=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line.  Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line.  Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems.  (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>. The two environment variables are ignored because
they are obscure, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

  perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a
known value, and each directory in the path must not be writable by anyone
other than its owner and group.  You may be surprised to get this message even
if the pathname to your executable is fully qualified.  The message is I<not>
generated because you didn't supply a full path to the program; instead,
it's generated because you never set your PATH environment variable, or
you didn't set it to something that was safe.  Because Perl can't
guarantee that the executable in question isn't itself going to turn
around and execute some other program that is dependent on your PATH, it
makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
setid and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

It's also possible to get into trouble with other operations that don't
care whether they use tainted values.  Make judicious use of the file
tests in dealing with any user-supplied filenames.  When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from opening tainted filenames for reading,
so be careful what you print out.  The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass B<system>
and B<exec> explicit parameter lists instead of strings with possible shell
wildcards in them.  Unfortunately, the B<open>, B<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you.  First, fork a child using the special
B<open> syntax that connects the parent and child by a pipe.  Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values.  Then the child process, which no longer
has any special permissions, does the B<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent.  Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely.  Notice how the B<exec> is
not called with a string that the shell could expand.  This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

        use English '-no_match_vars';
        die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
        if ($pid) {           # parent
            while (<KID>) {
                # do something
            }
            close KID;
        } else {
            my @temp     = ($EUID, $EGID);
            my $orig_uid = $UID;
            my $orig_gid = $GID;
            $EUID = $UID;
            $EGID = $GID;
            # Drop privileges
            $UID  = $orig_uid;
            $GID  = $orig_gid;
            # Make sure privs are really gone
            ($EUID, $EGID) = @temp;
            die "Can't drop privileges"
                unless $UID == $EUID  && $GID eq $EGID;
            $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
            # Consider sanitizing the environment even more.
            exec 'myprog', 'arg1', 'arg2'
                or die "can't exec myprog: $!";
        }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.

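The C<readdir> alternative might look like the sketch below, which lists
the names in a directory that end in a given suffix without involving a
shell.  The C<files_with_suffix> helper is a hypothetical name.  Unlike
C<glob('*.c')>, this version also returns names that begin with a dot.

```perl
use strict;
use warnings;

# files_with_suffix() is a hypothetical helper: it returns the sorted
# names in $dir that end in $suffix, using opendir/readdir only.
sub files_with_suffix {
    my ($dir, $suffix) = @_;
    opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
    my @names = sort grep { /\Q$suffix\E\z/ } readdir($dh);
    closedir($dh);
    return @names;
}

print "$_\n" for files_with_suffix('.', '.c');
```

The names returned by C<readdir> are still tainted under C<-T>, so they
need laundering before being used in anything that modifies files.
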
Taint checking is most useful when, although you trust yourself not to have
written a program that gives away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad.  This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil.  That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this."  For that kind of safety, check out the Safe module,
included standard in the Perl distribution.  This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.

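As a rough illustration of a compartment, the sketch below evaluates two
strings with the Safe module's default operator mask.  C<Safe-E<gt>new> and
C<reval> are the module's documented interface; the code strings are made
up for this example, and the default mask is assumed to be unchanged.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Plain arithmetic is permitted by the default operator mask.
my $sum = $compartment->reval('2 + 3');

# system() is not permitted by the default mask, so the code is
# rejected without being run; reval returns undef and sets $@.
my $blocked = $compartment->reval('system("echo hi")');
my $error   = $@;

print "sum: $sum\n";
print "blocked: ", (defined $blocked ? "no" : "yes"), "\n";
print "reason: $error" if $error;
```

Real use would normally go further and decide explicitly which operators
and variables the compartment may see, rather than rely on the defaults.
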
=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start.  The problem is a race
condition in the kernel.  Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it.  The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternatively, it can simply ignore the set-id bits on scripts.  If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts.  It does
this via a special executable called B<suidperl> that is automatically
invoked for you if it's needed.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure.  You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script.  A C wrapper is just a compiled program that does nothing
except call your Perl program.  Compiled programs are not subject to the
kernel bug that plagues set-id scripts.  Here's a simple wrapper, written
in C:

    #include <stdio.h>
    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int ac, char **av)
    {
        execv(REAL_PATH, av);
        perror(REAL_PATH);      /* reached only if execv() fails */
        return 1;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug.  On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>.  This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit.  On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>.  The B<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself.  Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Prior to release 5.6.1 of Perl, bugs in the code of B<suidperl> could
introduce a security hole.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted.  (That doesn't mean that a CGI script's source is
readable by people on the web, though.)  So you have to leave the
permissions at the socially friendly 0755 level.  This lets
only people on your local system see your source.

Some people mistakenly regard this as a security problem.  If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure.  It is often possible for someone to
determine the insecure things and exploit them without viewing the
source.  Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it.  You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it.  You can try using the native-code compiler
described below, but crackers might be able to disassemble it.  These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive licence will give you
legal security.  License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah."  You should see a lawyer to be sure your licence's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls.  See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both.  This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Function - the algorithm used to "order" hash elements has been
changed several times during the development of Perl, mainly to be
reasonably fast.  In Perl 5.8.1 the security aspect was also taken
into account.

In Perls before 5.8.1 one could rather easily generate data that as
hash keys would cause Perl to consume large amounts of time because
the internal structure of hashes would badly degenerate.  In Perl 5.8.1
the hash function is randomly perturbed by a pseudorandom seed which
makes generating such naughty hash keys harder.
See L<perlrun/PERL_HASH_SEED> for more information.

The random perturbation is done by default, but if one wants for some
reason to emulate the old behaviour one can set the environment variable
PERL_HASH_SEED to zero (or any other integer).  One possible reason
for wanting to emulate the old behaviour is that in the new behaviour
consecutive runs of Perl will order hash keys differently, which may
confuse some applications (like Data::Dumper: the outputs of two
different runs are no longer identical).

B<Perl has never guaranteed any ordering of the hash keys>, and the
ordering has already changed several times during the lifetime of
Perl 5.  Also, the ordering of hash keys has always been, and
continues to be, affected by the insertion order.

Also note that while the order of the hash elements might be
randomised, this "pseudoordering" should B<not> be used for
applications like shuffling a list randomly (use List::Util::shuffle()
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module Algorithm::Numerical::Shuffle), or for generating
permutations (use e.g. the CPAN modules Algorithm::Permute or
Algorithm::FastPermute), or for any cryptographic applications.

=item *

Regular expressions - Perl's regular expression engine is a so-called
NFA (Non-deterministic Finite Automaton), which among other things means
that it can rather easily consume large amounts of both time and space if
the regular expression may match in several ways.  Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>).  Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function is very easy to trick into misbehaving
so that it consumes a lot of time.  Nothing more is required than
resorting a list already sorted.  Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used.  Mergesort is insensitive to
its input data, so it cannot be similarly fooled.

=back

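Regarding the hash ordering point above: a program that needs repeatable
output can sort its keys explicitly instead of pinning PERL_HASH_SEED, and
one that needs a random order can ask for it with List::Util::shuffle().
The following is a small sketch; the C<%port_for> data is invented for
illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Illustrative data only.
my %port_for = (http => 80, ssh => 22, smtp => 25);

# Sort keys explicitly when output must be identical from run to run.
my @stable = map { "$_=$port_for{$_}" } sort keys %port_for;
print join(",", @stable), "\n";

# Shuffle explicitly when a random order is wanted; do not rely on
# whatever order keys() happens to return.
my @random = shuffle(keys %port_for);
print join(",", @random), "\n";
```

The first line of output is the same on every run; the second is expected
to vary.
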
See L<http://www.cs.rice.edu/~scrosby/hash/> for more information,
and any computer science textbook on algorithmic complexity.

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.