.\" xref: /csrg-svn/old/awk/awk.1 (revision 50808)
.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" %sccs.include.proprietary.roff%
.\"
.\"	@(#)awk.1	6.6 (Berkeley) 08/07/91
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Op Fl F Ns Ar c
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar prog_file .
.Pp
.Bl -tag -width flag
.It Fl F Ns Ar c
Specify a field separator of
.Ar c .
.It Fl f Ar prog_file
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.El
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by setting
.Li FS ;
see below.)
The fields are denoted $1, $2, ...;
$0 refers to the entire line.
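.Pp
As an illustration of the field notation, the following shell sketch wraps
.Nm awk
in a function.
The name
.Li show_fields
is hypothetical and chosen only for this example.
.Bd -literal -offset indent
```shell
# show_fields: hypothetical helper, not part of awk.
# For each input line, print the second field, then the whole line.
show_fields() {
    awk '{ print $2; print $0 }'
}
```
.Ed
.Pp
When the fields of an input line are separated by several blanks,
$2 is still the second word,
and $0 holds the line exactly as it was read.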
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
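.Pp
The two defaults can be seen side by side in the following sketch.
The function names
.Li only_pattern
and
.Li only_action
are hypothetical, introduced for this example.
.Bd -literal -offset indent
```shell
# only_pattern: hypothetical helper.
# The action is omitted, so every line containing "b" is printed.
only_pattern() {
    awk '/b/'
}

# only_action: hypothetical helper.
# The pattern is omitted, so the action runs for every line.
only_action() {
    awk '{ print "seen:", $0 }'
}
```
.Ed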
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Bd -unfilled -offset indent
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.Ed
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Li x[i] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
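.Pp
A minimal sketch of the associative memory mentioned above,
assuming input whose first field is a word to be tallied.
The function name
.Li count_first
and the keys looked up in the
.Li END
action are hypothetical.
.Bd -literal -offset indent
```shell
# count_first: hypothetical helper.
# Tally the first field of every line in an array indexed by string,
# then report the counts for two keys picked for the example.
# Adding 0 turns a never-seen (null string) entry into the number 0.
count_first() {
    awk '
        { n[$1]++ }
        END { print "apple", n["apple"] + 0; print "pear", n["pear"] + 0 }
    '
}
```
.Ed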
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
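.Pp
The difference between the two statements can be sketched as follows.
The function names
.Li join_fields
and
.Li show_price
are hypothetical, and the input is assumed to hold a name in the first
field and a number in the second.
.Bd -literal -offset indent
```shell
# join_fields: hypothetical helper.
# print separates its arguments with OFS, set here to "-".
join_fields() {
    awk 'BEGIN { OFS = "-" } { print $1, $2 }'
}

# show_price: hypothetical helper.
# printf writes only what the format asks for; the following
# print "" supplies the output record separator (a newline by default).
show_price() {
    awk '{ printf "%s costs %.2f", $1, $2; print "" }'
}
```
.Ed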
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Ar n Ns \-character
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr ...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
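.Pp
A short sketch exercising several of these functions on one line.
The function name
.Li string_funcs
is hypothetical; the input is assumed to carry a word in the first field
and a number in the second.
.Bd -literal -offset indent
```shell
# string_funcs: hypothetical helper.
# Print, in order: the length of field 1, the 3-character substring of
# field 1 starting at position 2, field 2 truncated to an integer, and
# a string built by sprintf from both fields.
string_funcs() {
    awk '{ print length($1), substr($1, 2, 3), int($2), sprintf("%s=%d", $1, int($2)) }'
}
```
.Ed
.Pp
Positions passed to
.Ic substr
count from 1, so position 2 is the second character of the string.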
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Bd -unfilled -offset indent
expression matchop regular-expression
expression relop expression
.Ed
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
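.Pp
The following sketch combines a matchop and a relop in one pattern.
The function names
.Li a_and_large
and
.Li not_a
are hypothetical, as is the threshold of 10.
.Bd -literal -offset indent
```shell
# a_and_large: hypothetical helper.
# Select lines whose first field begins with "a" and whose second
# field is numerically greater than 10; print the first field.
a_and_large() {
    awk '$1 ~ /^a/ && $2 > 10 { print $1 }'
}

# not_a: hypothetical helper.
# Select lines whose first field does not begin with "a".
not_a() {
    awk '$1 !~ /^a/ { print $1 }'
}
```
.Ed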
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = \&"c\&" }
.Pp
or by using the
.Fl F Ns Ar c
option.
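.Pp
Both ways of choosing the separator are sketched below for a colon.
The function names
.Li first_by_begin
and
.Li first_by_flag
are hypothetical.
.Bd -literal -offset indent
```shell
# first_by_begin: hypothetical helper.
# Set FS in a BEGIN action, before any input line is split.
first_by_begin() {
    awk 'BEGIN { FS = ":" } { print $1 }'
}

# first_by_flag: hypothetical helper.
# Give the same separator on the command line with -F.
first_by_flag() {
    awk -F: '{ print $1 }'
}
```
.Ed
.Pp
For a colon-separated input line, both functions print the text before
the first colon.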
.Pp
Other variable names with special meanings
include
.Pp
.Bl -tag -width "file name" -compact
.It Li NF
the number of fields in the current record;
.It Li NR
the ordinal number of the current record;
.It Li FILENAME
the name of the current input file;
.It Li OFS
the output field separator (default blank);
.It Li ORS
the output record separator (default newline);
.It Li OFMT
the output format for numbers (default "%.6g").
.El
.Sh EXAMPLES
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Bd -literal -offset indent
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.Ed
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from the previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
.Rs
.%A A. V. Aho
.%A B. W. Kernighan
.%A P. J. Weinberger
.%T "Awk \- a pattern scanning and processing language"
.Re
.Sh HISTORY
The version of
.Nm awk
this manual page describes
appeared in
.At v7 .
A much improved version of
.Nm awk ,
true to the book, appeared in the
.Tn AT&T
Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number, add 0 to it;
to force it to be treated as a string, concatenate "" (an empty
string) to it.
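.Pp
Both idioms are sketched below on a line holding two fields.
The function name
.Li compare_both
is hypothetical, and the parentheses around the concatenations are added
here only to make the grouping explicit.
.Bd -literal -offset indent
```shell
# compare_both: hypothetical helper.
# Compare fields 1 and 2 twice: once forced to numbers by adding 0,
# once forced to strings by concatenating the empty string.
compare_both() {
    awk '{
        if ($1 + 0 == $2 + 0)
            print "numbers: equal"
        else
            print "numbers: differ"
        if (($1 "") == ($2 ""))
            print "strings: equal"
        else
            print "strings: differ"
    }'
}
```
.Ed
.Pp
For the input line
.Dq 10 10.0
the numeric comparison reports equality while the string comparison
does not.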