.\" Copyright (c) 1990 Regents of the University of California.
.\" All rights reserved.  The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\"     @(#)awk.1	6.4 (Berkeley) 07/24/90
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar prog_file .
.Pp
.Tw Ds
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by setting
.Li FS ,
as described below.)
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
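.Pp
As a minimal sketch of field splitting, assuming a POSIX shell with an
awk on the search path (the input line is made up for illustration),
the following prints the second field and then the entire line:

```shell
# Hypothetical one-line input; fields are separated by white space.
out=$(printf 'alpha beta gamma\n' | awk '{ print $2; print $0 }')
printf '%s\n' "$out"
```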
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
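.Pp
The two defaults can be illustrated with a small sketch; the sample
data and the shell variable names are assumptions chosen for the example.

```shell
# Hypothetical sample data: a name and a count on each line.
input='apple 3
pear 12
fig 7'

# Pattern with no action: each matching line is printed unchanged.
only_pattern=$(printf '%s\n' "$input" | awk '$2 > 5')

# Action with no pattern: the action runs for every line.
only_action=$(printf '%s\n' "$input" | awk '{ print $1 }')

printf '%s\n' "$only_pattern" "$only_action"
```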
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
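.Pp
A short sketch of string-subscripted arrays and concatenation follows.
The names count and joined are hypothetical; both start out as the
null string, so no initialization is needed.

```shell
# Hypothetical input: one colour name per line.
out=$(printf 'red\nblue\nred\n' | awk '
{ count[$1]++            # subscript is the string in the first field
  joined = joined $1 }   # a blank between operands concatenates them
END { print count["red"], count["blue"], joined }')
printf '%s\n' "$out"
```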
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
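.Pp
The difference between the two statements can be sketched as below,
assuming made-up input and an output field separator of "-" set through
the OFS variable.
Note that printf adds no record separator of its own, so the format
ends with an explicit newline.

```shell
# Hypothetical two-field input line.
out=$(printf 'a b\n' | awk '
BEGIN { OFS = "-" }             # separator placed between print arguments
{ print $1, $2                  # arguments joined by OFS, ended by ORS
  printf "%s:%d\n", $2, 42 }')  # laid out by the format string alone
printf '%s\n' "$out"
```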
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr \&...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
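.Pp
A brief sketch of these built-in functions applied to one made-up
input line; positions passed to substr count from 1.

```shell
# Hypothetical input line of two words.
out=$(printf 'hello world\n' | awk '{
    print length, length($1), substr($0, 7, 5), int(7.9), sprintf("%03d", NF)
}')
printf '%s\n' "$out"
```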
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
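.Pp
A small sketch of a two-pattern range, using made-up input.
It suggests that the lines matching the first and second patterns are
themselves included in the range.

```shell
# Hypothetical input with one start/stop pair surrounded by other lines.
out=$(printf 'a\nstart\nb\nstop\nc\n' | awk '/start/, /stop/')
printf '%s\n' "$out"
```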
.Pp
A relational expression is one of the following:
.Pp
.Ds I
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
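.Pp
The two kinds of relational expression can be combined, as in this
sketch; the names and ages in the input are invented for the example.

```shell
# Hypothetical sample data: a name and an age on each line.
input='alice 30
bob 17
carol 45'

# matchop on the first field combined with a relop on the second.
matched=$(printf '%s\n' "$input" | awk '$1 ~ /^[ac]/ && $2 >= 18 { print $1 }')

# !~ selects lines whose first field does not contain the expression.
unmatched=$(printf '%s\n' "$input" | awk '$1 !~ /o/ { print $1 }')

printf '%s\n' "$matched" "$unmatched"
```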
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
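.Pp
As an illustrative sketch with made-up input, a program can print a
heading before any input is read and a total after the last line;
the variable n is hypothetical.

```shell
# Hypothetical two-line input.
out=$(printf 'x\ny\n' | awk '
BEGIN { print "start" }      # runs before the first input line
      { n++ }                # runs for every input line
END   { print "lines:", n }  # runs after the last input line
')
printf '%s\n' "$out"
```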
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
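.Pp
Both ways of choosing the separator are sketched here for a
colon-separated line; the sample line is an assumption, and the two
commands are expected to produce the same output.

```shell
# Hypothetical colon-separated input line.
line='root:x:0:0'

# Separator set inside the program.
with_begin=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ":" } { print $1, $3 }')

# Separator set on the command line.
with_flag=$(printf '%s\n' "$line" | awk -F: '{ print $1, $3 }')

printf '%s\n' "$with_begin" "$with_flag"
```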
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
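.Pp
A minimal sketch of NR and NF on made-up input; the values are joined
by the default output field separator, a blank.

```shell
# Hypothetical input: three fields on the first line, two on the second.
out=$(printf 'a b c\nd e\n' | awk '{ print NR, NF } END { print "records:", NR }')
printf '%s\n' "$out"
```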
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Pp
.Ds I
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX.  A much improved,
true-to-the-book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate "" (an empty
string) to it.
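.Pp
The two idioms can be sketched on a made-up pair of fields, 10 and 9.
Compared as strings, "10" sorts before "9"; compared as numbers, 10 is
not less than 9.
A true comparison prints as 1 and a false one as 0.

```shell
# Hypothetical input: two numeric-looking fields.
out=$(printf '10 9\n' | awk '{
    print (($1 "") < ($2 "")), (($1 + 0) < ($2 + 0))
}')
printf '%s\n' "$out"
```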