.\" Copyright (c) 1990 Regents of the University of California.
.\" All rights reserved.  The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\"     @(#)awk.1	6.3 (Berkeley) 06/26/90
.\"
.Dd June 26, 1990
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.\".Op Op Fl \&f Ar file Op Ar prog
.Cx \&[
.Op Fl f Ar file
.Op Ar prog
.Cx \&]
.Cx
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar file .
.Pp
.Tw Fl
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
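.Pp
For example, assuming a program kept in a file with the illustrative name
.Ar prog.awk ,
the following applies it to
.Ar file1 ,
then to the standard input, then to
.Ar file2 :
.Pp
.Dl awk \-f prog.awk file1 \- file2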
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by setting
.Li FS ;
see below.)
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
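.Pp
As an illustrative sketch of string subscripts
(the array name
.Li seen
is arbitrary),
the following action prints only the first line found
for each distinct value of the first field.
It relies on
.Li seen[$1]
starting out null, which is treated as 0 in a numeric context:
.Pp
.Dl { if (seen[$1]++ == 0) print }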
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
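.Pp
For example, the following sketch numbers each input line,
using a
.Xr printf 3
style format (the field width of 5 is arbitrary):
.Pp
.Ds I
{ printf "%5d  %s\en", NR, $0 }
.De
.Pp
and the following writes each line to a file named by its first field:
.Pp
.Dl { print > $1 }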
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Cx Ic substr
.Cx (
.Ar s ,
.Ar \& m ,
.Ar \& n )
.Cx
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The function
.Cx Ic sprintf
.Cx (
.Ar fmt ,
.Ar \& expr ,
.Ar \& expr ,
.Ar \& ... )
.Cx
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
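.Pp
As a brief illustration of these functions,
the following prints the length of each line
followed by its first ten characters:
.Pp
.Dl { print length, substr($0, 1, 10) }
.Pp
and the following builds a
.Dq file:record
tag with
.Ic sprintf
before printing it ahead of the line
(the variable name
.Li tag
is arbitrary):
.Pp
.Ds I
{ tag = sprintf("%s:%d", FILENAME, NR); print tag, $0 }
.De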
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Ds
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
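.Pp
For example, the following pattern
(a sketch; the particular tests are arbitrary)
selects lines that have at least three fields
and whose first field begins with an upper-case letter:
.Pp
.Dl NF >= 3 && $1 ~ /^[A-Z]/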
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
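.Pp
A minimal sketch using both special patterns
(the heading text is arbitrary):
.Pp
.Ds I
BEGIN	{ print "first", "second" }
	{ print $1, $2 }
END	{ print NR, "lines read" }
.De
.Pp
The heading is printed before any input is read,
and the line count after the last line.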
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
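.Pp
For instance, assuming a colon-separated file such as
.Ar /etc/passwd ,
either of the following prints the first field of every line:
.Pp
.Ds I
awk \-F: '{ print $1 }' /etc/passwd
awk 'BEGIN { FS = ":" } { print $1 }' /etc/passwd
.De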
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
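.Pp
For example, the following sketch swaps the first two fields
and joins them with a colon instead of a blank:
.Pp
.Ds I
BEGIN	{ OFS = ":" }
	{ print $2, $1 }
.De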
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Pp
.Ds I
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX.
A much improved, true-to-the-book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980s.
The version of
.Nm awk
that this manual page describes
is a derivative of the original, not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number, add 0 to it;
to force it to be treated as a string, concatenate the null string
.Dq
to it.
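.Pp
For example, of the two patterns sketched below,
each shown on its own line,
the first compares the first field with 10 as a number
(so both 10 and 10.0 match),
and the second compares it as a string:
.Pp
.Ds I
$1 + 0 == 10
($1 "") == "10"
.De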