.\" Copyright (c) 1990 Regents of the University of California.
.\" All rights reserved. The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\" @(#)awk.1 6.3 (Berkeley) 06/26/90
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.\".Op Op Fl \&f Ar file Op Ar prog
.Cx \&[
.Op Fl f Ar file
.Op Ar prog
.Cx \&]
.Cx
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar file .
.Pp
.Tw Fl
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next # skip remaining patterns on this input line
exit # skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Cx Ic substr
.Cx (
.Ar s ,
.Ar \& m ,
.Ar \& n )
.Cx
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The function
.Cx Ic sprintf
.Cx (
.Ar fmt ,
.Ar \& expr ,
.Ar \& expr ,
.Ar \& ... )
.Cx
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Ds
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Pp
.Ds I
    { s += $1 }
END { print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX. A much improved
and true to the book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980's.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
.Dq
to it.
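.Pp
A minimal sketch of both conversion idioms, assuming the first
field holds a numeric string such as 007 (the variable names
.Li n
and
.Li s
are illustrative only):
.Pp
.Ds I
{ n = $1 + 0; s = $1 ""; print n, s }
.De
.Pp
Here
.Li n
is treated as a number because 0 was added to the field, while
.Li s
is treated as a string because the null string was concatenated to it.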