.\" Copyright (c) 1990 Regents of the University of California.
.\" All rights reserved. The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\" @(#)awk.1 6.4 (Berkeley) 07/24/90
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar file .
.Pp
.Tw Ds
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
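.Pp
As a minimal sketch of the invocation forms described above, assuming a
POSIX shell with mktemp available, the following runs the same one-line
program first literally and then through the f option, with a lone dash
standing for the standard input.
The names first.awk and a.txt are hypothetical and exist only for this
illustration.

```sh
# Sketch only: first.awk and a.txt are hypothetical names.
tmp=$(mktemp -d)
echo '{ print $1 }' > "$tmp/first.awk"
echo 'alpha beta' > "$tmp/a.txt"

# Program given literally as prog, reading one named file.
literal=$(awk '{ print $1 }' "$tmp/a.txt")

# Same program read with -f; "-" names the standard input, so a.txt
# is read first and the piped line second (files are read in order).
fromfile=$(echo 'gamma delta' | awk -f "$tmp/first.awk" "$tmp/a.txt" -)

echo "$literal"
echo "$fromfile"
rm -r "$tmp"
```

The second run prints the first field of each line from both inputs,
in the order the files were named.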
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next # skip remaining patterns on this input line
exit # skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr \&...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Pp
.Ds I
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
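.Pp
The two ways of setting the field separator can be compared side by side.
This is a sketch under assumptions: the colon-separated input line is
made up for the illustration, and the shell variable names are
hypothetical.

```sh
# Hypothetical colon-separated record.
input='root:x:0'

# Separator set inside the program, before any input is read.
by_begin=$(echo "$input" | awk 'BEGIN { FS = ":" } { print $1, $3 }')

# Separator set on the command line with -F.
by_flag=$(echo "$input" | awk -F: '{ print $1, $3 }')

echo "$by_begin"
echo "$by_flag"
```

Both runs should print the first and third fields separated by the
default output field separator, a blank.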
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Pp
.Ds I
    { s += $1 }
END { print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX. A much improved
and true to the book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980's.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number, add 0 to it;
to force it to be treated as a string, concatenate "" (an empty
string) to it.
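.Pp
The two idioms can be seen in a comparison of the fields 10 and 9.
This is a sketch that assumes a POSIX-conforming awk, which may differ in
detail from the version this page describes; the shell variable names are
hypothetical.

```sh
# Adding 0 forces a numeric comparison: 10 is greater than 9.
as_number=$(echo '10 9' |
    awk '{ if ($1 + 0 > $2 + 0) print "greater"; else print "not greater" }')

# Concatenating "" forces a string comparison: "10" sorts before "9".
as_string=$(echo '10 9' |
    awk '{ if (($1 "") > ($2 "")) print "greater"; else print "not greater" }')

echo "$as_number"
echo "$as_string"
```

The parentheses around each concatenation keep the comparison operator
from binding to only part of the expression.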