Lines Matching +refs:po +refs:find +refs:awk +refs:string
2 @c $NetBSD: awk.texi,v 1.1 2010/12/13 06:21:53 mrg Exp $
4 @setfilename awk.info
10 * Gawk: (awk). A text scanning and processing language.
14 * awk: (awk)Invoking gawk. Text scanning and processing.
242 This file documents @command{awk}, a program that you can use to select
256 @command{awk}. How to run an @command{awk}
261 * Printing:: How to print using @command{awk}. Describes
276 * Library Functions:: A Library of @command{awk} Functions.
277 * Sample Programs:: Many @command{awk} programs with complete
279 * Language History:: The evolution of the @command{awk}
295 @command{awk}.
296 * Names:: What name to use to find @command{awk}.
306 * One-shot:: Running a short throwaway @command{awk}
310 * Long:: Putting permanent @command{awk} programs in
312 * Executable Scripts:: Making self-contained @command{awk}
318 @command{awk} programs illustrated in this
326 * Other Features:: Other Features of @command{awk}.
387 * Scalar Constants:: Numeric and string constants.
429 @command{awk}.
433 * If Statement:: Conditionally execute some @command{awk}
448 * Exit Statement:: Stop execution of @command{awk}.
451 control @command{awk}.
452 * Auto-set:: Built-in variables where @command{awk}
465 @command{awk}.
469 @command{awk}.
476 * String Functions:: Functions for string manipulation, such as
485 * I18N Functions:: Functions for string translation.
500 * I18N Portability:: @command{awk}-level portability issues.
509 * Profiling:: Profiling your @command{awk} programs.
510 * Command Line:: How to run @command{awk}.
513 * AWKPATH Variable:: Searching directories for @command{awk}
523 * Assert Function:: A function for assertions in @command{awk}
530 * Join Function:: A function to join an array into a string.
553 * Miscellaneous Programs:: Some interesting @command{awk} programs.
565 * Igawk Program:: A wrapper for @command{awk} that includes
573 version of @command{awk}.
575 POSIX @command{awk}.
612 * Other Versions:: Other freely available @command{awk}
669 1988. AWK's simple programming paradigm---find a pattern in the
674 Alas, the @command{awk} on my computer was a limited version of the
676 had ``old @command{awk}'' and the AWK book described ``new @command{awk}.''
678 aside or relinquish its name. If a system had a new @command{awk}, it was
680 The best way to get a new @command{awk} was to @command{ftp} the source code for
682 new @command{awk} written by David Trueman and Arnold, and available under
686 it's no longer difficult to find a new @command{awk}. @command{gawk} ships with
692 and the Unix community in general, and desiring a new @command{awk}, I wrote
781 Such jobs are often easier with @command{awk}.
782 The @command{awk} utility interprets a special-purpose programming language
785 The GNU implementation of @command{awk} is called @command{gawk}; it is fully
787 @command{awk}. @command{gawk} is also compatible with the POSIX
788 specification of the @command{awk} language. This means that all
789 properly written @command{awk} programs should work with @command{gawk}.
791 @command{awk} implementations.
793 @cindex @command{awk}, POSIX and, See Also POSIX @command{awk}
794 @cindex @command{awk}, POSIX and
795 @cindex POSIX, @command{awk} and
796 @cindex @command{gawk}, @command{awk} and
797 @cindex @command{awk}, @command{gawk} and
798 @cindex @command{awk}, uses for
799 Using @command{awk} allows you to:
819 @cindex @command{awk}, See Also @command{gawk}
820 @cindex @command{gawk}, See Also @command{awk}
837 This @value{DOCUMENT} teaches you about the @command{awk} language and
845 @cindex GNU @command{awk}, See @command{gawk}
846 Implementations of the @command{awk} language are available for many
848 the @command{awk} language in general, also describes the particular
849 implementation of @command{awk} called @command{gawk} (which stands for
850 ``GNU awk''). @command{gawk} runs on a broad range of Unix systems,
858 @command{awk}.
859 * Names:: What name to use to find @command{awk}.
870 @unnumberedsec History of @command{awk} and @command{gawk}
891 @cindex @command{awk}, history of
892 The name @command{awk} comes from the initials of its designers: Alfred V.@:
894 @command{awk} was written in 1977 at AT&T Bell Laboratories.
902 The specification for @command{awk} in the POSIX Command Language
904 Both the @command{gawk} designers and the original Bell Laboratories @command{awk}
914 with the newer @command{awk}.
920 from @command{awk}, and with a little help from me, set about adding
934 @cindex @command{awk}, new vs. old
935 The @command{awk} language has evolved over the years. Full details are
938 is often referred to as ``new @command{awk}'' (@command{nawk}).
940 @cindex @command{awk}, versions of
942 versions of @command{awk}.
943 Some systems have an @command{awk} utility that implements the
944 original version of the @command{awk} language and a @command{nawk} utility
947 Others have an @command{oawk} version for the ``old @command{awk}''
948 language and plain @command{awk} for the new one. Still others only
950 use @command{gawk} for their @command{awk} implementation!}
955 @command{awk} you should run when writing your programs. The best advice
956 I can give here is to check your local documentation. Look for @command{awk},
959 have some version of new @command{awk} on your system, which is what
964 that should be available in any complete implementation of POSIX @command{awk},
965 we simply use the term @command{awk}. When referring to a feature that is
970 @cindex @command{awk}, terms describing
972 The term @command{awk} refers to a particular program as well as to the language you
974 the language ``the @command{awk} language,''
975 and the program ``the @command{awk} utility.''
977 both the @command{awk} language and how to run the @command{awk} utility.
978 The term @dfn{@command{awk} program} refers to a program written by you in
979 the @command{awk} programming language.
981 @cindex @command{gawk}, @command{awk} and
982 @cindex @command{awk}, @command{gawk} and
983 @cindex POSIX @command{awk}
984 Primarily, this @value{DOCUMENT} explains the features of @command{awk},
988 and other @command{awk} implementations.@footnote{All such differences
990 entry ``differences in @command{awk} and @command{gawk}.''}
992 the POSIX standard for @command{awk} are noted.
1009 Most of the time, the examples use complete @command{awk} programs.
1010 In some of the more advanced sections, only the part of the @command{awk}
1015 to @command{awk}, there is a lot of information here that even the @command{awk}
1016 expert should find useful. In particular, the description of POSIX
1017 @command{awk} and the example programs in
1023 provides the essentials you need to know to begin using @command{awk}.
1027 supported by POSIX @command{awk} and @command{gawk}.
1030 describes how @command{awk} reads your data.
1036 describes how @command{awk} programs can produce output with
1046 @command{awk} and @command{gawk} use.
1049 covers @command{awk}'s one-and-only data structure: associative arrays.
1054 describes the built-in functions @command{awk} and
1067 profile your @command{awk} programs.
1071 command-line options, and how it finds @command{awk}
1076 provide many sample @command{awk} programs.
1077 Reading them allows you to see @command{awk}
1081 describes how the @command{awk} language has evolved since
1090 available implementations of @command{awk}.
1108 If you find terms that you aren't familiar with, try looking them up here.
1174 many features of @command{awk} were either poorly documented or not
1219 To find out more about the FSF and the GNU Project online,
1248 of @command{gawk} for their versions of @command{awk}.)
1273 version of @command{awk}. After substantial revision, the first version of
1314 version of @command{awk}.
1339 If you find an error in this @value{DOCUMENT}, please report it!
1346 As the maintainer of GNU @command{awk},
1347 I am starting a collection of publicly available @command{awk}
1351 If you have written an interesting @command{awk} program, or have written a
1366 manual. The paper @cite{A Supplemental Document for @command{awk}} by John W.@:
1368 issues relevant both to @command{awk} implementation and to this manual, that
1502 @majorheading I@ @ @ @ The @command{awk} Language and @command{gawk}
1503 Part I describes the @command{awk} language and @command{gawk} program in detail.
1504 It starts with the basics, and continues through all of the features of @command{awk}
1549 @chapter Getting Started with @command{awk}
1553 @c @cindex basic function of @command{awk}
1554 @cindex @command{awk}, function of
1556 The basic function of @command{awk} is to search files for lines (or other
1558 of the patterns, @command{awk} performs specified actions on that line.
1559 @command{awk} keeps processing input lines in this way until it reaches
1562 @cindex @command{awk}, uses for
1565 @cindex @command{awk} programs
1566 Programs in @command{awk} are different from programs in most other languages,
1567 because @command{awk} programs are @dfn{data-driven}; that is, you describe
1568 the data you want to work with and then what to do when you find it.
1573 For this reason, @command{awk} programs are often refreshingly easy to
1578 When you run @command{awk}, you specify an @command{awk} @dfn{program} that
1579 tells @command{awk} what to do. The program consists of a series of
1588 Newlines usually separate rules. Therefore, an @command{awk}
1600 * Sample Data Files:: Sample data files for use in the @command{awk}
1608 * Other Features:: Other Features of @command{awk}.
1614 @section How to Run @command{awk} Programs
1616 @cindex @command{awk} programs, running
1617 There are several ways to run an @command{awk} program. If the program is
1618 short, it is easiest to include it in the command that runs @command{awk},
1622 awk '@var{program}' @var{input-file1} @var{input-file2} @dots{}
1630 awk -f @var{program-file} @var{input-file1} @var{input-file2} @dots{}
1637 * One-shot:: Running a short throwaway @command{awk}
1641 * Long:: Putting permanent @command{awk} programs in
1643 * Executable Scripts:: Making self-contained @command{awk} programs.
1650 @subsection One-Shot Throwaway @command{awk} Programs
1652 Once you are familiar with @command{awk}, you will often type in simple
1654 program as the first argument of the @command{awk} command, like this:
1657 awk '@var{program}' @var{input-file1} @var{input-file2} @dots{}
1667 to start @command{awk} and use the @var{program} to process records in the
1669 the shell won't interpret any @command{awk} characters as special shell
1671 a single argument for @command{awk}, and allow @var{program} to be more
1675 @cindex @command{awk} programs, running, from shell scripts
1676 This format is also useful for running short or medium-sized @command{awk}
1678 file for the @command{awk} program. A self-contained shell script is more
1693 awk '/foo/' @var{files} @dots{}
1706 @subsection Running @command{awk} Without Input Files
1710 @cindex input files, running @command{awk} without
1711 You can also run @command{awk} without any input files. If you type the
1715 awk '@var{program}'
1719 @command{awk} applies the @var{program} to the @dfn{standard input},
1726 @cindex input files, running @command{awk} without
1727 @cindex @command{awk} programs, running, without input files
1734 $ awk "BEGIN @{ print \"Don't Panic!\" @}"
1750 This next simple @command{awk} program
1755 $ awk '@{ print @}'
1770 @cindex @command{awk} programs, running
1771 @cindex @command{awk} programs, lengthy
1772 @cindex files, @command{awk} programs in
1773 Sometimes your @command{awk} programs can be very long. In this case, it is
1775 @command{awk} to use that file for its program, you type:
1778 awk -f @var{source-file} @var{input-file1} @var{input-file2} @dots{}
1784 The @option{-f} instructs the @command{awk} utility to get the @command{awk} program
1796 awk -f advice
1803 awk "BEGIN @{ print \"Don't Panic!\" @}"
1812 special characters. Notice that in @file{advice}, the @command{awk}
1814 for programs that are provided on the @command{awk} command line.
1820 If you want to identify your @command{awk} program files clearly as such,
1821 you can add the extension @file{.awk} to the @value{FN}. This doesn't
1822 affect the execution of the @command{awk} program but it does make
1826 @subsection Executable @command{awk} Programs
1827 @cindex @command{awk} programs
1830 @cindex Unix, @command{awk} scripts and
1834 Once you have learned @command{awk}, you may want to write self-contained
1835 @command{awk} scripts, using the @samp{#!} script mechanism. You can do
1843 #! /bin/awk -f
1851 at the shell and the system arranges to run @command{awk}@footnote{The
1856 in the list is the full @value{FN} of the @command{awk} program. The rest of the
1857 argument list contains either options to @command{awk}, or @value{DF}s,
1859 typed @samp{awk -f advice}:
1872 Self-contained @command{awk} scripts are useful when you want to write a
1874 written in @command{awk}.
1884 line after the path to @command{awk}. It does not work. The operating system
1885 treats the rest of the line as a single argument and passes it to @command{awk}.
1887 of some sort from @command{awk}.
1895 Some systems put @samp{awk} there, some put the full pathname
1896 of @command{awk} (such as @file{/bin/awk}), and some put the name
1901 @subsection Comments in @command{awk} Programs
1905 @cindex @command{awk} programs, documenting
1913 In the @command{awk} language, a comment starts with the sharp sign
1916 @command{awk} language ignores the rest of a line following a sharp sign.
1925 You can put comment lines into keyboard-composed throwaway @command{awk}
1940 prints a message about mismatched quotes, and if @command{awk} actually
1945 $ awk '@{ print "hello" @} # let's be cute'
1952 With Unix @command{awk}, closing the quoted string produces this result:
1955 $ awk '@{ print "hello" @} # let's be cute'
1957 @error{} awk: can't open file be
1971 For short to medium length @command{awk} programs, it is most convenient
1972 to enter the program on the @command{awk} command line.
1978 awk '@var{program text}' @var{input-file1} @var{input-file2} @dots{}
2032 $ awk "BEGIN @{ print \"Don't Panic!\" @}"
2044 be set to the null string, use:
2047 awk -F "" '@var{program}' @var{files} # correct
2055 awk -F"" '@var{program}' @var{files} # wrong!
2059 In the second case, @command{awk} will attempt to use the text of the program
2069 $ awk 'BEGIN @{ print "Here is a single quote <'"'"'>" @}'
2080 $ awk 'BEGIN @{ print "Here is a single quote <'\''>" @}'
2087 Another option is to use double quotes, escaping the embedded, @command{awk}-level
2091 $ awk "BEGIN @{ print \"Here is a single quote <'>\" @}"
2099 are very common in @command{awk} programs.
2101 If you really need both single and double quotes in your @command{awk}
2183 file into a file for use with @command{awk}
2192 for an @command{awk} program that extracts these @value{DF}s from
2199 The following command runs a simple @command{awk} program that searches the
2200 input file @file{BBS-list} for the character string @samp{foo} (a
2201 grouping of characters is usually called a @dfn{string};
2202 the term @dfn{string} is based on similar usage in English, such
2203 as ``a string of pearls,'' or ``a string of cars in a train''):
2206 awk '/foo/ @{ print $0 @}' BBS-list
2215 You will notice that slashes (@samp{/}) surround the string @samp{foo}
2216 in the @command{awk} program. The slashes indicate that @samp{foo}
2222 single quotes around the @command{awk} program so that the shell won't
2228 $ awk '/foo/ @{ print $0 @}' BBS-list
2237 In an @command{awk} rule, either the pattern or the action can be omitted,
2249 @cindex @command{awk} programs, one-line examples
2250 Many practical @command{awk} programs are just a line or two. Following is a
2254 read the rest of the @value{DOCUMENT} to become an @command{awk} expert!)
2259 one way to do things in @command{awk}. At some point, you may want
2268 awk '@{ if (length($0) > max) max = length($0) @}
2276 awk 'length($0) > 80' data
2287 expand data | awk '@{ if (x < length()) x = length() @}
2298 awk 'NF > 0' data
2309 awk 'BEGIN @{ for (i = 1; i <= 7; i++)
2317 ls -l @var{files} | awk '@{ x += $5 @}
2326 ls -l @var{files} | awk '@{ x += $5 @}
2334 awk -F: '@{ print $1 @}' /etc/passwd | sort
2341 awk 'END @{ print NR @}' data
2348 awk 'NR % 2 == 0' data
2357 @cindex @command{awk} programs
2359 The @command{awk} utility reads the input files one line at a
2360 time. For each line, @command{awk} tries the patterns of each of the rules.
2362 which they appear in the @command{awk} program. If no patterns match, then
2366 @command{awk} reads the next line. (However,
2370 For example, the following @command{awk} program contains two rules:
2378 The first rule has the string @samp{12} as the
2380 string @samp{21} as the pattern and also has @samp{print $0} as the
2383 This program prints every line that contains the string
2384 @samp{12} @emph{or} the string @samp{21}. If a line contains both
2391 $ awk '/12/ @{ print $0 @}
2416 what typical @command{awk}
2417 programs do. This example shows how @command{awk} can be used to
2423 ls -l | awk '$6 == "Nov" @{ sum += $5 @}
2447 -rw-r--r-- 1 arnold user 10809 Nov 7 13:03 awk.h
2448 -rw-r--r-- 1 arnold user 983 Apr 13 12:14 awk.tab.h
2449 -rw-r--r-- 1 arnold user 31869 Jun 15 12:20 awk.y
2469 The @samp{$6 == "Nov"} in our @command{awk} program is an expression that
2471 matches the string @samp{Nov}. Each time a line has the string
2474 @code{sum}. As a result, when @command{awk} has finished reading all the
2476 lines matched the pattern. (This works because @command{awk} variables
2483 These more advanced @command{awk} techniques are covered in later sections
2485 advanced @command{awk} programming, you have to know how @command{awk} interprets
2491 @section @command{awk} Statements Versus Lines
2495 Most often, each line in an @command{awk} program is a separate statement or
2499 awk '/12/ @{ print $0 @}
2527 in the middle of a string or regular expression. For example:
2530 awk '/This regular expression is too long, so continue it\
2542 most useful when your @command{awk} program is in a separate source file
2544 many @command{awk} implementations are more particular about where you
2546 split a string constant using backslash continuation. Thus, for maximum
2547 portability of your @command{awk} programs, it is best not to split your
2548 lines in the middle of a regular expression or a string.
2549 @c 10/2000: gawk, mawk, and current bell labs awk allow it,
2550 @c solaris 2.7 nawk does not. Solaris /usr/xpg4/bin/awk does though! sigh.
2556 with the C shell.} It works for @command{awk} programs in files and
2561 in your awk program must be escaped with a backslash. To illustrate:
2564 % awk 'BEGIN @{ \
2578 $ awk 'BEGIN @{
2585 @command{awk} is a line-oriented language. Each rule's action has to
2594 comments do not mix. As soon as @command{awk} sees the @samp{#} that
2615 When @command{awk} statements within one rule are short, you might want to put
2628 separated with a semicolon was not in the original @command{awk}
2633 @section Other Features of @command{awk}
2636 The @command{awk} language provides a number of predefined, or
2638 from @command{awk}. There are other variables your program can set
2639 as well to control how @command{awk} processes your data.
2641 In addition, @command{awk} provides a number of built-in functions for doing
2642 common computational and string-related operations.
2644 performing bit manipulation, and for runtime string translation.
2646 As we develop our presentation of the @command{awk} language, we introduce
2652 @section When to Use @command{awk}
2654 @cindex @command{awk}, uses for
2655 Now that you've seen some of what @command{awk} can do,
2656 you might wonder how @command{awk} could be useful for you. By using
2659 complex output. The @command{awk} language is very useful for producing
2664 Programs written with @command{awk} are usually much smaller than they would
2665 be in other languages. This makes @command{awk} programs easy to compose and
2666 use. Often, @command{awk} programs can be quickly composed at your terminal,
2667 used once, and thrown away. Because @command{awk} programs are interpreted, you
2671 Complex programs have been written in @command{awk}, including a complete
2674 computer. However, @command{awk}'s capabilities are strained by tasks of
2677 @cindex @command{awk} programs, complex
2678 If you find yourself writing @command{awk} scripts of more than, say, a few
2680 language. Emacs Lisp is a good choice if you need sophisticated string
2681 or pattern matching capabilities. The shell is also good at string and
2686 of source code than the equivalent @command{awk} programs, but they are
2697 Because regular expressions are such a fundamental part of @command{awk}
2703 is an @command{awk} pattern that matches every input record whose text
2706 both. Such a regexp matches any string that contains that sequence.
2707 Thus, the regexp @samp{foo} matches any string containing @samp{foo}.
2738 following prints the second field of each record that contains the string
2742 $ awk '/foo/ @{ print $2 @}' BBS-list
2750 @cindex operators, string-matching
2752 @cindex string-matching operators
2765 expressions allow you to specify the string to match against; it need
2778 is true if the expression @var{exp} (taken as a string)
2784 $ awk '$1 ~ /J/' inventory-shipped
2794 awk '@{ if ($1 ~ /J/) print @}' inventory-shipped
2798 (taken as a character string)
2810 $ awk '$1 !~ /J/' inventory-shipped
2822 @code{"foo"} is a string constant.
2830 Some characters cannot be included literally in string constants
2835 a string constant. Because a plain double quote ends the string, you
2837 part of the string. For example:
2840 $ awk 'BEGIN @{ print "He said \"hi!\" to her." @}'
2846 string or regexp. Thus, the string whose contents are the two characters
2851 unprintable characters directly in a string constant or regexp constant,
2855 all the escape sequences used in @command{awk} and
2857 sequences apply to both string constants and regexp constants:
2863 @c @cindex @command{awk} language, V.4 version
2895 @c @cindex @command{awk} language, V.4 version
2908 @c @cindex @command{awk} language, V.4 version
2909 @c @cindex @command{awk} language, POSIX version
2919 POSIX @command{awk}.)
2928 in order to tell @command{awk} to keep processing the rest of the regexp.
2933 A literal double quote (necessary for string constants only).
2934 This expression is used when you want to write a string
2935 constant that contains a double quote. Because the string is delimited by
2936 double quotes, you need to escape the quote that is part of the string,
2937 in order to tell @command{awk} to keep processing the rest of the string.
2962 for both string constants and regexp constants. This happens very early,
2963 as soon as @command{awk} reads your program.
2979 @cindex POSIX @command{awk}, backslashes in string constants
2984 If you place a backslash in a string constant before something that is
2985 not one of the characters previously listed, POSIX @command{awk} purposely
2992 This is what Unix @command{awk} and @command{gawk} both do.
2998 two backslashes in the string @samp{FS = @w{"[ \t]+\\|[ \t]+"}}.)
3002 @cindex Unix @command{awk}, backslashes in escape sequences
3004 Some other @command{awk} implementations do this.
3016 Does @command{awk} treat the character as a literal character or as a regexp
3063 This matches the beginning of a string. For example, @samp{^@@chapter}
3064 matches @samp{@@chapter} at the beginning of a string and can be used
3067 match only at the beginning of the string.
3070 a line embedded in a string.
3080 This is similar to @samp{^}, but it matches only at the end of a string.
3083 and does not match the end of a line embedded in a string.
3095 matches any single character followed by a @samp{P} in a string. Using
3101 @cindex POSIX @command{awk}, period (@code{.}), using
3105 Otherwise, @sc{nul} is just another character. Other versions of @command{awk}
3119 the characters @samp{M}, @samp{V}, or @samp{X} in a string. A full
3128 @emph{except} those in the square brackets. For example, @samp{[^awk]}
3140 matches any string that matches either @samp{^P} or @samp{[[:digit:]]}. This
3141 means it matches any string that starts with @samp{P} or contains a digit.
3160 repeated as many times as necessary to find a match. For example, @samp{ph*}
3168 @samp{awk '/\(c[ad][ad]*r x\)/ @{ print @}' sample}
3169 prints every record in @file{sample} containing a string of the form
3185 awk '/\(c[ad]+r x\)/ @{ print @}' sample
3218 @cindex POSIX @command{awk}, interval expressions in
3219 Interval expressions were not traditionally available in @command{awk}.
3220 They were added as part of the POSIX standard to make @command{awk}
3233 any version of @command{awk}.@footnote{Use two backslashes if you're
3234 using a string constant with a regexp operator or function.}
3245 @cindex POSIX @command{awk}, regular expressions and
3247 In POSIX @command{awk} and @command{gawk}, the @samp{*}, @samp{+}, and @samp{?} operators
3250 @command{awk} treat such a usage as a syntax error.
3293 @cindex POSIX @command{awk}, character lists and
3298 is compatible with other @command{awk}
3300 The regular expressions in @command{awk} are a superset
3306 @cindex POSIX @command{awk}, character lists and, character classes
3425 @cindex POSIX @command{awk}, character lists and, character classes
3447 they are not available in other @command{awk} implementations.
3473 Matches the empty string at the beginning of a word.
3481 Matches the empty string at the end of a word.
3490 Matches the empty string at either the beginning or the
3498 Matches the empty string that occurs between two
3506 @cindex operators, string-matching, for buffers
3510 string to match as the buffer.
3518 Matches the empty string at the
3519 beginning of a buffer (string).
3525 Matches the empty string at the
3526 end of a buffer (string).
3535 for @command{awk}. They are provided for compatibility with other
3542 that conflicts with the @command{awk} language's definition of @samp{\b}
3578 Traditional Unix @command{awk} regexps are matched. The GNU operators
3612 @code{tolower} or @code{toupper} built-in string functions (which we
3623 This works in any POSIX-compliant @command{awk}.
3627 @cindex differences in @command{awk} and @command{gawk}, regular expressions
3636 When @code{IGNORECASE} is not zero, @emph{all} regexp and string
3671 affected regexp operations only. It did not affect string comparison
3673 Beginning with @value{PVERSION} 3.0, both regexp and string comparison
3699 echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
3709 @command{awk} (and POSIX) regular expressions always match
3715 $ echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
3744 regexp constant (i.e., a string of characters between slashes). It may
3745 be any expression. The expression is evaluated and converted to a string
3746 if necessary; the contents of the string are used as the
3763 enclosed in slashes and a string constant enclosed in double quotes.
3764 If you are going to use a string constant, you have to understand that
3765 the string is, in essence, scanned @emph{twice}: the first time when
3766 @command{awk} reads your program, and the second time when it goes to
3767 match the string on the lefthand side of the operator with the pattern
3768 on the right. This is true of any string-valued expression (such as
3769 @code{digits_regexp}, shown previously), not just string constants.
3776 What difference does it make if the string is
3779 string, you have to type two backslashes.
3782 Only one backslash is needed. To do the same thing with a string,
3784 second one so that the string actually contains the
3787 @cindex troubleshooting, regexp constants vs. string constants
3788 @cindex regexp constants, vs. string constants
3789 @cindex string constants, vs. regexp constants
3790 Given that you can use both regexp and string constants to describe
3802 It is more efficient to use regexp constants. @command{awk} can note
3804 makes pattern matching more efficient. When using a string constant,
3805 @command{awk} must first convert the string into this internal form and
3818 Some commercial versions of @command{awk} do not allow the newline
3822 $ awk '$0 ~ "[ \t\n]"'
3823 @error{} awk: newline in character class [
3834 $ awk '$0 ~ /[ \t\n]/'
3895 to make several function calls, @emph{per input character} to find the record
3905 In the typical @command{awk} program, all input is read either from the
3907 command) or from files whose names you specify on the @command{awk}
3908 command line. If you specify input files, @command{awk} reads them
3926 used with it do not have to be named on the @command{awk} command line
3950 The @command{awk} utility divides the input for your @command{awk}
3952 @command{awk} keeps track of the number of records that have
3972 the value of @code{RS} can be changed in the @command{awk} program
3976 which indicate a string constant. Often the right time to do this is
3985 awk 'BEGIN @{ RS = "/" @}
3991 This is a string whose first character is a slash; as a result, records
3993 rule in the @command{awk} program (the action with no pattern) prints each
3995 its output, this @command{awk} program copies the input
4000 $ awk 'BEGIN @{ RS = "/" @}
4046 @command{awk} when it printed the record!
4055 awk '@{ print $0 @}' RS="/" BBS-list
4066 $ echo | awk 'BEGIN @{ RS = "a" @} ; @{ print NF @}'
4080 The empty string @code{""} (a string without any characters)
4086 If you change the value of @code{RS} in the middle of an @command{awk} run,
4094 @cindex differences in @command{awk} and @command{gawk}, record separators
4103 string. It can be any regular expression
4106 ends at the next string that matches the regular expression; the next
4107 record starts at the end of the matching string. This general rule is
4109 newline: a record ends at the beginning of the next matching string (the
4111 the end of this string (at the first character of the following line).
4149 @cindex differences in @command{awk} and @command{gawk}, @code{RS}/@code{RT} variables
4177 @cindex differences in @command{awk} and @command{gawk}, strings, storing
4181 to other @command{awk} implementations.
4184 All other @command{awk} implementations@footnote{At least that we know
4186 @sc{nul} character as the string terminator. In effect, this means that
4206 @cindex POSIX @command{awk}, field separators and
4209 When @command{awk} reads an input record, the record is
4213 Whitespace in @command{awk} means any string of one or more spaces,
4214 tabs, or newlines;@footnote{In POSIX @command{awk}, newlines are not
4218 whitespace by @command{awk}.
4223 simple @command{awk} programs so powerful.
4232 to refer to a field in an @command{awk} program,
4253 in the current record. @command{awk} automatically updates the value
4259 the empty string. (If used in a numeric operation, you get zero.)
4267 $ awk '$1 ~ /foo/ @{ print $0 @}' BBS-list
4276 field contains the string @samp{foo}. The operator @samp{~} is called a
4279 it tests whether a string (here, the field @code{$1}) matches a given regular
4287 $ awk '/foo/ @{ print $1, $NF @}' BBS-list
4301 the @command{awk} language can be used after a @samp{$} to refer to a
4303 value is a string, rather than a number, it is converted to a number.
4307 awk '@{ print $NR @}'
4319 awk '@{ print $(2*2) @}' BBS-list
4322 @command{awk} evaluates the expression @samp{(2*2)} and uses
4329 @file{BBS-list}. (All of the @command{awk} operators are listed, in
4338 notices this and terminates your program. Other @command{awk}
4342 @command{awk} stores the current record's number of fields in the built-in
4352 The contents of a field, as seen by @command{awk}, can be changed within an
4353 @command{awk} program; this changes what @command{awk} perceives as the
4354 current input record. (The actual input is untouched; @command{awk} @emph{never}
4359 $ awk '@{ nboxes = $3 ; $3 = $3 - 10
4378 as a number; the string of characters must be converted to a number
4380 from the subtraction is converted back to a string of characters that
4384 When the value of a field is changed (as perceived by @command{awk}), the
4392 $ awk '@{ $2 = $2 - 10; print $0 @}' inventory-shipped
4403 $ awk '@{ $6 = ($5 + $4 + $3 + $2)
4419 Creating a new field changes @command{awk}'s internal copy of the current
4438 Referencing an out-of-range field only produces an empty string. For
4451 for more information about @command{awk}'s @code{if-else} statements.
4458 even when you assign the empty string to a field. For example:
4461 $ echo a b c d | awk '@{ OFS = ":"; $2 = ""
4473 $ echo a b c d | awk '@{ OFS = ":"; $2 = ""; $6 = "new"
4493 $ echo a b c d e f | awk '@{ print "NF =", NF;
4501 @strong{Caution:} Some versions of @command{awk} don't
4505 @command{awk} to rebuild the entire record, using the current
4515 This forces @command{awk} to rebuild the record. It does help
4543 expression, controls the way @command{awk} splits an input record into fields.
4544 @command{awk} scans the input record for character sequences that
4560 @cindex troubleshooting, @command{awk} uses @code{FS} not @code{IFS}
4562 Shell programmers take note: @command{awk} does @emph{not} use the
4567 The value of @code{FS} can be changed in the @command{awk} program with the
4574 For example, here we set the value of @code{FS} to the string
4578 awk 'BEGIN @{ FS = "," @} ; @{ print $2 @}'
4590 this @command{awk} program extracts and prints the string
4612 can massage it first with a separate @command{awk} program.)
4619 is a string containing a single space, @w{@code{" "}}. If @command{awk}
4643 More generally, the value of @code{FS} may be a string containing any
4673 @command{awk} first strips leading and trailing whitespace from
4678 $ echo ' a b c d ' | awk '@{ print $2 @}'
4687 $ echo ' a b c d ' | awk 'BEGIN @{ FS = "[ \t\n]+" @}
4702 $ echo ' a b c d' | awk '@{ print; $2 = $2; print @}'
4720 @cindex differences in @command{awk} and @command{gawk}, single-character fields
4725 simply assigning the null string (@code{""}) to @code{FS}. In this case,
4740 @cindex dark corner, @code{FS} as null string
4741 @cindex FS variable, as null string
4743 In this case, most versions of Unix @command{awk} simply treat the entire record
4748 if @code{FS} is the null string, then @command{gawk} also
4765 awk -F, '@var{program}' @var{input-files}
4772 containing an @command{awk} program. Case is significant in command-line
4776 @emph{and} get an @command{awk} program from a file.
4786 awk -F\\\\ '@dots{}' files @dots{}
4792 Because @samp{\} is used for quoting in the shell, @command{awk} sees
4793 @samp{-F\\}. Then @command{awk} processes the @samp{\\} for escape
4802 shell, without any quotes, the @samp{\} gets deleted, so @command{awk}
4807 For example, let's use an @command{awk} program file called @file{baud.awk}
4821 $ awk -F- -f baud.awk BBS-list
4849 @cindex Unix @command{awk}, password files, field separators and
4867 awk -F: '$2 == ""' /etc/passwd
4873 It is important to remember that when you assign a string constant
4874 as the value of @code{FS}, it undergoes normal @command{awk} string
4875 processing. For example, with Unix @command{awk} and @command{gawk},
4876 the assignment @samp{FS = "\.."} assigns the character string @code{".."}
4910 @cindex POSIX @command{awk}, field separators and
4912 According to the POSIX standard, @command{awk} is supposed to behave
4921 However, many implementations of @command{awk} do not work this way. Instead,
4934 sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
4945 on an incorrect implementation of @command{awk}, while @command{gawk}
4982 feature of @command{gawk}. If you are a novice @command{awk} user,
4987 (This @value{SECTION} discusses an advanced feature of @command{awk}.
4988 If you are a novice @command{awk} user, you might want to skip it on
5003 spaces}. Clearly, @command{awk}'s normal field splitting based on @code{FS}
5004 does not work well in this case. Although a portable @command{awk} program
5014 assigning a string containing space-separated numbers to the built-in
5043 This program uses a number of @command{awk} features that
5082 vote on some issue, any column on the card may be empty. An @command{awk}
5129 One technique is to use an unusual character or string to separate
5131 @samp{\f} in @command{awk}, as in C) to separate them, making each record
5133 @code{"\f"} (a string containing the formfeed character). Any
5139 dispensation, an empty string as the value of @code{RS} indicates that
5141 to the empty string, each record always ends at the first blank line
5151 string @code{"\n\n+"} to @code{RS}. This regexp matches the newline
5173 string, @emph{and} @code{FS} is set to a single character,
5176 @code{FS}.@footnote{When @code{FS} is the null string (@code{""})
5196 variable @code{FS} to the string @code{"\n"}. (This single
5217 # addrs.awk --- simple mailing list program
5234 $ awk -f addrs.awk addresses
5295 So far we have been getting our input data from @command{awk}'s main
5298 files specified on the command line. The @command{awk} language has a
5307 rest of this @value{DOCUMENT} and have a good knowledge of how @command{awk} works.
5310 @cindex differences in @command{awk} and @command{gawk}, @code{getline} command
5316 @code{ERRNO} to a string describing the error that occurred.
5318 In the following examples, @var{command} stands for a string value that
5370 This @command{awk} program deletes all C-style comments (@samp{/* @dots{}
5399 @command{awk}'s input into the variable @var{var}. No other processing is
5401 For example, suppose the next line is a comment or a special string,
5405 read-a-line-and-check-each-rule loop of @command{awk} never sees it.
5452 Here @var{file} is a string-valued expression that
5475 @cindex POSIX @command{awk}, @code{<} operator and
5482 to be portable to other @command{awk} implementations.
5492 is a string-valued expression that specifies the file from which to read.
5540 this case, the string @var{command} is run as a shell command and its output
5541 is piped into @command{awk} to be used as input. This form of @code{getline}
5604 @cindex POSIX @command{awk}, @code{|} I/O operator and
5611 to be portable to other @command{awk} implementations.
5643 program to be portable to other @command{awk} implementations.
5654 @cindex differences in @command{awk} and @command{gawk}, input/output operators
5658 sends data @emph{to} your @command{awk} program.
5716 @command{awk} does @emph{not} automatically jump to the start of the
5720 @cindex differences in @command{awk} and @command{gawk}, implementation limitations
5722 @cindex @command{awk}, implementations, limits
5725 Many @command{awk} implementations limit the number of pipelines that an @command{awk}
5740 causes @command{awk} to set the value of @code{FILENAME}. Normally,
5751 confusion. @command{awk} opens a separate input stream from the
5848 current record (such as @code{$1}), variables, or any @command{awk}
5856 line, use @samp{print ""}, where @code{""} is the empty string.
5857 To print a fixed piece of text, use a string constant, such as
5859 double-quote characters, your text is taken as an @command{awk}
5867 isn't limited to only one line. If an item value is a string that contains a
5868 newline, the newline is output along with the rest of the string. A
5872 The following is an example of printing a string that contains embedded newlines
5877 $ awk 'BEGIN @{ print "line one\nline two\nline three" @}'
5889 $ awk '@{ print $1, $2 @}' inventory-shipped
5902 juxtaposing two string expressions in @command{awk} means to concatenate
5906 $ awk '@{ print $1 $2 @}' inventory-shipped
5924 awk 'BEGIN @{ print "Month Crates"
5948 awk 'BEGIN @{ print "Month Crates"
5978 a single space is only the default. Any string of
5981 is the string @w{@code{" "}}---that is, a single space.
5985 record, and then outputs a string called the @dfn{output record separator}
5987 value of @code{ORS} is the string @code{"\n"}; i.e., a newline
6010 awk 'BEGIN @{ print "Month Crates"
6018 $ awk 'BEGIN @{ OFS = ";"; ORS = "\n\n" @}
6037 @command{awk} internally converts the number to a string of characters
6038 and prints that string. @command{awk} uses the @code{sprintf} function
6054 number to a string for printing.
6061 $ awk 'BEGIN @{
6069 @cindex POSIX @command{awk}, @code{OFMT} variable and
6070 @cindex @code{OFMT} variable, POSIX @command{awk} and
6071 According to the POSIX standard, @command{awk}'s behavior is undefined
6088 after the decimal point). This is done by supplying a string, called
6089 the @dfn{format string}, that controls how and where to print the other
6117 argument. This is an expression whose value is taken as a string; it
6119 @dfn{format string}.
6121 The format string is very similar to that in the ISO C library function
6128 to its output. It outputs only what the format string specifies.
6129 So if a newline is needed, you must include one in the format string.
6134 $ awk 'BEGIN @{
6161 65} outputs the letter @samp{A}. (The output for a string value is
6162 the first character of the string.)
6206 This prints a string.
6210 (This format is of marginal use, because all numbers in @command{awk}
6231 warns about this. Other versions of @command{awk} may print invalid
6251 @cindex differences in @command{awk} and @command{gawk}, @code{print}/@code{printf} statements
6258 given in the format string. With a positional specifier, the format
6363 Maximum number of characters from the string that should print.
6379 string, they are passed in the argument list. For example:
6398 Earlier versions of @command{awk} did not support this capability.
6400 concatenation to build up the format string, like so:
6414 @cindex POSIX @command{awk}, @code{printf} format strings and
6417 modifiers in @code{printf} format strings. These are not valid in @command{awk}.
6418 Most @command{awk} implementations silently ignore these modifiers.
6432 awk '@{ printf "%-10s %s\n", $1, $2 @}' BBS-list
6438 @file{BBS-list} as a string of 10 characters that are left-justified. It also
6444 $ awk '@{ printf "%-10s %s\n", $1, $2 @}' BBS-list
6471 the @command{awk} program:
6474 awk 'BEGIN @{ print "Name Number"
6484 awk 'BEGIN @{ printf "%-10s %s\n", "Name", "Number"
6498 awk 'BEGIN @{ format = "%-10s %s\n"
6524 Redirections in @command{awk} are written just like redirections in shell
6525 commands, except that they are written inside the @command{awk} program.
6542 expression. Its value is changed to a string and then used as a
6550 is how an @command{awk} program can write a list of BBS names to one
6555 $ awk '@{ print $2 > "phone-list"
6576 @var{output-file} are not erased. Instead, the @command{awk} output is
6589 The redirection argument @var{command} is actually an @command{awk}
6590 expression. Its value is converted to a string whose contents give
6603 awk '@{ print $1 > "names.unsorted"
6613 in an @command{awk} script run periodically for system maintenance:
6623 The message is built using string concatenation and saved in the variable
6635 use a string constant. Using a variable is generally a good idea,
6636 because @command{awk} requires that the string value be spelled identically
6642 @cindex differences in @command{awk} and @command{gawk}, input/output operators
6649 but subsidiary to, the @command{awk} program.
6652 POSIX @command{awk}.
6676 @command{awk}, it isn't necessary. In this kind of case, a program should
6680 @cindex differences in @command{awk} and @command{gawk}, implementation limitations
6683 @cindex @command{awk}, implementation issues, pipes
6693 @command{awk} implementations limit the number of pipelines that an @command{awk}
6716 The @code{tolower} function returns its argument string with all
6761 @cindex differences in @command{awk} and @command{gawk}, error messages
6763 In other implementations of @command{awk}, the only way to write an error
6764 message to standard error in an @command{awk} program is as follows:
6772 standard error stream that it inherits from the @command{awk} process.
6774 separate process. So people writing @command{awk} programs often
6786 @command{awk} is run from a background job, it may not have a terminal at all.
6815 be opened by the program initiating the @command{awk} execution (typically
6832 Like any other redirection, the value must be a string.
6895 as well as for I/O redirections within an @command{awk} program.
6914 Starting with @value{PVERSION} 3.1 of @command{gawk}, @command{awk} programs
7000 more than once during the execution of an @command{awk} program
7008 command associated with it is remembered by @command{awk}, and subsequent
7010 The file or pipe stays open until @command{awk} exits.
7030 value must @emph{exactly} match the string that was used to open the file or
7061 This helps avoid hard-to-find typographical errors in your @command{awk}
7066 To write a file and read it back later on in the same @command{awk}
7071 To write numerous files, successively, in the same @command{awk}
7072 program. If the files aren't closed, eventually @command{awk} may exceed a
7095 @cindex differences in @command{awk} and @command{gawk}, @code{close} function
7118 Without the call to @code{close} indicated in the comment, @command{awk}
7130 @command{awk} exits.
7139 of a file that was never opened, so @command{awk} silently
7151 The second argument should be a string, with either of the values
7165 @cindex differences in @command{awk} and @command{gawk}, @code{close} function
7166 @cindex Unix @command{awk}, @code{close} function and
7168 In many versions of Unix @command{awk}, the @code{close} function
7184 @code{ERRNO} to a string describing the problem.
7215 @cindex POSIX @command{awk}, pipes, closing
7245 Expressions are the basic building blocks of @command{awk} patterns
7253 operate. As in other languages, expressions in @command{awk} include
7288 string, and regular expression.
7295 * Scalar Constants:: Numeric and string constants.
7319 @cindex string constants
7320 A string constant consists of a sequence of characters enclosed in
7328 @cindex differences in @command{awk} and @command{gawk}, strings
7330 represents the string whose contents are @samp{parrot}. Strings in
7333 Other @command{awk}
7343 In @command{awk}, all numbers are in decimal; i.e., base 10. Many other
7448 @command{awk} programs are constant, but the @samp{~} and @samp{!~}
7518 @cindex differences in @command{awk} and @command{gawk}, regexp constants
7527 Modern implementations of @command{awk}, including @command{gawk}, allow
7574 on the @command{awk} command line.
7606 @command{awk}. All built-in variables' names are entirely uppercase.
7608 Variables in @command{awk} can be assigned either numeric or string values.
7610 By default, variables are initialized to the empty string, which
7612 ``initialize'' each variable explicitly in @command{awk},
7621 Any @command{awk} variable can be set by including a @dfn{variable assignment}
7622 among the arguments on the command line when @command{awk} is invoked
7634 @command{awk} run or in between input files.
7653 awk '@{ print $n @}' n=4 inventory-shipped n=2 BBS-list
7665 $ awk '@{ print $n @}' n=4 inventory-shipped n=2 BBS-list
7676 the @command{awk} program in the @code{ARGV} array
7678 @command{awk} processes the values of command-line assignments for escape
7691 of the @command{awk} program demands it. For example, if the value of
7693 happens to be a string, it is converted to a number before the addition
7694 is performed. If numeric values appear in string concatenation, they
7705 concatenated together. The resulting string is converted back to the
7711 string, concatenate the empty string, @code{""}, with that number.
7712 To force a string to be converted to a number, add zero to that string.
7713 A string is converted to a number by interpreting any numeric prefix
7714 of the string as numerals:
7721 by the @command{awk} built-in variable @code{CONVFMT} (@pxref{Built-in Variables}).
7737 Strange results can occur if you set @code{CONVFMT} to a string that doesn't
7739 For example, if you forget the @samp{%} in the format, @command{awk} converts
7740 all numbers to the same constant string.
7742 it to a string is @emph{always} an integer, no matter what the value of
7755 @cindex POSIX @command{awk}, @code{OFMT} variable and
7757 @cindex portability, new @command{awk} vs. old @command{awk}
7758 @cindex @command{awk}, new vs. old, @code{OFMT} variable
7759 Prior to the POSIX standard, @command{awk} used the value
7765 of cases, old @command{awk} programs do not change their behavior.
7767 port your new style program to older implementations of @command{awk}.
7777 characters. The locale also affects numeric formats. In particular, for @command{awk}
7783 The POSIX standard says that @command{awk} always uses the period as the decimal
7784 point when reading the @command{awk} program source code, and for command-line
7787 and for number-to-string conversion, the local decimal point character is used.
7821 The @command{awk} language uses the common arithmetic operators when
7840 $ awk '@{ sum = $2 + $3 + $4 ; avg = sum / 3
7847 The following list provides the arithmetic operators in @command{awk}, in order from
7857 @cindex POSIX @command{awk}, arithmetic operators and
7870 Division; because all numbers in @command{awk} are floating-point
7873 to forget that @emph{all} numbers in @command{awk} are floating-point,
7892 @cindex differences in @command{awk} and @command{gawk}, trunc-mod operation
7911 In other @command{awk} implementations, the signedness of the remainder
7931 @cindex string operators
7932 @cindex operators, string
7934 There is only one string operation: concatenation. It does not have a
7939 $ awk '@{ print "Field number one: " $1 @}' BBS-list
7945 Without the space in the string constant after the @samp{:}, the line
7949 $ awk '@{ print "Field number one:" $1 @}' BBS-list
7955 @cindex troubleshooting, string concatenation
7956 Because string concatenation does not have an explicit operator, it is
7980 Be careful about the kinds of expressions used in string concatenation.
7982 is undefined in the @command{awk} language. Consider this example:
7996 @c see test/nasty.awk for a worse example
8009 > prompt> cat bad.awk
8012 > prompt> gawk -f bad.awk
8024 $ awk 'BEGIN @{ print -12 " " -24 @}'
8031 @command{awk}'s automatic conversion rules. To get the desired result,
8035 $ awk 'BEGIN @{ print -12 " " (-24) @}'
8039 This forces @command{awk} to treat the @samp{-} on the @samp{-24} as unary.
8074 Assignments can also store string values. For example, the
8086 This also illustrates string concatenation.
8115 @code{foo} has a numeric value at first, and a string value later on:
8125 When the second assignment gives @code{foo} a string value, the fact that
8132 foo = "a string"
8137 @strong{Note:} Using a variable as a number and then later as a string
8139 illustrate how @command{awk} works, @emph{not} how you should write your
8247 @cindex @command{awk} language, POSIX version
8248 @cindex POSIX @command{awk}
8279 @cindex @command{awk} language, POSIX version
8280 @cindex POSIX @command{awk}
8285 @cindex POSIX @command{awk}, @code{**=} operator and
8309 This is most notable in commercial @command{awk} versions.
8313 $ awk /==/ /dev/null
8314 @error{} awk: syntax error at source line 1
8317 @error{} awk: bailing out at source line 1
8324 awk '/[=]=/' /dev/null
8344 the increment operators add no power to the @command{awk} language; however, they
8367 @command{awk} are floating-point---in floating-point, @samp{foo + 1 - 1} does
8445 In other words, it is up to the particular version of @command{awk}.
8459 @section True and False in @command{awk}
8470 However, @command{awk} is different.
8472 false from C. In @command{awk}, any nonzero numeric value @emph{or} any
8473 nonempty string value is true. Any other value (zero or the null
8474 string @code{""}) is false. The following program prints @samp{A strange
8490 the string constant @code{"0"} is actually true, because it is non-null.
8513 Unlike variables in many other programming languages, @command{awk} variables do not have a
8514 fixed type. Instead, they can be either a number or a string, depending
8519 @cindex POSIX @command{awk}, numeric strings and
8521 the concept of a @dfn{numeric string}, which is simply a string that looks
8534 A string constant or the result of a string operation has the @var{string}
8541 have the @var{strnum} attribute. Otherwise, they have the @var{string}
8549 @c value such that it has both a numeric and string value, this leaves the
8555 @code{a} has numeric type, even though it is later used in a string
8566 When two operands are compared, either string comparison or numeric comparison
8608 STRING &&string &string &string\cr
8609 NUMERIC &&string &numeric &numeric\cr
8610 STRNUM &&string &numeric &numeric\cr
8619 STRING | string string string
8621 NUMERIC | string numeric numeric
8623 STRNUM | string numeric numeric
8630 made of characters and is therefore also a string.
8631 Thus, for example, the string constant @w{@code{" +3.14"}}
8632 is a string, even though it looks numeric,
8636 In short, when one operand is a ``pure'' string, such as a string
8637 constant, then a string comparison is performed. Otherwise, a
8684 True if the string @var{x} matches the regexp denoted by @var{y}.
8687 True if the string @var{x} does not match the regexp denoted by @var{y}.
8701 strings where one is a prefix of the other, the shorter string is less than
8706 leave off one of the @samp{=} characters. The result is still valid @command{awk}
8717 Unless @code{b} happens to be zero or the null string, the @code{if}
8731 string comparison (false)
8734 string comparison (true)
8737 string comparison (true)
8741 string comparison (true)
8745 string comparison (false)
8751 $ echo 1e2 3 | awk '@{ print ($1 < $2) ? "true" : "false" @}'
8755 @cindex comparison expressions, string vs. regexp
8756 @c @cindex string comparison vs. regexp comparison
8757 @c @cindex regexp comparison vs. string comparison
8790 expression. In the latter case, the value of the expression as a string is used as a
8794 @cindex @command{awk}, regexp constants and
8796 In modern implementations of @command{awk}, a constant regular
8930 The variable @code{interested}, as with all @command{awk} variables, starts
8946 @code{next} tells @command{awk} to skip the rest of the rules, get the
8999 @cindex differences in @command{awk} and @command{gawk}, line continuations
9022 available in every @command{awk} program. The @code{sqrt} function is one
9067 treated as local variables and initialized to the empty string
9080 $ awk '@{ print "The square root of", $1, "is", sqrt($1) @}'
9128 This table presents @command{awk}'s operators, in order of highest
9269 @cindex portability, operators, not in POSIX @command{awk}
9282 As you have already seen, each @command{awk} statement consists of
9285 actions, and @command{awk}'s built-in variables.
9288 within actions form the core of @command{awk} programming.
9296 * Using Shell Variables:: How to use shell variables with @command{awk}.
9315 Patterns in @command{awk} control the execution of rules---a rule is
9317 The following is a summary of the types of @command{awk} patterns:
9327 is nonzero (if a number) or non-null (if a string).
9339 @command{awk} program.
9368 Any @command{awk} expression is valid as an @command{awk} pattern.
9370 number) or non-null (if a string).
9375 @command{awk} program.
9383 The left operand of the @samp{~} and @samp{!~} operators is a string.
9385 slashes (@code{/@var{regexp}/}), or any expression whose string value
9398 $ awk '$1 == "foo" @{ print $2 @}' BBS-list
9407 $ awk '$1 ~ /foo/ @{ print $2 @}' BBS-list
9429 $ awk '/2400/ && /foo/' BBS-list
9438 $ awk '/2400/ || /foo/' BBS-list
9449 @file{BBS-list} that do @emph{not} contain the string @samp{foo}:
9452 $ awk '! /foo/' BBS-list
9465 expressions, comparisons, or any other @command{awk} expressions. Range
9486 awk '$1 == "on", $1 == "off"' myfile
9517 This causes @command{awk} to skip any further processing of the current
9544 echo Yes | awk '/1/,/2/ || /Yes/'
9548 However, @command{awk} interprets this as @samp{/1/, (/2/ || /Yes/)}.
9569 They supply startup and cleanup actions for @command{awk} programs.
9573 ``@code{BEGIN} and @code{END} blocks'' by long-time @command{awk}
9589 $ awk '
9600 that contain the string @samp{foo}. The @code{BEGIN} rule prints a title
9602 initialize the counter @code{n} to zero, since @command{awk} does this
9610 An @command{awk} program may have multiple @code{BEGIN} and/or @code{END}
9614 This feature was added in the 1987 version of @command{awk} and is included
9616 The original (1978) version of @command{awk}
9635 If an @command{awk} program has only a @code{BEGIN} rule and no
9637 run.@footnote{The original version of @command{awk} used to keep
9653 yield a null string or zero, depending upon the context. One way
9658 @cindex differences in @command{awk} and @command{gawk}, @code{BEGIN}/@code{END} patterns
9659 @cindex POSIX @command{awk}, @code{BEGIN}/@code{END} patterns
9671 @code{END} rules. Be aware, however, that Unix @command{awk}, and possibly
9676 @samp{print $0}. If @code{$0} is the null string, then this prints an
9677 empty line. Many long-time @command{awk} programmers use an unadorned
9706 awk '@{ print $1 @}' BBS-list
9716 @cindex @command{awk} programs, shell variables in
9717 @c @cindex shell and @command{awk} interaction
9719 @command{awk} programs are often used as components in larger
9722 hold a pattern that the @command{awk} program searches for.
9724 into the body of the @command{awk} program.
9734 awk "/$pattern/ "'@{ nmatches++ @}
9739 the @command{awk} program consists of two pieces of quoted text
9751 A better method is to use @command{awk}'s variable assignment feature
9753 to assign the shell variable's value to an @command{awk} variable's
9762 awk -v pat="$pattern" '$0 ~ pat @{ nmatches++ @}
9767 Now, the @command{awk} program is just one single-quoted string.
9770 The @command{awk} variable @code{pat} could be named @code{pattern}
9784 An @command{awk} program or script consists of a series of
9789 @command{awk} what to do once a match for the pattern is found. Thus,
9790 in outline, an @command{awk} program generally looks like this:
9806 An action consists of one or more @command{awk} @dfn{statements}, enclosed
9819 The following types of statements are supported in @command{awk}:
9831 Specify the control flow of @command{awk}
9832 programs. The @command{awk} language gives you C-like constructs
9845 Also supplied in @command{awk} are the @code{next}
9869 control the flow of execution in @command{awk} programs. Most of the
9870 control statements in @command{awk} are patterned on similar statements in C.
9892 * If Statement:: Conditionally execute some @command{awk}
9906 * Exit Statement:: Stop execution of @command{awk}.
9913 The @code{if}-@code{else} statement is @command{awk}'s decision-making
9926 the null string; otherwise, the condition is true.
9952 If the @samp{;} is left out, @command{awk} can't interpret the statement and
9966 @command{awk}. It repeatedly executes a statement as long as a condition is
9983 is not zero and not a null string.)
9989 never executed and @command{awk} continues with the statement following
9994 awk '@{ i = 1
10075 arbitrary @command{awk} expressions, and @var{body} stands for any
10076 @command{awk} statement.
10087 awk '@{ for (i = 1; i <= 3; i++)
10107 this context but it is not supported in @command{awk}.
10144 The @command{awk} language has a @code{for} statement in addition to a
10236 # find smallest divisor of num
10249 When the remainder is zero in the first @code{if} statement, @command{awk}
10251 that @command{awk} proceeds immediately to the statement following the loop
10253 statement, which stops the entire @command{awk} program.
10261 # find smallest divisor of num
10279 @c @cindex @command{awk} language, POSIX version
10280 @cindex POSIX @command{awk}, @code{break} statement and
10285 historical implementations of @command{awk} treated the @code{break}
10288 Recent versions of Unix @command{awk} no longer allow this usage.
10308 The @code{continue} statement in a @code{for} loop directs @command{awk} to
10348 @c @cindex @command{awk} language, POSIX version
10349 @cindex POSIX @command{awk}, @code{continue} statement and
10353 a loop. Historical versions of @command{awk} treated a @code{continue}
10358 Recent versions of Unix @command{awk} no longer work this way, and
10369 The @code{next} statement forces @command{awk} to immediately stop processing
10376 @command{awk} to read the next record immediately, but it does not alter the
10380 @cindex @command{awk} programs, execution of
10381 At the highest level, @command{awk} program execution is a loop that reads
10388 For example, suppose an @command{awk} program works only on records
10409 @c @cindex @command{awk} language, POSIX version
10413 @cindex POSIX @command{awk}, @code{next}/@code{nextfile} statements and
10420 some other @command{awk} implementations don't allow the @code{next}
10433 @cindex differences in @command{awk} and @command{gawk}, @code{next}/@code{nextfile} statements
10442 In most other @command{awk} implementations,
10466 @command{awk} does with the files listed in @code{ARGV}.
10468 If it's necessary to use an @command{awk} version that doesn't support
10476 The current version of the Bell Laboratories @command{awk}
10501 The @code{exit} statement causes @command{awk} to immediately stop
10534 status code for the @command{awk} process. If no argument is supplied,
10538 @command{awk} uses the previously supplied exit value.
10544 exiting with a nonzero status. An @command{awk} program can do this
10569 Most @command{awk} variables are available to use for your own
10572 However, a few variables in @command{awk} have special built-in meanings.
10573 @command{awk} examines some of these automatically, so that they enable you
10574 to tell @command{awk} how to do certain things. Others are set
10575 automatically by @command{awk}, so that they carry information from the
10576 internal workings of @command{awk} to your program.
10585 @command{awk}.
10586 * Auto-set:: Built-in variables where @command{awk} gives
10592 @subsection Built-in Variables That Control @command{awk}
10599 control how @command{awk} does certain things. The variables that are
10611 string values of @code{"r"} or @code{"w"} specify that input files and
10613 A string value of @code{"rw"} or @code{"wr"} indicates that all
10615 Any other string value is equivalent to @code{"rw"}, but @command{gawk}
10620 @cindex differences in @command{awk} and @command{gawk}, @code{BINMODE} variable
10622 In other @command{awk} implementations
10630 @cindex POSIX @command{awk}, @code{CONVFMT} variable and
10634 This string controls conversion of numbers to
10643 @cindex differences in @command{awk} and @command{gawk}, @code{FIELDWIDTHS} variable
10665 The value is a single-character string or a multi-character regular
10667 record. If the value is the null string (@code{""}), then each
10669 (This behavior is a @command{gawk} extension. POSIX @command{awk} does not
10670 specify the behavior when @code{FS} is the null string.)
10673 @cindex POSIX @command{awk}, @code{FS} variable and
10674 The default value is @w{@code{" "}}, a string consisting of a single
10677 POSIX @command{awk}, newline does not count as whitespace.} It also causes
10684 awk -F, '@var{program}' @var{input-files}
10694 @cindex differences in @command{awk} and @command{gawk}, @code{IGNORECASE} variable
10695 @cindex case sensitivity, string comparisons and
10699 If @code{IGNORECASE} is nonzero or non-null, then all string comparisons
10713 then @code{IGNORECASE} has no special meaning. Thus, string
10717 @cindex differences in @command{awk} and @command{gawk}, @code{LINT} variable
10731 in other @command{awk} implementations. Unlike the other special variables,
10737 of @command{awk} being executed.
10743 This string controls conversion of numbers to
10748 Its default value is @code{"%.6g"}. Earlier versions of @command{awk}
10760 default value is @w{@code{" "}}, a string consisting of a single space.
10772 This is @command{awk}'s input record separator. Its default value is a string
10775 It can also be the null string, in which case records are separated by
10783 In most other @command{awk} implementations,
10799 @cindex differences in @command{awk} and @command{gawk}, @code{TEXTDOMAIN} variable
10803 @command{awk} level. It sets the default text domain for specially
10804 marked string constants in the source text, as well as for the
10810 In other @command{awk} implementations,
10827 The following is an alphabetical list of variables that @command{awk}
10837 The command-line arguments available to @command{awk} programs are stored in
10840 Unlike most @command{awk} arrays,
10845 $ awk 'BEGIN @{
10849 @print{} awk
10855 @code{ARGV[0]} contains @code{"awk"}, @code{ARGV[1]}
10868 @code{ARGV}, nor are any of @command{awk}'s command-line options.
10870 about how @command{awk} uses these variables.
10873 @cindex differences in @command{awk} and @command{gawk}, @code{ARGIND} variable
10888 While you can change the value of @code{ARGIND} within your @command{awk}
10893 In other @command{awk} implementations,
10906 @command{awk} may spawn via redirection or the @code{system} function.
10915 @cindex differences in @command{awk} and @command{gawk}, @code{ERRNO} variable
10920 then @code{ERRNO} contains a string describing the error.
10923 In other @command{awk} implementations,
10931 The name of the file that @command{awk} is currently reading.
10932 When no @value{DF}s are listed on the command line, @command{awk} reads
10938 yet.@footnote{Some early implementations of Unix @command{awk} initialized
10969 @command{awk}'s internal workings. In particular, assignments
10975 The number of input records @command{awk} has processed since
10981 @cindex differences in @command{awk} and @command{gawk}, @code{PROCINFO} array
10984 running @command{awk} program.
11023 In other @command{awk} implementations,
11034 is the length of the matched string, or @minus{}1 if no match is found.
11042 is the position of the string where the matched substring starts, or zero
11046 @cindex differences in @command{awk} and @command{gawk}, @code{RT} variable
11052 In other @command{awk} implementations,
11066 @command{awk} increments @code{NR} and @code{FNR}
11078 > 4' | awk 'NR == 2 @{ NR = 17 @}
11087 Before @code{FNR} was added to the @command{awk} language
11089 many @command{awk} programs used this feature to track the number of
11104 $ awk 'BEGIN @{
11108 @print{} awk
11114 In this example, @code{ARGV[0]} contains @samp{awk}, @code{ARGV[1]}
11117 Notice that the @command{awk} program is not entered in @code{ARGV}. The
11125 $ cat showargs.awk
11132 $ awk -v A=1 -f showargs.awk B=2 /dev/null
11134 @print{} ARGV[0] = awk
11141 Each time @command{awk} reaches the end of an input file, it uses the next
11143 different string there, a program can change which files are read.
11153 To eliminate a file from the middle of the list, store the null string
11155 special feature, @command{awk} ignores @value{FN}s that have been
11156 replaced with the null string.
11188 To actually get the options into the @command{awk} program,
11189 end the @command{awk} options with @option{--} and then supply
11190 the @command{awk} program's options, in the following manner:
11193 awk -f myprog -- -v -d file1 file2 @dots{}
11196 @cindex differences in @command{awk} and @command{gawk}, @code{ARGC}/@code{ARGV} variables
11199 into @code{ARGV} for the @command{awk} program to deal with. As soon
11211 are passed on to the @command{awk} program.
11214 @chapter Arrays in @command{awk}
11222 This @value{CHAPTER} describes how arrays work in @command{awk},
11225 It also describes how @command{awk} simulates multidimensional
11235 @command{awk} maintains a single set
11239 same @command{awk} program.
11252 @command{awk}.
11255 @command{awk}.
11263 The @command{awk} language provides one-dimensional arrays
11265 Every @command{awk} array must have a name. Array names have the same
11268 as a variable) in the same @command{awk} program.
11270 Arrays in @command{awk} superficially resemble arrays in other programming
11271 languages, but there are fundamental differences. In @command{awk}, it
11273 Additionally, any number or string in @command{awk}, not just consecutive integers,
11342 Arrays in @command{awk} are different---they are @dfn{associative}. This means
11375 have to be positive integers. Any number, or even a string, can be
11393 Here, the number @code{1} isn't double-quoted, since @command{awk}
11394 automatically converts it to a string.
11400 The identical string value used to store an array element must be used
11402 When @command{awk} creates an array (e.g., with the @code{split}
11407 @command{awk}'s arrays are efficient---the time to access an element
11433 @code{""}, the null string. This includes elements
11436 automatically creates that array element, with the null string as its value.
11438 @command{awk}.)
11480 @command{awk} variables:
11505 @c file eg/misc/arraymax.awk
11570 in @command{awk}, because any number or string can be an array index.
11571 So @command{awk} has a special kind of @code{for} statement for scanning
11590 find all the distinct words that appear in the input. It prints each
11622 @command{awk} and cannot be controlled or changed. This can lead to
11666 same as assigning it a null value (the empty string, @code{""}).
11684 @cindex differences in @command{awk} and @command{gawk}, array elements, deleting
11714 apart the null string. Because there is no data to split out, the
11734 it is converted to a string value before being used for subscripting
11752 @code{data[xyz]} subscripts @code{data} with the string value @code{"12.153"}
11757 string value from @code{xyz}---this time @code{"12.15"}---because the value of
11759 since @code{"12.15"} is a different string from @code{"12.153"}.
11786 As with many things in @command{awk}, the majority of the time
11807 > line 3' | awk '@{ l[lines] = $0; ++lines @}
11821 So, @command{awk} should have printed the value of @code{l[0]}.
11823 The issue here is that subscripts for @command{awk} arrays are @emph{always}
11844 Even though it is somewhat unusual, the null string
11847 @command{gawk} warns about the use of the null string as a subscript
11859 languages, including @command{awk}) to refer to an element of a
11864 Multidimensional arrays are supported in @command{awk} through
11865 concatenation of indices into one string.
11866 @command{awk} converts the indices into strings
11869 a single string that describes the values of the separate indices. The
11870 combined string is used as a single index into an ordinary,
11880 Once the element's value is stored, @command{awk} has no record of whether
11885 The default value of @code{SUBSEP} is the string @code{"\034"},
11887 @command{awk} program or in most input data.
11889 that index values that contain a string matching @code{SUBSEP} can lead to
12007 In most @command{awk} implementations, sorting an array requires
12101 string comparisons, the value of @code{IGNORECASE} also
12113 This @value{CHAPTER} describes @command{awk}'s built-in functions,
12114 which fall into three categories: numeric, string, and I/O.
12119 Besides the built-in functions, @command{awk} has provisions for
12134 your @command{awk} program to call. This @value{SECTION} defines all
12136 functions in @command{awk}; some of these are mentioned in other sections
12143 * String Functions:: Functions for string manipulation, such as
12148 * I18N Functions:: Functions for string translation.
12154 To call one of @command{awk}'s built-in functions, write the name of
12171 @cindex differences in @command{awk} and @command{gawk}, function arguments (@command{gawk})
12175 individual functions. In some @command{awk} implementations, extra
12262 However, nothing requires that an @command{awk} implementation use the C
12263 @code{rand} to implement the @command{awk} version of @code{rand}.
12300 @strong{Caution:} In most @command{awk} implementations, including @command{gawk},
12302 starting number, or @dfn{seed}, each time you run @command{awk}. Thus,
12304 The numbers are random within one @command{awk} run but predictable
12323 Different @command{awk} implementations use different random-number
12324 generators internally. Don't expect the same @command{awk} program
12326 different versions of @command{awk}.
12400 the comparison performed is always a string comparison. (Here too,
12409 @item index(@var{in}, @var{find})
12412 This searches the string @var{in} for the first occurrence of the string
12413 @var{find}, and returns the position in characters where that occurrence
12414 begins in the string @var{in}. Consider the following example:
12417 $ awk 'BEGIN @{ print index("peanut", "an") @}'
12422 If @var{find} is not found, @code{index} returns zero.
12423 (Remember that string indices in @command{awk} start at one.)
12425 @item length(@r{[}@var{string}@r{]})
12427 This returns the number of characters in @var{string}. If
12428 @var{string} is a number, the length of the digit string representing
12431 525, and 525 is then converted to the string @code{"525"}, which has
12438 @cindex POSIX @command{awk}, functions and, @code{length}
12440 In older versions of @command{awk}, the @code{length} function could
12448 @item match(@var{string}, @var{regexp} @r{[}, @var{array}@r{]})
12450 The @code{match} function searches @var{string} for the
12454 @var{string}). If no match is found, it returns zero.
12457 (@samp{/@dots{}/}) or a string constant (@var{"@dots{}"}).
12458 In the latter case, the string is treated as a regexp to be matched.
12463 The order of the first two arguments is backwards from most other string
12467 @samp{@var{string} ~ @var{regexp}}.
12480 @c file eg/misc/findpat.awk
12513 @command{awk} prints:
12520 @cindex differences in @command{awk} and @command{gawk}, @code{match} function
12522 of @var{array} is set to the entire portion of @var{string}
12525 portion of @var{string} matching the corresponding parenthesized
12564 @item split(@var{string}, @var{array} @r{[}, @var{fieldsep}@r{]})
12566 This function divides @var{string} into pieces separated by @var{fieldsep}
12569 forth. The string value of the third argument, @var{fieldsep}, is
12570 a regexp describing where to split @var{string} (much as @code{FS} can
12584 splits the string @samp{cul-de-sac} into three fields using @samp{-} as the
12596 @cindex differences in @command{awk} and @command{gawk}, @code{split} function
12600 Also as with input field-splitting, if @var{fieldsep} is the null string, each
12601 individual character in the string is split into its own array element.
12609 Modern implementations of @command{awk}, including @command{gawk}, allow
12611 string.
12615 discussion of the difference between using a string constant or a regexp constant,
12618 Before splitting the string, @code{split} deletes any previously existing
12621 If @var{string} is null, the array has no elements. (So this is a portable
12625 If @var{string} does not match @var{fieldsep} at all (but is not null),
12627 @var{string}.
12631 This returns (without printing) the string that @code{printf} would
12641 assigns the string @w{@code{"pi = 3.14 (approx.)"}} to the variable @code{pival}.
12643 @cindex differences in @command{awk} and @command{gawk}, @code{strtonum} function (@command{gawk})
12659 to a string value; the automatic coercion of strings to numbers
12664 @cindex differences in @command{awk} and @command{gawk}, @code{strtonum} function (@command{gawk})
12671 It searches this value, which is treated as a string, for the
12673 Then the entire string is
12675 The modified string becomes the new value of @var{target}.
12678 (@samp{/@dots{}/}) or a string constant (@var{"@dots{}"}).
12679 In the latter case, the string is treated as a regexp to be matched.
12709 the regexp can match more than one string, then this precise substring
12722 $ awk 'BEGIN @{
12731 This shows how @samp{&} can represent a nonconstant string and also
12736 backslash before it in the string. As usual, to insert one backslash in
12737 the string, you must write two backslashes. Therefore, write @samp{\\&}
12738 in a string constant to include a literal @samp{&} in the replacement.
12750 Some versions of @command{awk} allow the third argument to
12754 to put it. Such versions of @command{awk} accept expressions
12769 string, and then the value of that string is treated as the regexp to match.
12775 substrings it can find. The @samp{g} in @code{gsub} stands for
12783 replaces all occurrences of the string @samp{Britain} with @samp{United
12795 @code{gsub}, it searches the target string @var{target} for matches of
12797 the modified string is returned as the result of the function and the
12798 original target string is @emph{not} changed. If @var{how} is a string
12823 to get one into the string.
12836 In this case, @code{$0} is used as the default target string.
12837 @code{gensub} returns the new string as its result, which is
12842 If the @var{how} argument is a string that does not begin with @samp{g} or
12853 @item substr(@var{string}, @var{start} @r{[}, @var{length}@r{]})
12855 This returns a @var{length}-character-long substring of @var{string},
12857 string is character number one.@footnote{This is different from
12862 @var{string} that begins at character number @var{start}. For example,
12866 in the string, counting from character @var{start}.
12870 Unix @command{awk} acts this way, and therefore @command{gawk}
12873 in the string, @code{substr} returns the null string.
12875 the null string is returned.
12878 The string returned by @code{substr} @emph{cannot} be
12880 a string, as shown in the following example:
12883 string = "abcdef"
12885 substr(string, 3, 3) = "CDE"
12897 (Some commercial versions of @command{awk} do in fact let you use
12900 If you need to replace bits and pieces of a string, combine @code{substr}
12901 with string concatenation, in the following manner:
12904 string = "abcdef"
12906 string = substr(string, 1, 2) "CDE" substr(string, 6)
12911 @item tolower(@var{string})
12913 This returns a copy of @var{string}, with each uppercase character
12914 in the string replaced with its corresponding lowercase character.
12918 @item toupper(@var{string})
12920 This returns a copy of @var{string}, with each lowercase character
12921 in the string replaced with its corresponding uppercase character.
12941 First, there is the @dfn{lexical} level, which is when @command{awk} reads
12944 Then there is the runtime level, which is when @command{awk} actually scans the
12945 replacement string to determine what to generate.
12947 At both levels, @command{awk} looks for a defined set of characters that
12950 Thus, for every @samp{\} that @command{awk} processes at the runtime
12953 @samp{\}, Unix @command{awk} and @command{gawk} both simply remove the initial
12954 @samp{\} and put the next character into the string. Thus, for
12962 the @var{replacement} string that did not precede an @samp{&} was passed
13009 @c @cindex @command{awk} language, POSIX version
13010 @cindex POSIX @command{awk}, functions and, @code{gsub}/@code{sub}
13053 Backslashes must now be doubled in the @var{replacement} string, breaking
13054 historical @command{awk} programs.
13057 To make sure that an @command{awk} program is portable, @emph{every} character
13058 in the @var{replacement} string must be preceded with a
13167 In @command{awk}, the @samp{*} operator can match the null string.
13172 $ echo abc | awk '@{ gsub(/m*/, "X"); print @}'
13198 should be one of the two string values @code{"to"} or @code{"from"},
13199 indicating which end of the pipe to close. Case in the string does
13225 version of @command{awk} in 1994; it is not part of the POSIX standard and is
13232 standard output is flushed. The second is to allow the null string
13254 commands and then returns to the @command{awk} program. The @code{system}
13255 function executes the command given by the string @var{command}.
13259 For example, if the following fragment of code is put in your @command{awk}
13264 system("date | mail -s 'awk run done' root")
13269 the system administrator is sent mail when the @command{awk} program
13284 However, if your @command{awk}
13310 $ awk '@{ print $1 + $2 @}'
13323 $ awk '@{ print $1 + $2 @}' | cat
13344 @command{awk} implementations. An alternative method to flush output
13345 buffers is to call @code{system} with a null string as its argument:
13356 with other @command{awk} implementations, it does not necessarily avoid
13390 If @command{awk} did not flush its buffers before calling @code{system},
13405 @cindex POSIX @command{awk}, timestamps and
13406 @code{awk} programs are commonly used to process log files
13426 version of @command{awk}.@footnote{The GNU @command{date} utility can
13445 same name in ISO C. The argument, @var{datespec}, is a string of the form
13447 The string consists of six or seven numbers representing, respectively,
13471 This function returns a string. It is similar to the function of the
13473 produce a string, based on the contents of the @var{format} string.
13478 @code{@w{"%a %b %d %H:%M:%S %Z %Y"}}. This format string produces
13501 returned string, while substituting date and time values for format
13502 specifications in the @var{format} string.
13662 returned string or appears literally.}
13711 @cindex POSIX @command{awk}, @code{date} utility and
13712 This example is an @command{awk} implementation of the POSIX
13718 the string. For example:
13841 For example, if you have a bit string @samp{10111001} and you shift it
13918 @cindex @code{testbits.awk} program
13921 @c file eg/lib/bits2str.awk
13941 @c this is a hack to make testbits.awk self-contained
13943 @c file eg/prog/testbits.awk
13962 @c file eg/prog/testbits.awk
13981 $ gawk -f testbits.awk
13993 The @code{bits2str} function turns a binary number into a string.
13999 of the string.
14021 @cindex @command{gawk}, string-translation functions
14022 @cindex functions, string-translation
14024 @cindex @command{awk} programs, internationalizing
14026 @command{gawk} provides facilities for internationalizing @command{awk} programs.
14035 @item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
14036 This function returns the translation of @var{string} in
14060 If @var{directory} is the null string (@code{""}), then
14074 Complicated @command{awk} programs can often be simplified by defining
14077 them, i.e., to tell @command{awk} what they should do.
14094 @command{awk} program. Thus, the general form of an @command{awk} program is
14098 before all uses of the function. This is because @command{awk} reads the
14118 Within a single @command{awk} program, any particular name can only be
14125 the call. The local variables are initialized to the empty string.
14129 The @var{body-of-function} consists of @command{awk} statements. It is the
14145 null string.
14159 used in the @command{awk} program can be referenced or set normally in the
14173 @c @cindex @command{awk} language, POSIX version
14174 @c @cindex POSIX @command{awk}
14175 @cindex POSIX @command{awk}, @code{function} keyword in
14176 In many @command{awk} implementations, including @command{gawk},
14191 If the resulting string is non-null, the action is executed.
14192 This is probably not what is desired. (@command{awk} accepts this input as
14194 in @command{awk} programs.)
14199 To ensure that your @command{awk} programs are portable, always use the
14216 To illustrate, here is an @command{awk} rule that uses our @code{myprint}
14260 The following is an example of a recursive function. It takes a string
14261 as an input parameter and returns the string in backwards order.
14264 is zero, i.e., when there are no more characters left in the string.
14277 If this function is in a file named @file{rev.awk}, it can be tested
14282 > gawk --source '@{ print rev($0, length($0)) @}' -f rev.awk
14286 The C @code{ctime} function takes a timestamp and returns it in a string,
14290 to create an @command{awk} version of @code{ctime}:
14295 @c file eg/lib/ctime.awk
14296 # ctime.awk
14298 # awk version of C ctime(3) function
14321 in parentheses. @command{awk} expressions are what you write in the
14325 being a string concatenation):
14333 If you write whitespace by mistake, @command{awk} might think that you mean
14353 string value @code{"bar"}.
14408 Some @command{awk} implementations allow you to call a function that
14434 Some @command{awk} implementations generate a runtime
14447 This statement returns control to the calling part of the @command{awk} program. It
14448 can also be used to return a value for use in the rest of the @command{awk}
14460 body, then the function returns an unpredictable value. @command{awk}
14535 @command{awk} is a very fluid language.
14536 It is possible that @command{awk} can't tell if an identifier
14580 features available at the @command{awk} program level.
14581 Having internationalization available at the @command{awk} level
14630 in C or C++, as well as scripts written in @command{sh} or @command{awk}.
14641 and marks each string that is a candidate for translation.
14655 collected into a portable object file (@file{guide.po}),
14661 @cindex @code{.po} files
14662 @cindex files, @code{.po}
14666 For each language with a translator, @file{guide.po}
14674 Each language's @file{.po} file is converted into a binary
14695 At runtime, @command{guide} looks up each string via a call
14696 to @code{gettext}. The returned string is the translated string
14697 if available, or the original string if not.
14707 In C (or C++), the string marking and dynamic translation lookup
14708 are accomplished by wrapping each string in a call to @code{gettext}:
14735 This reduces the typing overhead to just three extra characters per string
14798 @section Internationalizing @command{awk} Programs
14800 @cindex @command{awk} programs, internationalizing
14820 @item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
14821 This built-in function returns the translation of @var{string} in
14826 If you supply a value for @var{category}, it must be a string equal to
14837 @strong{Caution:} The order of arguments to the @command{awk} version
14839 the C version. The @command{awk} version's order was
14840 chosen to be simple and to allow for reasonable @command{awk}-style
14868 If @var{directory} is the null string (@code{""}), then
14873 To use these facilities in your @command{awk} program, follow the steps
14900 @cindex @code{_} (underscore), translatable string
14901 @cindex underscore (@code{_}), translatable string
14905 quote of the string. For example:
14925 text domain (@code{"adminprog"}) in which to find the
14938 # where to find our files
14951 and use translations from @command{awk}.
14954 @section Translating @command{awk} Programs
14956 @cindex @code{.po} files
14957 @cindex files, @code{.po}
14961 be extracted to create the initial @file{.po} file.
14965 @command{gawk}'s @option{--gen-po} command-line option extracts
14974 * I18N Portability:: @command{awk}-level portability issues.
14982 @cindex @code{--gen-po} option
14983 @cindex command-line options, string extraction
14984 @cindex string extraction (internationalization)
14985 @cindex marked string extraction (internationalization)
14988 @cindex @code{--gen-po} option
14989 Once your @command{awk} program is working, and all the strings have
14992 First, use the @option{--gen-po} command-line option to create
14993 the initial @file{.po} file:
14996 $ gawk --gen-po -f guide.awk > guide.po
15000 When run with @option{--gen-po}, @command{gawk} does not execute your
15007 @code{gettext} can handle @file{.awk} files.}
15027 string, length(string)))
15038 Even though @code{gettext} can return the translated string
15052 format string itself is @emph{not} included. Thus, in the following
15053 example, @samp{string} is the first argument and @samp{length(string)} is the second:
15057 > string = "Dont Panic"
15059 > string, length(string)
15089 and those with positional specifiers in the same string:
15101 Although positional specifiers can be used directly in @command{awk} programs,
15107 @subsection @command{awk} Portability Issues
15112 have as little impact as possible on the portability of @command{awk}
15113 programs that use them to other versions of @command{awk}.
15126 As written, it won't work on other versions of @command{awk}.
15134 since @code{TEXTDOMAIN} is not special in other @command{awk} implementations.
15137 Non-GNU versions of @command{awk} treat marked strings
15138 as the concatenation of a variable named @code{_} with the string
15140 @command{awk}'' contest.} Typically, the variable @code{_} has
15141 the null string (@code{""}) as its value, leaving the original string constant as
15146 and @code{bindtextdomain}, the @command{awk} program can be made to run, but
15154 @c file eg/lib/libintl.awk
15160 function dcgettext(string, domain, category)
15162 return string
15178 @command{awk} pass @code{printf} formats and arguments unchanged to the
15183 @emph{translated} format strings, and since non-GNU @command{awk}s never
15184 retrieve the translated string, this should not be a problem in practice.
15192 localize a simple @command{awk} program, using @file{guide.awk} as our
15196 @c file eg/prog/guide.awk
15208 Run @samp{gawk --gen-po} to create the @file{.po} file:
15211 $ gawk --gen-po -f guide.awk > guide.po
15218 @c file eg/data/guide.po
15219 #: guide.awk:4
15223 #: guide.awk:5
15232 is the original string and the @code{msgstr} is the translation.
15235 appear in the @file{guide.po} file.
15244 $ cp guide.po guide-mellow.po
15245 @var{Add translations to} guide-mellow.po @dots{}
15253 @c file eg/data/guide-mellow.po
15254 #: guide.awk:4
15258 #: guide.awk:5
15277 @cindex @code{.po} files, converting to @code{.mo}
15278 @cindex files, @code{.po}, converting to @code{.mo}
15279 @cindex @code{.mo} files, converting from @code{.po}
15280 @cindex files, @code{.mo}, converting from @code{.po}
15287 @file{.po} file to machine-readable @file{.mo} file.
15290 @command{gawk} can find it:
15293 $ msgfmt guide-mellow.po
15300 $ gawk -f guide.awk
15309 are in a file named @file{libintl.awk},
15310 then we can run @file{guide.awk} unchanged as follows:
15313 $ gawk --posix -f guide.awk -f libintl.awk
15375 nondecimal numbers in input data, not just in @command{awk}
15379 can @dfn{profile} an @command{awk} program, making it possible to tune
15392 * Profiling:: Profiling your @command{awk} programs.
15459 Newsgroups: comp.lang.awk
15469 @c Xref: cssun.mathcs.emory.edu comp.lang.awk:5403
15477 The scent of awk programmers is a lot more attractive to women than
15745 @section Profiling Your @command{awk} Programs
15747 @cindex @command{awk} programs, profiling
15749 @cindex profiling @command{awk} programs
15755 traces of your @command{awk} programs.
15774 $ pgawk --profile=myprog.prof -f myprog.awk data1 data2
15785 session showing a simple @command{awk} program, its input data, and the
15786 results from running @command{pgawk}. First, the @command{awk} program:
15827 on this program and data (this example also illustrates that @command{awk}
15938 All string concatenations are parenthesized too.
15979 @cindex profiling @command{awk} programs, dynamically
15983 This is useful if your @command{awk} program goes into an
16050 @chapter Running @command{awk} and @command{gawk}
16052 This @value{CHAPTER} covers how to run awk, both POSIX-standard
16054 @command{awk} and
16058 This @value{CHAPTER} rounds out the discussion of @command{awk}
16066 * Command Line:: How to run @command{awk}.
16069 * AWKPATH Variable:: Searching directories for @command{awk}
16077 @section Invoking @command{awk}
16078 @cindex command line, invoking @command{awk} from
16079 @cindex @command{awk}, invoking
16080 @cindex arguments, command-line, invoking @command{awk}
16081 @cindex options, command-line, invoking @command{awk}
16083 There are two ways to run @command{awk}---with an explicit program or with
16088 awk @r{[@var{options}]} -f progfile @r{[@code{--}]} @var{file} @dots{}
16089 awk @r{[@var{options}]} @r{[@code{--}]} '@var{program}' @var{file} @dots{}
16098 @cindex dark corner, invoking @command{awk}
16100 It is possible to invoke @command{awk} with an empty program:
16103 awk '' datafile1 datafile2
16108 Doing so makes little sense, though; @command{awk} exits
16136 @cindex POSIX @command{awk}, GNU long options and
16156 @cindex @command{awk} programs, location of
16157 Indicates that the @command{awk} program is to be found in @var{source-file}
16172 @samp{awk @w{-v foo=1} @w{-v bar=2} @dots{}}.
16179 variables may lead to surprising results. @command{awk} will reset the
16190 Bell Laboratories research version of Unix @command{awk}. They are provided
16193 (The Bell Laboratories @command{awk} no longer needs these options;
16223 as well as options available in the Bell Laboratories version of @command{awk}.
16235 the @command{awk} language are disabled, so that @command{gawk} behaves just
16236 like the Bell Laboratories research version of Unix @command{awk}.
16274 @item -W gen-po
16275 @itemx --gen-po
16276 @cindex @code{--gen-po} option
16281 output for all string constants that have been marked for translation.
16303 other @command{awk} implementations.
16309 development of cleaner @command{awk} programs.
16317 @command{awk} from Version 7 Unix
16404 @cindex @command{awk} programs, profiling, enabling
16405 Enable profiling of @command{awk} programs
16423 Because interval expressions were traditionally not available in @command{awk},
16424 @command{gawk} does not provide them by default. This prevents old @command{awk}
16464 If it is, @command{awk} reads its program source from all of the named files, as
16466 useful for creating libraries of @command{awk} functions. These functions
16480 Because it is clumsy using the standard @command{awk} mechanisms to mix source
16481 file and command-line @command{awk} programs, @command{gawk} provides the
16543 All these arguments are made available to your @command{awk} program in the
16553 arguments is made when @command{awk} is about to open the next input file.
16555 it is really a variable assignment; if so, @command{awk} sets the variable
16563 because such rules are run before @command{awk} begins scanning the argument list.
16570 In some earlier implementations of @command{awk}, when a variable assignment
16572 the @code{BEGIN} rule was executed. @command{awk}'s behavior was thus
16576 upon this ``feature.'' When @command{awk} was changed to be more consistent,
16588 awk 'pass == 1 @{ @var{pass 1 stuff} @}
16601 @cindex differences in @command{awk} and @command{gawk}, @code{AWKPATH} environment variable
16603 The previous @value{SECTION} described how @command{awk} program files can be named
16606 In most @command{awk}
16614 The search path is a string consisting of directory names
16618 @samp{.:/usr/local/share/awk}.@footnote{Your version of @command{gawk}
16628 of useful @command{awk} functions. The library files can be placed in a
16634 @command{awk} programs can use facilities in @command{awk} library files
16654 from within an @command{awk} program.
16656 While you can change @code{ENVIRON["AWKPATH"]} within your @command{awk}
16658 sense: the @env{AWKPATH} environment variable is used to find the program
16726 Print the message @code{"awk: bailing out near line 1"} and dump core.
16728 Unix @command{awk} and by a T-shirt.
16733 Early versions of @command{awk} did not require any separator (either
16734 a newline or @samp{;}) between the rules in @command{awk} programs. Thus,
16738 awk '@{ sum += $1 @} END @{ print sum @}'
16746 awk '@{ sum += $1 @} ; END @{ print sum @}'
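As a small illustration of the explicit separator between rules, the following shell sketch wraps the same one-liner in a function. The function name `sum_first_column` and the sample input are assumptions made for this example; they do not come from the manual.

```shell
# Hypothetical helper: sum the first field of every input line.
sum_first_column() {
    # The ';' explicitly separates the main rule from the END rule.
    awk '{ sum += $1 } ; END { print sum }' "$@"
}

printf '1\n2\n3\n' | sum_first_column    # prints 6
```

Writing the separator explicitly (or putting each rule on its own line) should keep the program acceptable to implementations that do not tolerate adjacent rules without one.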
16753 awk '@{ sum += $1 @}
16762 This seems to have been a long-undocumented feature in Unix @command{awk}.
16766 long-undocumented ``feature'' of Unix @command{awk}.
16772 The comparison used for sorting is simple string comparison;
16810 @majorheading II@ @ @ Using @command{awk} and @command{gawk}
16811 Part II shows how to use @command{awk} and @command{gawk} for problem solving.
16831 @chapter A Library of @command{awk} Functions
16833 @cindex libraries of @command{awk} functions
16840 your own @command{awk} functions. Writing functions is important, because
16852 This @value{CHAPTER} presents a library of useful @command{awk} functions.
16864 If you have written one or more useful, general-purpose @command{awk} functions
16865 and would like to contribute them to the author's collection of @command{awk}
16873 Rewriting these programs for different implementations of @command{awk} is pretty straightforward.
16902 Also, verify that all regexp and string constants used in
16923 @cindex @command{awk} programs, documenting
16924 @cindex documentation, of @command{awk} programs
16925 Due to the way the @command{awk} language evolved, variables are either
16964 show how my own @command{awk} programming style has evolved and to
16974 not one of @command{awk}'s built-in variables, such as @code{FS}.
16994 @cindex libraries of @command{awk} functions, associative arrays and
17020 * Assert Function:: A function for assertions in @command{awk}
17027 * Join Function:: A function to join an array into a string.
17036 @cindex libraries of @command{awk} functions, @code{nextfile} statement
17045 implementations of @command{awk}. This @value{SECTION} shows two versions of a
17053 # this should be read in before the "main" awk program
17080 @c If the function can't be used on other versions of awk, this whole
17082 @footnote{@command{gawk} is the only known @command{awk} implementation
17099 @c file eg/lib/nextfile.awk
17104 @c file eg/lib/nextfile.awk
17111 @c file eg/lib/nextfile.awk
17112 # this should be read in before the "main" awk program
17130 then @command{awk} closes the current @value{DF} and moves on to the next
17137 is reset to the empty string, so that further executions of this rule
17150 at the beginning of a large @value{DF}, @command{awk} still has to scan the entire
17156 @command{awk}, because @command{awk} programs are generally I/O-bound (i.e.,
17171 @cindex libraries of @command{awk} functions, assertions
17174 @cindex @command{awk} programs, lengthy, assertions
17203 The C language makes it possible to turn the condition into a string for use
17204 in printing the diagnostic message. This is not possible in @command{awk}, so
17205 this @code{assert} function also requires a string version of the condition
17210 @c file eg/lib/assert.awk
17214 @c file eg/lib/assert.awk
17222 @c file eg/lib/assert.awk
17223 function assert(condition, string)
17227 FILENAME, FNR, string) > "/dev/stderr"
17243 is false, it prints a message to standard error, using the @code{string}
17255 For all of this to work correctly, @file{assert.awk} must be the
17256 first source file read by @command{awk}.
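The assertion idea described in these lines can be sketched as follows. This is a minimal illustration, not the text of @file{assert.awk}: the wrapper name `check_small`, the tested condition, and the `_assert_exit` bookkeeping in the `END` rule are assumptions chosen for the example.

```shell
# Hypothetical wrapper: read numbers on stdin and assert each one is <= 5.
check_small() {
    awk '
    # Print a diagnostic on standard error and stop if condition is false.
    # The condition must also be passed as a string for the message.
    function assert(condition, string)
    {
        if (! condition) {
            printf("%s:%d: assertion failed: %s\n",
                   FILENAME, FNR, string) > "/dev/stderr"
            _assert_exit = 1
            exit 1
        }
    }
    { assert($1 <= 5, "$1 <= 5") }
    # exit in a rule still runs END, so the failure status is repeated here.
    END { if (_assert_exit) exit 1 }
    '
}

printf '3\n9\n' | check_small || echo "assertion tripped"
```

The condition is passed twice, once as an expression and once as a string, because the function has no way to recover the source text of the expression it was given.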
17279 not read. However, now that the program has an @code{END} rule, @command{awk}
17299 @cindex libraries of @command{awk} functions, rounding numbers
17313 traditional rounding; it might be useful if your @command{awk}'s @code{printf}
17318 @c file eg/lib/round.awk
17319 # round.awk --- do normal rounding
17322 @c file eg/lib/round.awk
17329 @c file eg/lib/round.awk
17371 It is easily programmed, in less than 10 lines of @command{awk} code:
17375 @c file eg/lib/cliff_rand.awk
17376 # cliff_rand.awk --- generate Cliff random numbers
17379 @c file eg/lib/cliff_rand.awk
17386 @c file eg/lib/cliff_rand.awk
17408 @cindex libraries of @command{awk} functions, character values as numbers
17412 One commercial implementation of @command{awk} supplies a built-in function,
17414 character in the machine's character set. If the string passed to
17419 Both functions are written very nicely in @command{awk}; there is no real
17420 reason to build them into the @command{awk} interpreter:
17425 @c file eg/lib/ord.awk
17426 # ord.awk --- do ord and chr
17433 @c file eg/lib/ord.awk
17441 @c file eg/lib/ord.awk
17491 @c file eg/lib/ord.awk
17528 @cindex libraries of @command{awk} functions, merging arrays into strings
17532 When doing string processing, it is often useful to be able to join
17533 all the strings in an array into one long string. The following function,
17547 @c file eg/lib/join.awk
17548 # join.awk --- join an array into a string
17551 @c file eg/lib/join.awk
17558 @c file eg/lib/join.awk
17581 be nice if @command{awk} had an assignment operator for concatenation.
17582 The lack of an explicit operator for concatenation makes string operations
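To make the point about concatenation concrete, here is a hedged sketch of a join-style function. It is a simplified illustration rather than the contents of @file{join.awk}; the wrapper name `join_demo` and the sample data are assumptions.

```shell
# Hypothetical demo: split a string into an array, then join it back.
join_demo() {
    awk '
    # Join array[start..end] into one string, with sep between elements.
    # result and i are local variables (extra parameters).
    function join(array, start, end, sep,    result, i)
    {
        result = array[start]
        for (i = start + 1; i <= end; i++)
            result = result sep array[i]   # concatenation has no operator
        return result
    }
    BEGIN {
        n = split("alpha beta gamma", parts, " ")
        print join(parts, 1, n, "-")
    }'
}

join_demo    # prints alpha-beta-gamma
```

The line `result = result sep array[i]` shows the cost of having no assignment operator for concatenation: the target variable has to be named on both sides.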
17588 @cindex libraries of @command{awk} functions, managing, time
17600 with preformatted time information. It returns a string with the current
17605 @c file eg/lib/gettime.awk
17606 # gettimeofday.awk --- get the time of day in a usable format
17609 @c file eg/lib/gettime.awk
17615 @c file eg/lib/gettime.awk
17617 # Returns a string in the format of output of date(1)
17677 The string indices are easier to use and read than the various formats
17691 @cindex libraries of @command{awk} functions, managing, @value{DF}s
17711 the beginning and end of your @command{awk} program, respectively
17721 the job can be done cleanly in @command{awk} itself, as illustrated
17726 @emph{portably}; this works with any implementation of @command{awk}:
17729 # transfile.awk
17754 This rule relies on @command{awk}'s @code{FILENAME} variable that
17761 string. The program then assigns the current @value{FN} to
17763 Because, like all @command{awk} variables, @code{_oldfilename} is
17764 initialized to the null string, this rule executes correctly even for the
17782 @c file eg/lib/ftrans.awk
17783 # ftrans.awk --- handle data file transitions
17788 @c file eg/lib/ftrans.awk
17795 @c file eg/lib/ftrans.awk
17828 @c file eg/lib/rewind.awk
17829 # rewind.awk --- rewind the current file and start over
17832 @c file eg/lib/rewind.awk
17839 @c file eg/lib/rewind.awk
17884 Normally, if you give @command{awk} a @value{DF} that isn't readable,
17887 do this by prepending the following program to your @command{awk}
17890 @cindex @code{readable.awk} program
17892 @c file eg/lib/readable.awk
17893 # readable.awk --- library file to skip over unreadable files
17896 @c file eg/lib/readable.awk
17903 @c file eg/lib/readable.awk
17929 All known @command{awk} implementations silently skip over zero-length files.
17930 This is a by-product of @command{awk}'s implicit
17931 read-a-record-and-match-against-the-rules loop: when @command{awk}
17935 @command{awk} program code.
17944 @cindex @code{zerofile.awk} program
17946 @c file eg/lib/zerofile.awk
17947 # zerofile.awk --- library file to process empty input files
17950 @c file eg/lib/zerofile.awk
17957 @c file eg/lib/zerofile.awk
17975 The user-level variable @code{Argind} allows the @command{awk} program
17996 # zerofile2.awk --- same thing, portably
18033 Occasionally, you might not want @command{awk} to process command-line
18037 @command{awk} treats the @value{FN} as an assignment, and does not process it.
18043 @cindex @code{noassign.awk} program
18045 @c file eg/lib/noassign.awk
18046 # noassign.awk --- library file to avoid the need for a
18050 @c file eg/lib/noassign.awk
18057 @c file eg/lib/noassign.awk
18075 awk -v No_command_assign=1 -f noassign.awk -f yourprog.awk *
18095 @cindex libraries of @command{awk} functions, command-line options
18107 @command{awk} is an example of such a program
18110 correctly obey the command-line option. For example, @command{awk}'s
18111 @option{-F} option requires a string to use as the field separator.
18113 string that does not begin with @samp{-} ends the options.
18117 command-line arguments. The programmer provides a string describing the
18119 string with a colon. @code{getopt} is also passed the
18156 The string value of the argument to an option.
18169 arguments for @command{awk}:
18207 handy in @command{awk} programs as well. Following is an @command{awk}
18209 greatest weaknesses in @command{awk}, which is that it is very poor at
18221 @c file eg/lib/getopt.awk
18222 # getopt.awk --- do C library getopt(3) function in awk
18225 @c file eg/lib/getopt.awk
18234 @c file eg/lib/getopt.awk
18237 # Optarg -- string value of argument to current option
18257 The @code{getopt} function first checks that it was indeed called with a string of options
18263 @c file eg/lib/getopt.awk
18295 @c file eg/lib/getopt.awk
18321 the string of the next character to look at (we skip the @samp{-}, which
18326 If @code{thisopt} is not in the @code{options} string, then it is an
18343 @c file eg/lib/getopt.awk
18357 in the @code{options} string. If there are remaining characters in the
18359 string is assigned to @code{Optarg}. Otherwise, the next command-line
18365 @c file eg/lib/getopt.awk
18390 @c file eg/lib/getopt.awk
18413 $ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x
18421 $ awk -f getopt.awk -v _getopt_test=1 -- -a -x -- xyz abc
18431 the first @option{--} terminates the arguments to @command{awk}, so that it does
18445 @cindex libraries of @command{awk} functions, user database, reading
18460 information to the average user. There needs to be some way to find the
18482 While an @command{awk} program could simply read @file{/etc/passwd}
18631 # passwd.awk --- access password file information
18645 _pw_awklib = "/usr/local/libexec/awk/"
18680 @command{pwcat} is stored. Because it is used to help out an @command{awk} library
18681 routine, we have chosen to put it in @file{/usr/local/libexec/awk};
18706 @command{awk} implementation.
18718 The @code{getpwnam} function takes a username as a string argument. If that
18720 returns the null string:
18741 returns the null string:
18801 @command{awk} program, the check of @code{_pw_inited} could be moved out of
18803 this is not necessary, since most @command{awk} programs are I/O-bound, and it
18817 @cindex libraries of @command{awk} functions, group database, reading
18978 # group.awk --- functions for dealing with the group file
18994 _gr_awklib = "/usr/local/libexec/awk/"
19044 @command{grcat} is stored. Because it is used to help out an @command{awk} library
19045 routine, we have chosen to put it in @file{/usr/local/libexec/awk}. You might
19086 string:
19174 simple, relying on @command{awk}'s associative arrays to do work.
19187 @chapter Practical @command{awk} Programs
19189 @cindex @command{awk} programs, examples of
19194 presenting a potpourri of @command{awk} programs for your reading
19201 The second presents @command{awk}
19205 By reimplementing these programs in @command{awk},
19206 you can focus on the @command{awk}-related aspects of solving
19211 problems. Many of the programs are short, which emphasizes @command{awk}'s
19221 * Miscellaneous Programs:: Some interesting @command{awk} programs.
19230 awk -f @var{program} -- @var{options} @var{files}
19234 Here, @var{program} is the name of the @command{awk} program (such as
19235 @file{cut.awk}), @var{options} are any command-line options for the
19243 cut.awk -c1-8 myfiles > results
19246 If your @command{awk} is not @command{gawk}, you may instead need to use this:
19249 cut.awk -- -c1-8 myfiles > results
19256 @cindex POSIX, programs, implementing in @command{awk}
19259 @command{awk}. Reinventing these programs in @command{awk} is often enjoyable,
19261 very concise and simple. This is true because @command{awk} does so much for you.
19265 purpose is to illustrate @command{awk} language programming for ``real world''
19295 definition of fields is less general than @command{awk}'s.
19325 The @command{awk} implementation of @command{cut} uses the @code{getopt} library
19335 @cindex @code{cut.awk} program
19337 @c file eg/prog/cut.awk
19338 # cut.awk --- implement cut in awk
19341 @c file eg/prog/cut.awk
19348 @c file eg/prog/cut.awk
19381 @cindex @code{BEGIN} pattern, running @command{awk} programs and
19382 @cindex @code{FS} variable, running @command{awk} programs and
19391 string:
19394 @c file eg/prog/cut.awk
19415 if (FS == " ") # defeat awk semantics
19431 incorrect---@command{awk} would separate fields with runs of spaces,
19435 so that @command{awk} does not try to process the command-line options
19445 @c file eg/prog/cut.awk
19471 is used. The program lets @command{awk} handle the job of doing the
19475 @c file eg/prog/cut.awk
19520 @c file eg/prog/cut.awk
19579 @c file eg/prog/cut.awk
19598 other @command{awk} implementations to use @code{substr}
19620 expressions that are almost identical to those available in @command{awk}
19673 @cindex @code{egrep.awk} program
19675 @c file eg/prog/egrep.awk
19676 # egrep.awk --- simulate egrep in awk
19679 @c file eg/prog/egrep.awk
19686 @c file eg/prog/egrep.awk
19719 command line is used. The @command{awk} command-line arguments up to @code{ARGV[Optind]}
19720 are cleared, so that @command{awk} won't try to process them as files. If no
19726 @c file eg/prog/egrep.awk
19746 of @command{awk}.
19759 @c file eg/prog/egrep.awk
19767 The @code{beginfile} function is called by the rule in @file{ftrans.awk}
19775 @c file eg/prog/egrep.awk
19793 @c file eg/prog/egrep.awk
19828 @c file eg/prog/egrep.awk
19861 @c file eg/prog/egrep.awk
19875 @c file eg/prog/egrep.awk
19926 Here is a simple version of @command{id} written in @command{awk}.
19940 @cindex @code{id.awk} program
19942 @c file eg/prog/id.awk
19943 # id.awk --- implement id in awk
19948 @c file eg/prog/id.awk
19956 @c file eg/prog/id.awk
20067 Here is a version of @code{split} in @command{awk}. It uses the @code{ord} and
20078 @cindex @code{split.awk} program
20080 @c file eg/prog/split.awk
20081 # split.awk --- do split in awk
20086 @c file eg/prog/split.awk
20093 @c file eg/prog/split.awk
20132 @c file eg/prog/split.awk
20157 @c Exercise: do this with just awk builtin functions, index("abc..."), substr, etc.
20163 @c file eg/prog/split.awk
20183 This program is a bit sloppy; it relies on @command{awk} to automatically close the last file
20211 @code{tee} cannot use @code{ARGV} directly, since @command{awk} attempts to
20219 Finally, @command{awk} is forced to read the standard input by setting
20223 @cindex @code{tee.awk} program
20225 @c file eg/prog/tee.awk
20226 # tee.awk --- tee in awk
20229 @c file eg/prog/tee.awk
20237 @c file eg/prog/tee.awk
20264 @c file eg/prog/tee.awk
20300 @c file eg/prog/tee.awk
20342 is similar to @command{awk}'s default: nonwhitespace characters separated
20385 @cindex @code{uniq.awk} program
20387 @c file eg/prog/uniq.awk
20389 # uniq.awk --- do uniq in awk
20395 @c file eg/prog/uniq.awk
20402 @c file eg/prog/uniq.awk
20464 simply returns one or zero depending upon the result of a simple string
20479 @c file eg/prog/uniq.awk
20525 @c file eg/prog/uniq.awk
20600 by spaces and/or tabs. Luckily, this is the normal way @command{awk} separates
20607 Implementing @command{wc} in @command{awk} is particularly elegant,
20608 since @command{awk} does a lot of the work for us; it splits lines into
20627 @cindex @code{wc.awk} program
20629 @c file eg/prog/wc.awk
20630 # wc.awk --- count lines, words, characters
20633 @c file eg/prog/wc.awk
20639 @c file eg/prog/wc.awk
20679 @c file eg/prog/wc.awk
20702 @c file eg/prog/wc.awk
20730 @c file eg/prog/wc.awk
20743 @c file eg/prog/wc.awk
20765 @section A Grab Bag of @command{awk} Programs
20768 We hope you find them both interesting and enjoyable.
20781 * Igawk Program:: A wrapper for @command{awk} that includes
20800 This program, @file{dupword.awk}, scans through a file one line at a time
20820 @cindex @code{dupword.awk} program
20822 @c file eg/prog/dupword.awk
20823 # dupword.awk --- find duplicate words in text
20826 @c file eg/prog/dupword.awk
20834 @c file eg/prog/dupword.awk
20884 @cindex @code{alarm.awk} program
20886 @c file eg/prog/alarm.awk
20887 # alarm.awk --- set an alarm
20892 @c file eg/prog/alarm.awk
20899 @c file eg/prog/alarm.awk
20948 @c file eg/prog/alarm.awk
20990 @c file eg/prog/alarm.awk
21051 of standard @command{awk}: dealing with individual characters is very
21056 split each character in a string into separate array elements.}
21069 The string on which to do the translation.
21081 @command{awk} reads from the standard input.
21085 @cindex @code{translate.awk} program
21087 @c file eg/prog/translate.awk
21088 # translate.awk --- do tr-like stuff
21091 @c file eg/prog/translate.awk
21098 @c file eg/prog/translate.awk
21150 @command{awk} had added the @code{toupper} and @code{tolower} functions
21184 The @code{BEGIN} rule simply sets @code{RS} to the empty string, so that
21185 @command{awk} splits records at blank lines
21218 @cindex @code{labels.awk} program
21220 @c file eg/prog/labels.awk
21221 # labels.awk --- print mailing labels
21224 @c file eg/prog/labels.awk
21230 @c file eg/prog/labels.awk
21293 The following @command{awk} program prints
21295 associative nature of @command{awk} arrays by using strings as subscripts. It
21297 Finally, it shows how @command{awk} is used in conjunction with other
21318 It uses @command{awk}'s field-accessing mechanism
21333 Words are detected using the @command{awk} convention that fields are
21335 newlines) don't have any special meaning to @command{awk}. This means that
21339 The @command{awk} language considers upper- and lowercase characters to be
21352 The way to solve these problems is to use some of @command{awk}'s more advanced
21356 output of the @command{awk} script. Here is the new version of
21359 @cindex @code{wordfreq.awk} program
21361 @c file eg/prog/wordfreq.awk
21362 # wordfreq.awk --- print list of word frequencies
21379 Assuming we have saved this program in a file named @file{wordfreq.awk},
21383 awk -f wordfreq.awk file1 | sort -k 2nr
21388 decreasing frequency. The @command{awk} program suitably massages the
21391 The @command{awk} script's output is then sorted by the @command{sort}
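The pipeline described here, in which @command{awk} counts words and @command{sort} orders the counts, can be sketched as follows. This is an approximation written for illustration, not the text of @file{wordfreq.awk}: the function name `wordfreq`, the explicit character ranges, and the tab-separated output format are assumptions.

```shell
# Hypothetical sketch: count words case-insensitively, most frequent first.
wordfreq() {
    awk '{
        $0 = tolower($0)                  # fold case so The and the match
        gsub(/[^a-z0-9_ \t]/, "", $0)     # drop punctuation, then re-split
        for (i = 1; i <= NF; i++)
            freq[$i]++                    # strings as array subscripts
    }
    END {
        for (word in freq)
            printf "%s\t%d\n", word, freq[word]
    }' "$@" | sort -k 2nr                 # numeric, descending on the count
}

printf 'the cat\nThe dog\n' | wordfreq | head -n 1
```

Leaving the ordering to @command{sort} keeps the @command{awk} part short, since standard @command{awk} offers no way to traverse an array in sorted order.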
21402 @c file eg/prog/wordfreq.awk
21447 @cindex @code{histsort.awk} program
21449 @c file eg/prog/histsort.awk
21450 # histsort.awk --- compact a shell history file
21454 @c file eg/prog/histsort.awk
21461 @c file eg/prog/histsort.awk
21499 present a large number of @command{awk} programs.
21505 are the top level nodes for a large number of @command{awk} programs.
21534 or @command{awk}. Literal @samp{@@} symbols are represented in Texinfo source
21549 The following program, @file{extract.awk}, reads through a Texinfo source
21557 The rules in @file{extract.awk} match either @samp{@@c} or
21560 @file{extract.awk} uses the @code{join} library function
21566 @file{extract.awk} to extract the sample programs and install many
21567 of them in a standard directory where @command{gawk} can find them.
21576 @@c file examples/messages.awk
21584 @@c file examples/messages.awk
21591 @file{extract.awk} begins by setting @code{IGNORECASE} to one, so that
21598 @cindex @code{extract.awk} program
21600 @c file eg/prog/extract.awk
21601 # extract.awk --- extract files and run programs
21605 @c file eg/prog/extract.awk
21613 @c file eg/prog/extract.awk
21678 @c file eg/prog/extract.awk
21736 @c file eg/prog/extract.awk
21773 @command{awk}'s @code{gsub} function
21776 The following program, @file{awksed.awk}, accepts at least two command-line
21782 @cindex @code{awksed.awk} program
21786 @c file eg/prog/awksed.awk
21787 # awksed.awk --- do s/foo/bar/g using just print
21791 @c file eg/prog/awksed.awk
21798 @c file eg/prog/awksed.awk
21843 is set to the null string. In this case, we can print @code{$0} using
21850 @code{ARGV[1]} and @code{ARGV[2]} to the null string, so that they are
21883 @cindex libraries of @command{awk} functions, example program for using
21886 Using library functions in @command{awk} can be very beneficial. It
21889 However, using library functions is only easy when writing @command{awk}
21892 environment variable and the ability to put @command{awk} functions into a
21898 @@include getopt.awk
21899 @@include join.awk
21931 @command{awk} source code for later, when the expanded program is run.
21934 For any arguments that do represent @command{awk} text, put the arguments into
21950 Run an @command{awk} program (naturally) over the shell variable's contents to expand
21960 the text of the @command{awk} program that will expand the user's program, for the
21974 to the user's @command{awk} program without being evaluated.
22006 should be the @command{awk} program. If there are no command-line
22010 @code{program} contains the complete text of the original @command{awk}
22051 # diagnostic if $x is the null string
22104 The @command{awk} program to process @samp{@@include} directives
22106 the shell script readable. The @command{awk} program
22124 The only way to test if a file can be read in @command{awk} is to go
22126 does.@footnote{On some very old versions of @command{awk}, the test
22211 printf("igawk:%s:%d: cannot find %s\n",
22242 into the command line. It is saved as a single string, even if the results
22295 the initial collected @command{awk} program much simpler; all the
22320 @command{sh} and @command{awk} programming together. You can usually
22322 in C or C++, and it is frequently easier to do certain kinds of string
22323 and argument manipulation using the shell than it is in @command{awk}.
22340 @item default.awk
22344 @item site.awk
22347 Having a separate file allows @file{default.awk} to change with
22357 directives, @file{default.awk} could simply contain @samp{@@include}
22405 @appendix The Evolution of the @command{awk} Language
22407 This @value{DOCUMENT} describes the GNU implementation of @command{awk}, which follows
22409 Many long-time @command{awk} users learned @command{awk} programming
22410 with the original @command{awk} implementation in Version 7 Unix.
22411 (This implementation was the basis for @command{awk} in Berkeley Unix,
22414 for their @command{awk}.)
22416 evolution of the @command{awk} language, with cross-references to other parts
22417 of the @value{DOCUMENT} where you can find more information.
22426 version of @command{awk}.
22428 @command{awk}.
22435 @cindex @command{awk}, versions of
22437 @cindex @command{awk}, versions of, changes between V7 and SVR3.1
22439 The @command{awk} language evolved considerably between the release of
22486 C-compatible operator precedence, which breaks some old @command{awk}
22502 (Some vendors have updated their old versions of @command{awk} to
22523 @cindex @command{awk}, versions of, changes between SVR3.1 and SVR4
22524 The System V Release 4 (1989) version of Unix @command{awk} added these features
22530 @c gawk and MKS awk
22535 @c MKS awk
22555 The @code{toupper} and @code{tolower} built-in string functions
22580 @appendixsec Changes Between SVR4 and POSIX @command{awk}
22581 @cindex @command{awk}, versions of, changes between SVR4 and POSIX @command{awk}
22582 @cindex POSIX @command{awk}, changes in @command{awk} versions
22584 The POSIX Command Language and Utilities standard for @command{awk} (1992)
22597 The concept of a numeric string and tighter comparison rules to go
22645 @appendixsec Extensions in the Bell Laboratories @command{awk}
22647 @cindex @command{awk}, versions of, See Also Bell Laboratories @command{awk}
22648 @cindex extensions, Bell Laboratories @command{awk}
22649 @cindex Bell Laboratories @command{awk} extensions
22651 Brian Kernighan, one of the original designers of Unix @command{awk},
22654 This @value{SECTION} describes extensions in his version of @command{awk} that are
22655 not in POSIX @command{awk}:
22663 As a side note, his @command{awk} no longer needs these options;
22682 The @code{SYMTAB} array, that allows access to @command{awk}'s internal symbol
22689 The Bell Laboratories @command{awk} also incorporates the following extensions,
22717 @appendixsec Extensions in @command{gawk} Not in POSIX @command{awk}
22724 differences in standard awk functions
22735 @cindex extensions, in @command{gawk}, not in POSIX @command{awk}
22831 @code{IGNORECASE} changed, now applying to string comparison as well
22869 the original Version 7 Unix version of @command{awk}
22874 Bell Laboratories research version of @command{awk}
22920 The ability to use octal and hexadecimal constants in @command{awk}
22988 The @option{--gen-po} command-line option and the use of a leading
23000 profiles of @command{awk} programs
23064 designed and implemented Unix @command{awk},
23102 making it compatible with ``new'' @command{awk}, and
23242 * Other Versions:: Other freely available @command{awk}
23301 will be less busy, and you can usually find one closer to your site.
23383 A description of one area in which the POSIX standard for @command{awk} is
23400 The @command{troff} source for a five-color @command{awk} reference card.
23415 @item doc/awk.info
23463 @itemx po/*
23465 @command{gawk}'s internationalization features, while the @file{po} library
23468 @item awklib/extract.awk
23472 The @file{awklib} directory contains a copy of @file{extract.awk}
23614 in @command{awk} programs
23637 has no effect on the running @command{awk} program.
23833 libraries in @file{gnu/lib/awk}, and manual pages under @file{gnu/man}.
23848 libraries under @file{/usr/share/awk}, manual pages under @file{/usr/man},
23855 install-info --info-dir=x:/usr/info x:/usr/info/awk.info
23974 try GNU Make 3.79.1 or later versions. You should find the latest
23993 @command{awk} scripts, you'll need to either change the call to
24007 but you're essentially on your own. Post to @code{comp.lang.awk} or
24012 in @file{awk.h} of any variables you add to @file{gawkw32.def}.
24014 Note that extension libraries have the name of the @command{awk}
24017 rename @command{gawk.exe} to @command{awk.exe} or if you try to use
24051 @code{@w{".;c:/lib/awk;c:/gnu/lib/awk"}}.
24057 and @file{c:/usr/share/awk}.
24063 @code{@w{".;c:/usr/share/awk;e:/usr/share/awk"}}.
24066 or @command{cmd.exe} under OS/2) may be useful for @command{awk} programming.
24076 @cindex differences in @command{awk} and @command{gawk}, @code{BINMODE} variable
24100 @code{BINMODE=@var{non-null-string}} is
24103 message if the string is not one of @code{"rw"} or @code{"wr"}.
24109 command line is read, but before processing any of the @command{awk} program).
24121 files @file{binmode[1-3].awk} (under @file{gnu/lib/awk} in some of the
24140 gawk -v BINMODE=w -f binmode2.awk @dots{}
24157 gawk -f binmode1.awk @dots{}
24290 @command{awk} programming language.
24293 for @command{awk} program files. For the @option{-f} option, if the specified
24298 @command{gawk} appends the suffix @samp{.awk} to the filename and retries
24307 changes. They @emph{are} minor though, and all @command{awk} programs
24324 single parameter (as in the quoted string program above), the command
24334 The default search path, when looking for @command{awk} program files specified
24393 a large amount of memory with most @command{awk} programs, and should run on all
24404 redirection is necessary to make it easy to import @command{awk} programs
24454 anywhere in your @env{PATH} where your shell can find it.
24470 @code{@w{".,c:\lib\awk,c:\gnu\lib\awk"}}. The search path can be
24479 Although @command{awk} allows great flexibility in doing I/O redirections
24486 @command{awk} program using @code{print} statements explicitly redirected
24499 use only backslashes. Also remember that in @command{awk}, backslashes in
24523 For example, @file{array.c} becomes @file{ARRAYC}, and @file{awk.h}
24570 to the smallest possible @command{awk} program and input @value{DF} that
24591 @cindex @code{comp.lang.awk} newsgroup
24593 posting to the Usenet/Internet newsgroup @code{comp.lang.awk}.
24604 If you find bugs in one of the non-Unix ports of @command{gawk}, please send
24676 @appendixsec Other Freely Available @command{awk} Implementations
24678 @cindex @command{awk}, implementations
24681 Subject: C++ comments in awk programs
24688 @i{It's kind of fun to put comments like this in your awk code.}@*
24693 There are three other freely available @command{awk} implementations.
24698 @cindex source code, Bell Laboratories @command{awk}
24699 @item Unix @command{awk}
24701 @command{awk} freely available.
24708 @uref{http://cm.bell-labs.com/who/bwk/awk.shar}
24711 @uref{http://cm.bell-labs.com/who/bwk/awk.tar.gz}
24714 @uref{http://cm.bell-labs.com/who/bwk/awk.zip}
24723 for a list of extensions in this @command{awk} that are not in POSIX @command{awk}.
24729 Michael Brennan has written an independent implementation of @command{awk},
24744 @command{mawk} has the following extensions that are not in POSIX @command{awk}:
24792 @cindex @command{awka} compiler for @command{awk}
24796 @command{awka} translates @command{awk} programs into C, compiles them,
24798 @command{awk} functionality.
24801 The @command{awk} translator is released under the GPL, and the library
24808 @cindex @command{pawk} profiling Bell Labs @command{awk}
24811 the Bell Labs @command{awk} to provide timing and profiling information.
24815 profiling. You may find it at either
24854 for a summary of the GNU extensions to the @command{awk} language and program.
24874 If you find that you want to enhance @command{gawk} in a significant
25037 (I find context diffs to be more readable but unified diffs are
25212 are the files @file{awk.h}, @file{builtin.c}, and @file{eval.c}.
25213 Reading @file{awk.y} in order to see how the parse tree is built
25216 @cindex @code{awk.h} file (internal)
25218 members, functions, and macros are declared in @file{awk.h} and are of
25227 An @code{AWKNUM} is the internal type of @command{awk}
25246 This macro guarantees that a @code{NODE}'s string value is current.
25248 It also guarantees that the string is zero-terminated.
25260 The data and length of a @code{NODE}'s string value, respectively.
25261 The string is @emph{not} guaranteed to be zero-terminated.
25262 If you need to pass the string value to a C library function, save
25296 Take a C string and turn it into a pointer to a @code{NODE} that
25309 Take a C string and turn it into a pointer to a @code{NODE} that
25337 function @code{name}. @code{name} is a regular C string. @code{count}
25364 what the @command{awk} program sees as the return value from the
25365 new @command{awk} function.
25439 Two useful functions that are not in @command{awk} are @code{chdir}
25440 (so that an @command{awk} program can change its directory) and
25441 @code{stat} (so that an @command{awk} program can gather information about
25455 This @value{SECTION} shows how to use the new functions at the @command{awk}
25476 is set to a string indicating the error.
25481 The right way to model this in @command{awk} is to fill in an associative
25535 The file's ``printable mode.'' This is a string representation of
25540 A printable string representation of the file's type. The value
25573 system and the type of the file. You can test for them in your @command{awk}
25605 #include "awk.h"
25622 The file includes the @code{"awk.h"} header file for definitions
25627 By convention, for an @command{awk} function @code{foo}, the function that
25636 the argument to be a string and passes the string value to the
25652 Finally, the function returns the return value to the @command{awk} level,
25821 # file testff.awk
25828 print "Info for testff.awk"
25829 ret = stat("testff.awk", data)
25833 print "testff.awk modified:",
25841 $ gawk -f testff.awk
25842 @print{} Info for testff.awk
25854 @print{} data["name"] = testff.awk
25859 @print{} testff.awk modified: 07 19 99 08:25:36
25922 @command{awk} language level:
25927 It is not clear that the @command{awk}-level interface to the
25950 It may be possible to map a GDBM/NDBM/SDBM file into an @command{awk} array.
25994 @item Compilation of @command{awk} programs
26032 As this @value{DOCUMENT} is specifically about @command{awk},
26126 program such as @command{awk} reads your program, and then uses the
26240 This step corresponds to @command{awk}'s @code{BEGIN} rule
26253 @command{awk}'s pattern-action paradigm
26264 This step corresponds to @command{awk}'s @code{END} rule
26296 @command{awk} manages the reading of data for you, as well as the
26298 tell @command{awk} what to do with the data. You do this by describing
26301 @command{awk} programs usually makes them both easier to write
26312 @command{awk} has several predefined variables, and it has
26319 @cindex values, string
26321 Data, particularly in @command{awk}, consists of either numeric
26322 values, such as 42 or 3.1415927, or string values.
26326 Individual variables, as well as numeric and string variables, are
26357 @command{awk} uses @dfn{double-precision} floating-point numbers, which
26372 It is called the @dfn{null string}.
26373 The null string is character data that has no value.
26374 In other words, it is empty. It is written in @command{awk} programs
26394 the @command{awk} language.
26401 and Brian Kernighan was one of the creators of @command{awk}.)
26406 Where it makes sense, POSIX @command{awk} is compatible with 1990 ISO C.
26416 ``real'' numbers, i.e., those that have a fractional part. @command{awk}
26429 Internally, @command{awk} keeps both the numeric value
26430 (double-precision floating-point) and the string value for a variable.
26431 Separately, @command{awk} keeps
26436 It is important to note that the string value for a number may not
26439 The following program (@file{values.awk}) illustrates this:
26458 using @code{printf}, and then prints the string values obtained
26465 $ echo 2 3.654321 1.2345678 | awk -f values.awk
26472 what the default string representations show.
26492 $ awk '@{ printf("%010d\n", $1 * 100) @}'
26507 in @command{awk}, but simply an artifact of how computers
26542 A series of @command{awk} statements attached to a rule. If the rule's
26543 pattern matches an input record, @command{awk} executes the
26549 @cindex amazing @command{awk} assembler (@command{aaa})
26550 @item Amazing @command{awk} Assembler
26552 completely as @command{sed} and @command{awk} scripts. It is thousands
26563 commands, using @command{awk} and @command{sh}.
26569 to the beginning or end of the string, respectively.
26582 @command{awk} provides associative arrays.
26589 An @command{awk} expression that changes the value of some @command{awk}
26598 @item @command{awk} Language
26599 The language in which @command{awk} programs are written.
26601 @item @command{awk} Program
26602 An @command{awk} program consists of a series of @dfn{patterns} and
26605 @command{awk} programs may also contain function definitions.
26607 @item @command{awk} Script
26608 Another name for an @command{awk} program.
26630 @command{awk} lets you work with floating-point numbers and strings.
26650 The @command{awk} language provides built-in functions that perform various
26651 numerical, I/O-related, and string computations. Examples are
26653 substring of a string).
26655 and runtime string translation.
26676 are the variables that have special meaning to @command{awk}.
26689 Changing some of them affects @command{awk}'s running environment.
26702 @command{awk} programming language has C-like syntax, and this @value{DOCUMENT}
26703 points out similarities between @command{awk} and C when appropriate.
26726 It was written in @command{awk}
26741 A series of @command{awk} statements, enclosed in curly braces. Compound
26747 producing a new string. For example, the string @samp{foo} concatenated with
26748 the string @samp{bar} gives the string @samp{foobar}.
26768 @command{awk} for delimiting actions, compound statements, and function
26785 A description of @command{awk} programs, where you specify the data you
26801 @command{awk} stores numeric values. It is the C type @code{double}.
26805 ordinary expression. It could be a string constant, such as
26837 When @command{awk} reads an input record, it splits the record into pieces
26859 are controlled by the format string contained in the built-in variable
26868 or program-specific tasks. @command{awk} has a number of built-in
26885 The GNU implementation of @command{awk}.
26933 A single chunk of data that is read in by @command{awk}. Usually, an @command{awk} input
26949 @command{awk} is typically (but not always) implemented as an interpreter.
26955 in @command{awk} programs.
26967 In the @command{awk} language, a keyword is a word that has special
27009 @samp{&&}, @samp{||}, and @samp{!} in @command{awk}. Often called Boolean
27016 elements. In @command{awk}, a field designator can also be used as an
27020 The act of testing a string against a regular expression. If the
27021 regexp describes the contents of the string, it is said to @dfn{match} it.
27029 A string with no characters in it. It is represented explicitly in
27030 @command{awk} programs by placing two double quote characters next to
27035 A numeric-valued data object. Modern @command{awk} implementations use
27037 Very old @command{awk} implementations use single-precision floating-point.
27049 Patterns tell @command{awk} which input records are interesting to which
27062 @command{awk} users is
27073 functions and not for the main @command{awk} program. Special care must be
27079 can specify ranges of input lines for @command{awk} to process or it can
27100 @samp{R.*xp} matches any string starting with the letter @samp{R}
27101 and ending with the letters @samp{xp}. In @command{awk}, regexps are
27111 when you write the @command{awk} program and cannot be changed during
27115 A segment of an @command{awk} program that specifies how to process single
27117 @command{awk} reads an input record; then, for each rule, if the input record
27118 satisfies the rule's pattern, @command{awk} executes the rule's action.
27123 In @command{awk}, essentially every expression has a value. These values
27127 A single value, be it a number or a string.
27131 In @command{gawk}, a list of directories to search for @command{awk} program source files.
27146 The nature of the @command{awk} logical operators @samp{&&} and @samp{||}.
27162 This is the type used by some very old versions of @command{awk} to store
27182 string}. Constant strings are written with double quotes in the
27183 @command{awk} language and may contain escape sequences.
27224 record or a string.
28092 Robert J. Chassell points out that awk programs should have some indication
28153 Don't show the awk command with a program in quotes when it's
28161 awk '{