Lines Matching +refs:po +refs:next +refs:entry +refs:with +refs:regexp

87    with anything, it's your responsibility not to break the layout.
123 any later version published by the Free Software Foundation; with the
125 texts being (a) (see below), and with the Back-Cover Texts being (b)
168 included for their instructional value. They have been tested with care
171 liabilities with respect to the programs or applications.
277 * Sample Programs:: Many @command{awk} programs with complete
351 * Plain Getline:: Using @code{getline} with no arguments.
386 * Constants:: String, numeric and regexp constants.
390 * Using Constant Regexps:: When and how to use a regexp constant.
409 with @samp{<}, etc.
428 * Using Shell Variables:: How to use shell variables with
474 * Numeric Functions:: Functions that work with numbers, including
480 and @samp{&} with @code{sub}, @code{gsub},
483 * Time Functions:: Functions for dealing with timestamps.
504 * Two-way I/O:: Two-way communications with another
508 * Portal Files:: Using @command{gawk} with BSD portals.
686 it's no longer difficult to find a new @command{awk}. @command{gawk} ships with
722 data driven control-flow, pattern matching with regular expressions,
737 I recently experimented with an algorithm that for
775 when working with text files.
781 Such jobs are often easier with @command{awk}.
786 compatible with the System V Release 4 version of
787 @command{awk}. @command{gawk} is also compatible with the POSIX
789 properly written @command{awk} programs should work with @command{gawk}.
815 Experiment with algorithms that you can adapt later to other computer
838 how you can use it effectively. You should already be familiar with basic
842 be familiar with the ideas of I/O redirection and pipes.} as well as basic shell
898 This new version became widely available with Unix System V
911 Jay Fenlason completed it, with advice from Richard Stallman. John Woods
912 contributed parts of the code as well. In 1988 and 1989, David Trueman, with
914 with the newer @command{awk}.
920 from @command{awk}, and with a little help from me, set about adding
923 @cite{TCP/IP Internetworking with @command{gawk}}
926 with @command{gawk} @value{PVERSION} 3.1.
990 entry ``differences in @command{awk} and @command{gawk}.''}
1036 describes how @command{awk} programs can produce output with
1065 are the abilities to have two-way communications with another process,
1100 are completely unfamiliar with computer programming.
1108 If you find terms that you aren't familiar with, try looking them up here.
1159 by first pressing and holding the @kbd{CONTROL} key, next
1176 (often called ``dark corners'') are noted in this @value{DOCUMENT} with
1244 bundled on CD-ROMs with books about Linux.
1253 source code for the @value{DOCUMENT} comes with @command{gawk}; anyone
1271 version which I started working with in the fall of 1988.
1285 of @cite{The GAWK Manual}, with much additional material.
1296 but with significant additional material, reflecting the host of new features
1316 I started working with that version in the fall of 1988.
1319 In 1996, Edition 1.0 was released with @command{gawk} 3.0.0.
1324 but with significant additional material, reflecting the host of new features
1338 comes with the @command{gawk} distribution from the FSF.
1353 share with the rest of the world, please contact me (@email{arnold@@gnu.org}).
1408 Karl Berry helped significantly with the @TeX{} part of Texinfo.
1428 Although he is no longer involved with @command{gawk},
1429 working with him on this project was a significant pleasure.
1464 has been and continues to be a pleasure working with this team of fine
1484 and for sharing me with the computer.
1485 I would like to thank my parents for their love, and for the grace with
1488 He has sent my way, as well as for the gifts He has given me with which to
1504 It starts with the basics, and continues through all of the features of @command{awk}
1549 @chapter Getting Started with @command{awk}
1568 the data you want to work with and then what to do when you find it.
1570 detail, every step the program is to take. When working with procedural
1627 and run it with a command like this:
1633 This @value{SECTION} discusses both mechanisms, along with several
1652 Once you are familiar with @command{awk}, you will often type in simple
1750 This next simple @command{awk} program
1811 specify with @option{-f}, because most @value{FN}s don't contain any of the shell's
1831 @cindex @code{#} (number sign), @code{#!} (executable scripts), portability issues with
1832 @cindex number sign (@code{#}), @code{#!} (executable scripts), portability issues with
1849 After making this file executable (with the @command{chmod} utility),
1852 line beginning with @samp{#!} lists the full @value{FN} of an interpreter
1854 interpreter. The operating system then runs the interpreter with the given
1877 @subheading Advanced Notes: Portability Issues with @samp{#!}
1881 Often, this can be dealt with by using a symbolic link.
1913 In the @command{awk} language, a comment starts with the sharp sign
1951 It therefore prompts with the secondary prompt, waiting for more input.
1965 The next @value{SUBSECTION} describes the shell's quoting rules.
1983 Once you are working with the shell, it is helpful to have a basic
1990 Quoted items can be concatenated with nonquoted items as well as with other
1995 Preceding any single character with a backslash (@samp{\}) quotes
2036 @cindex single quote (@code{'}), with double quotes
2037 @cindex @code{'} (single quote), with double quotes
2113 computer bulletin board systems together with information about those systems.
2183 file into a file for use with @command{awk}
2261 you can come up with different ways to do the same things shown here:
2356 @section An Example with Two Rules
2366 @command{awk} reads the next line. (However,
2409 Note how the line beginning with @samp{sabafoo}
2457 @cindex line continuations, with C shell
2524 first line with a backslash character (@samp{\}). The backslash must be
2531 on the next line/ @{ print $1 @}'
2556 with the C shell.} It works for @command{awk} programs in files and
2561 in your awk program must be escaped with a backslash. To illustrate:
2575 Compare the previous example to how it is done with a POSIX-compliant shell:
2608 next line. However, the backslash-newline combination is never even
2617 with a semicolon (@samp{;}).
2628 separated with a semicolon was not in the original @command{awk}
2629 language; it was added for consistency with the treatment of statements
2643 @command{gawk} provides built-in functions for working with timestamps,
2664 Programs written with @command{awk} are usually much smaller than they would
2691 @cindex regexp, See regular expressions
2692 @c STARTOFRANGE regexp
2695 A @dfn{regular expression}, or @dfn{regexp}, is a way of describing a
2706 both. Such a regexp matches any string that contains that sequence.
2707 Thus, the regexp @samp{foo} matches any string containing @samp{foo}.
2774 @var{exp} ~ /@var{regexp}/
2779 matches @var{regexp}. The following example matches, or selects,
2780 all input records with the uppercase letter @samp{J} somewhere in the
2797 This next example is true if the expression @var{exp}
2799 does @emph{not} match @var{regexp}:
2802 @var{exp} !~ /@var{regexp}/
2818 @cindex regexp constants
2819 @cindex regular expressions, constants, See regexp constants
2820 When a regexp is enclosed in slashes, such as @code{/foo/}, we call it
2821 a @dfn{regexp constant}, much like @code{5.27} is a numeric constant and
2831 (@code{"foo"}) or regexp constants (@code{/foo/}).
2832 Instead, they should be represented with @dfn{escape sequences},
2833 which are character sequences beginning with a backslash (@samp{\}).
2846 string or regexp. Thus, the string whose contents are the two characters
2851 unprintable characters directly in a string constant or regexp constant,
2857 sequences apply to both string constants and regexp constants:
2924 A literal slash (necessary for regexp constants only).
2925 This expression is used when you want to write a regexp
2926 constant that contains a slash. Because the regexp is delimited by
2928 in order to tell @command{awk} to keep processing the rest of the regexp.
2941 with a backslash have special meaning in regexps.
2944 In a regexp, a backslash before any character that is not in the previous list
2947 means that the next character should be taken literally, even if it would
2948 normally be a regexp operator. For example, @code{/a\+b/} matches the three
2962 for both string constants and regexp constants. This happens very early,
2966 @command{gawk} processes both regexp constants and dynamic regexps
3014 escape to represent a regexp metacharacter.
3016 Does @command{awk} treat the character as a literal character or as a regexp
3026 escape sequences literally when used in regexp constants. Thus,
3034 You can combine regular expressions with special characters,
3043 are valid inside a regexp. They are introduced by a @samp{\} and
3082 matches a record that ends with a @samp{p}. The @samp{$} is an anchor
3097 matches any three-character sequence that begins with @samp{U} and ends
3098 with @samp{A}.
3104 character, which is a character with all bits equal to zero.
3141 means it matches any string that starts with @samp{P} or contains a digit.
3156 @cindex @code{*} (asterisk), @code{*} operator, as regexp operator
3157 @cindex asterisk (@code{*}), @code{*} operator, as regexp operator
3172 with backslashes.
3200 If there is one number in the braces, the preceding regexp is repeated
3202 If there are two numbers separated by a comma, the preceding regexp is
3204 If there is one number followed by a comma, then the preceding regexp
3221 and @command{egrep} consistent with each other.
3224 However, because old programs may use @samp{@{} and @samp{@}} in regexp
3230 For new programs that use @samp{@{} and @samp{@}} in regexp constants,
3231 it is good practice to always escape them with a backslash. Then the
3232 regexp constants are valid and work the way you want them to, using
3234 using a string constant with a regexp operator or function.}
3237 @cindex precedence, regexp operators
3248 stand for themselves when there is nothing in the regexp that precedes them.
3298 is compatible with other @command{awk}
3314 A character class is only valid in a regexp @emph{inside} the
3396 (called @dfn{collating elements}) that are represented with more than one
3408 then @code{[[.ch.]]} is a regexp that matches this collating element, whereas
3409 @code{[ch]} is a regexp that matches either @samp{c} or @samp{h}.
3417 ``e,'' ``@`e,'' and ``@'e.'' In this case, @code{[[=e=]]} is a regexp
3443 @cindex word, regexp definition of
3444 GNU software that deals with regular expressions provides a number of
3445 additional regexp operators. These operators are described in this
3448 Most of the additional operators deal with word matching.
3509 @command{gawk}'s regexp library routines consider the entire
3535 for @command{awk}. They are provided for compatibility with other
3542 that conflicts with the @command{awk} language's definition of @samp{\b}
3549 @c NOTE!!! Keep this in sync with the same table in the summary appendix!
3551 @c Should really do this with file inclusion.
3564 GNU regexp operators.
3567 GNU regexp operators described
3582 treated literally, even if they represent regexp metacharacters.
3636 When @code{IGNORECASE} is not zero, @emph{all} regexp and string
3661 thing you can do with @code{IGNORECASE} only is dynamically turn
3671 affected regexp operations only. It did not affect string comparison
3672 with @samp{==}, @samp{!=}, and so on.
3673 Beginning with @value{PVERSION} 3.0, both regexp and string comparison
3678 Beginning with @command{gawk} 3.0,
3683 for use with European languages.
3704 to make a change to the input record. Here, the regexp @code{/a+/}
3712 replaced with @samp{<A>} in this example:
3720 text matching and substitutions with the @code{match}, @code{sub}, @code{gsub},
3726 Understanding this principle is also important for regexp-based record
3744 regexp constant (i.e., a string of characters between slashes). It may
3747 regexp. A regexp that is computed in this way is called a @dfn{dynamic
3748 regexp}:
3756 This sets @code{digits_regexp} to a regexp that describes one or more digits,
3757 and tests whether the input record matches this regexp.
3762 operators, there is a difference between a regexp constant
3767 match the string on the lefthand side of the operator with the pattern
3771 @cindex regexp constants, slashes vs. quotes
3772 @cindex @code{\} (backslash), regexp constants
3773 @cindex backslash (@code{\}), regexp constants
3774 @cindex @code{"} (double quote), regexp constants
3775 @cindex double quote (@code{"}), regexp constants
3777 scanned twice? The answer has to do with escape sequences, and particularly
3778 with backslashes. To get a backslash into a regular expression inside a
3781 For example, @code{/\*/} is a regexp constant for a literal @samp{*}.
3782 Only one backslash is needed. To do the same thing with a string,
3787 @cindex troubleshooting, regexp constants vs. string constants
3788 @cindex regexp constants, vs. string constants
3789 @cindex string constants, vs. regexp constants
3790 Given that you can use both regexp and string constants to describe
3791 regular expressions, which should you use? The answer is ``regexp
3797 more difficult to read. Using regexp constants makes your programs
3802 It is more efficient to use regexp constants. @command{awk} can note
3803 that you have supplied a regexp and store it internally in a form that
3809 Using regexp constants is better form; it shows clearly that you
3810 intend a regexp match.
3815 @cindex regular expressions, dynamic, with embedded newlines
3819 character to be used inside a character list for a dynamic regexp:
3830 @cindex newlines, in regexp constants
3831 But a newline in a regexp constant works with no problem:
3844 @c ENDOFRANGE regexp
3851 locale setting can affect the way regexp matching works, often
3909 in order, processing all the data from one before going on to the next.
3926 used with it do not have to be named on the @command{awk} command line
3973 with the assignment operator, @samp{=}
3978 so that the very first record is read with the proper separator.
3993 rule in the @command{awk} program (the action with no pattern) prints each
3996 with each slash changed to a newline. Here are the results of running
4031 Note that the entry for the @samp{camelot} BBS is not split.
4106 ends at the next string that matches the regular expression; the next
4109 newline: a record ends at the beginning of the next matching string (the
4110 next newline in the input), and the following record starts just after
4122 with optional leading and/or trailing whitespace:
4139 of @code{RS} as a regexp and @code{RT}.
4170 consists of a character with all bits equal to zero, is a good
4388 prints a copy of the input file, with 10 subtracted from the second
4421 after adding a field, the record printed includes the new field, with
4480 The intervening field, @code{$5}, is created with an empty value
4482 and @code{NF} is updated with the value six.
4499 @c the comma before decrementing does NOT represent a tertiary entry
4511 print $0 # or whatever else with $0
4567 The value of @code{FS} can be changed in the @command{awk} program with the
4571 is read with the proper separator. To do this, use the special
4612 can massage it first with a separate @command{awk} program.)
4709 with leading whitespace intact. The assignment to @code{$2} rebuilds
4774 the @option{-F} and @option{-f} options have nothing to do with each other.
4803 figures that you really want your fields to be separated with tabs and
4805 if you really do want to separate your fields with @samp{t}s.
4852 On many Unix systems, each user has a separate entry in the system password
4855 the user's (encrypted or shadow) password. A password file entry might look
4875 processing. For example, with Unix @command{awk} and @command{gawk},
4877 to @code{FS} (the backslash is stripped). This creates a regexp meaning
4894 The character can even be a regexp metacharacter; it does not need
4897 @item FS == @var{regexp}
4898 Fields are separated by occurrences of characters that match @var{regexp}.
4899 Leading and trailing matches of @var{regexp} delimit empty fields.
4957 affects field splitting @emph{only} when the value of @code{FS} is a regexp.
4970 alphabetic character while ignoring case, use a regexp that will
4995 @command{gawk} @value{PVERSION} 2.13 introduced a facility for dealing with
4996 fixed-width fields with no distinctive field separator. For example,
5085 a system with card readers is another story!)
5125 information in one entry. In such cases, you can use multiline
5128 @cindex record separators, with multiline records
5142 encountered. The next record doesn't start until the first nonblank
5151 string @code{"\n\n+"} to @code{RS}. This regexp matches the newline
5156 So the next record doesn't start until
5177 or a regexp, this special feature of @code{RS} does not apply.
5190 regexp for that single character. For example, if the field
5199 list, where each entry is separated by blank lines. Consider a mailing
5247 program that deals with address lists.
5275 @item RS == @var{regexp}
5276 Records are separated by occurrences of characters that match @var{regexp}.
5277 Leading and trailing matches of @var{regexp} delimit empty records.
5290 @section Explicit Input with @code{getline}
5293 @cindex @code{getline} command, explicit input with
5322 * Plain Getline:: Using @code{getline} with no arguments.
5338 @subsection Using @code{getline} with No Arguments
5341 from the current input file. All it does in this case is read the next
5344 processing on the next record @emph{right now}. For example:
5371 */}) from the input. By replacing the @samp{print $0} with other
5389 By contrast, the @code{next} statement reads a new record
5390 but immediately begins processing it normally, starting with the first
5398 You can use @samp{getline @var{var}} to read the next record from
5401 For example, suppose the next line is a comment or a special string,
5451 Use @samp{getline < @var{file}} to read the next record from @var{file}.
5457 encounters a first field with a value equal to 10 in the current input
5544 lines that begin with @samp{@@execute}, which are replaced by the output
5657 The command that is started with @samp{@var{command} | getline} only
5662 @command{gawk} allows you to start a @dfn{coprocess}, with which two-way
5663 communications are possible. This is done with the @samp{|&}
5731 @c The comma before "setting with" does NOT represent a tertiary
5732 @cindex @code{FILENAME} variable, @code{getline}, setting with
5748 Using @code{FILENAME} with @code{getline}
5799 computing @emph{which} values to print. However, with two exceptions,
5804 For printing with specifications, you need the @code{printf} statement
5832 The @code{print} statement is used to produce output with simple, standardized
5844 relational operator; otherwise it could be confused with a redirection
5854 The simple statement @samp{print} with no items is equivalent to
5868 newline, the newline is output along with the rest of the string. A
5884 The next example, which is run on the @file{inventory-shipped} file,
5885 prints the first two fields of each input record, with a space between
5901 together in the output, with no space. The reason for this is that
5915 To someone unfamiliar with the @file{inventory-shipped} file, neither
5999 with assignments on the command line, before the names of the input
6003 record, separated by a semicolon, with a blank line added after each
6032 @section Controlling Numeric Output with @code{print}
6053 that @code{print} uses with @code{sprintf} when it wants to convert a
6112 relational operator; otherwise, it can be confused with a redirection
6124 Each format specifier says to output the next item in the argument list
6151 A format specifier starts with the character @samp{%} and ends with
6166 (The @samp{%i} specification is for compatibility with ISO C.)
6177 prints @samp{1.950e+03}, with a total of four significant figures, three of
6180 discussed in the next @value{SUBSECTION}.)
6192 prints @samp{1950.000}, with a total of four significant figures, three of
6195 discussed in the next @value{SUBSECTION}.)
6211 are floating-point; it is provided primarily for compatibility with C.)
6260 would be the next argument in the list. Positional specifiers begin
6261 counting with one. Thus:
6293 For numeric conversions, prefix positive values with a space and
6294 negative values with a minus sign.
6314 padded with zeros instead of spaces.
6324 pad with spaces on the left. For example:
6344 Preceding the @var{width} with a minus sign causes the output to be
6345 padded with spaces on the right, instead of on the left.
6439 prints the phone numbers (@code{$2}) next on the line. This
6490 Printing each column heading with the same format specification
6608 The unsorted list is written with an ordinary redirection, while
6611 The next example uses redirection to mail a message to the mailing
6647 can be read with @code{getline}.
6648 Thus @var{command} is a @dfn{coprocess}, which works together with,
6716 The @code{tolower} function returns its argument string with all
6755 they are often redirected with the shell, via the @samp{<}, @samp{<<},
6814 The file associated with file descriptor @var{N}. Such a file must
6830 @cindex troubleshooting, quotes with @value{FN}s
6845 first be closed with the @code{close} function
6856 in decimal form, terminated with a newline.
6860 in decimal form, terminated with a newline.
6864 in decimal form, terminated with a newline.
6867 Reading this file returns a single record terminated with a newline.
6868 The fields are separated with spaces. The fields represent the
6896 They may not be used as source files with the @option{-f} option.
6903 in the next release of @command{gawk}.
6914 Starting with @value{PVERSION} 3.1 of @command{gawk}, @command{awk} programs
6926 These @value{FN}s are used with the @samp{|&} operator for communicating
6927 with a coprocess
6958 in the next release of @command{gawk}.
6967 Starting with @value{PVERSION} 3.1, @command{gawk} @emph{always}
6999 If the same @value{FN} or the same shell command is used with @code{getline}
7004 The next time the same file or command is used with @code{getline},
7008 command associated with it is remembered by @command{awk}, and subsequent
7032 included). For example, if you open a pipe with this:
7039 then you must close it with this:
7045 Once this function call is executed, the next @code{getline} from that
7046 file or command, or the next @code{print} or @code{printf} to that
7068 begin reading it with @code{getline}.
7085 To run the same program a second time, with the same arguments.
7102 use @code{close} on your files when you are done with them.
7133 does not represent a file, pipe or coprocess that was opened with
7144 When using the @samp{|&} operator to communicate with a coprocess,
7151 The second argument should be a string, with either of the values
7181 that was never opened with a redirection, or if there is
7223 print command, "died with signal", exit_val - 128
7225 print command, "exited with code", exit_val
7255 combinations of these with various operators.
7258 * Constants:: String, numeric and regexp constants.
7259 * Using Constant Regexps:: When and how to use a regexp constant.
7270 affects comparison of numbers and strings with
7334 implementations may have difficulty with some character codes.
7361 Octal numbers start with a leading @samp{0},
7362 and hexadecimal numbers start with a leading @samp{0x} or @samp{0X}:
7383 useful when working with data that cannot be represented conveniently as
7400 when working with the built-in bit manipulation functions;
7441 @cindex regexp constants
7446 A regexp constant is a regular expression description enclosed in
7450 (which are just ordinary strings or variables that contain a regexp).
7456 @cindex dark corner, regexp constants
7458 operators, a regexp constant merely stands for the regexp that is to be
7460 However, regexp constants (such as @code{/foo/}) may be used like simple expressions.
7462 regexp constant appears by itself, it has the same meaning as if it appeared
7494 @cindex @command{gawk}, regexp constants and
7495 @cindex regexp constants, in @command{gawk}
7497 This code is ``obviously'' testing @code{$1} for a match against the regexp
7500 against the regexp @code{/foo/}. The result is either zero or one,
7518 @cindex differences in @command{awk} and @command{gawk}, regexp constants
7519 @cindex dark corner, regexp constants, as arguments to user-defined functions
7528 the third argument of @code{split} to be a regexp constant, but some
7531 This can lead to confusion when attempting to use regexp constants
7556 In this example, the programmer wants to pass a regexp constant to the
7561 @command{gawk} issues a warning when it sees a regexp constant used as
7589 with a digit. Case is significant in variable names; @code{a} and @code{A}
7593 variable's current value. Variables are given new values with
7635 When the assignment is preceded with the @option{-v} option,
7711 string, concatenate the empty string, @code{""}, with that number.
7723 with @code{CONVFMT} as the format
7727 @code{CONVFMT}'s default value is @code{"%.6g"}, which prints a value with
7761 specifies the output format to use when printing numbers with @code{print}.
7788 As of @value{PVERSION} 3.1.3, @command{gawk} fully complies with this aspect
7936 writing expressions next to one another, with no operator. For example:
7997 The precedence of concatenation, when mixed with other operators, is often
8003 > Subject: gawk 3.0.4 bug with {print -12 " " -24}
8128 String values that do not begin with a digit have a numeric value of
8143 is assigned. Thus, @samp{z = 1} is an expression with the value one.
8167 do arithmetic with the old value of the variable. For example, the
8293 @cindex advanced features, regexp constants
8294 @cindex dark corner, regexp constants, @code{/=} operator and
8295 @cindex @code{/} (forward slash), @code{/=} operator, vs. @code{/=@dots{}/} regexp constant
8296 @cindex forward slash (@code{/}), @code{/=} operator, vs. @code{/=@dots{}/} regexp constant
8297 @cindex regexp constants, @code{/=@dots{}/}, @code{/=} operator and
8303 @cindex ambiguity, syntactic: @code{/=} operator vs. @code{/=@dots{}/} regexp constant
8304 @cindex syntactic ambiguity: @code{/=} operator vs. @code{/=@dots{}/} regexp constant
8305 @cindex @code{/=} operator vs. @code{/=@dots{}/} regexp constant
8307 operator and regexp constants whose first character is an @samp{=}.
8363 but with the side effect of incrementing it.
8380 it subtracts one instead of adding it. As with @samp{++}, it can be used before
8570 @c thanks to Karl Berry, kb@cs.umb.edu, for major help with TeX tables
8579 % template (and each row). # is replaced by the text of that entry on
8590 % The doubled && before the next entry means `repeat the following
8593 % The template itself, \quad#\hfil, left-justifies with a little space before.
8603 % do with the columns of the table, we use \noalign to get it in there.
8684 True if the string @var{x} matches the regexp denoted by @var{y}.
8687 True if the string @var{x} does not match the regexp denoted by @var{y}.
8690 True if the array @var{array} has an element with the subscript @var{subscript}.
8748 In the next example:
8755 @cindex comparison expressions, string vs. regexp
8756 @c @cindex string comparison vs. regexp comparison
8757 @c @cindex regexp comparison vs. string comparison
8789 either a regexp constant (@code{/@dots{}/}) or an ordinary
8791 dynamic regexp (@pxref{Regexp Usage}; also
8794 @cindex @command{awk}, regexp constants and
8795 @cindex regexp constants
8797 expression in slashes by itself is also an expression. The regexp
8798 @code{/@var{regexp}/} is an abbreviation for the following comparison expression:
8801 $0 ~ /@var{regexp}/
8830 (@samp{||}), ``and'' (@samp{&&}), and ``not'' (@samp{!}), along with
8924 $1 == "START" @{ interested = ! interested; next @}
8926 $1 == "END" @{ interested = ! interested; next @}
8930 The variable @code{interested}, as with all @command{awk} variables, starts
8933 to true, using @samp{!}. The next rule prints lines as long as
8943 @cindex @code{next} statement
8944 @strong{Note:} The @code{next} statement is discussed in
8946 @code{next} tells @command{awk} to skip the rest of the rules, get the
8947 next record, and start processing the rules over again at the top.
8972 @var{if-true-exp} is computed next and its value becomes the value of
8973 the whole expression. Otherwise, @var{if-false-exp} is computed next
9030 The way to use a function is with a @dfn{function call} expression,
9036 The following examples show function calls with and without arguments:
9049 a variable with an expression inside parentheses.
9053 with user-defined functions. Each function expects a particular number
9054 of arguments. For example, the @code{sqrt} function must be called with
9283 a pattern with an associated action. This @value{CHAPTER} describes how
9296 * Using Shell Variables:: How to use shell variables with @command{awk}.
9354 This kind of pattern is simply a regexp constant in the pattern part of
9356 The pattern matches when the input record matches the regexp.
9385 slashes (@code{/@var{regexp}/}), or any expression whose string value
9402 (There is no output, because there is no BBS site with the exact name @samp{foo}.)
9403 Contrast this with the following regular expression match, which
9404 accepts any record with a first field that contains @samp{foo}:
9414 @cindex regexp constants, as patterns
9415 @cindex patterns, regexp constants as
9416 A regexp constant as a pattern is also a special case of an expression
9472 @subsection Specifying Record Ranges with Patterns
9514 combine a range pattern that describes the delimited text with the
9515 @code{next} statement
9518 record and start over again with the next input record. Such a program
9522 /^%$/,/^%$/ @{ next @}
9535 /^%$/ @{ skip = ! skip; next @}
9536 skip == 1 @{ next @} # skip lines with `skip' set
9541 program attempts to combine a range pattern with another, simpler test:
9550 with other patterns:
9609 or with Boolean operators (indeed, they cannot be used with any operators).
9613 @code{BEGIN} and @code{END} rules may be intermixed with other rules.
9649 The first has to do with the value of @code{$0} in a @code{BEGIN}
9679 relying on @code{$0} being null. Although one might generally get away with
9684 @cindex @code{next} statement, @code{BEGIN}/@code{END} patterns and
9686 @cindex @code{BEGIN} pattern, @code{next}/@code{nextfile} statements and
9687 @cindex @code{END} pattern, @code{next}/@code{nextfile} statements and
9688 Finally, the @code{next} and @code{nextfile} statements are not allowed
9845 Also supplied in @command{awk} are the @code{next}
9882 All the control statements start with special keywords, such as @code{if}
9888 single @dfn{compound statement} with curly braces, separating them with
9989 never executed and @command{awk} continues with the statement following
10033 makes @var{condition} true). Contrast this with the corresponding
10043 is false to begin with.
10093 This prints the first three fields of each input record, with one field per
10157 @var{do something with} array[i]
10195 @code{continue}, @code{next}, @code{nextfile} or @code{exit} is encountered,
10218 next @code{case} until execution halts. In the above example, for
10219 any case value starting with @samp{2} followed by one or more digits,
10257 or @code{while} statement could be replaced with a @code{break} inside
10286 statement outside of a loop as if it were a @code{next} statement
10302 As with @code{break}, the @code{continue} statement is used only inside
10304 over the rest of the loop body, causing the next cycle around the loop
10305 to begin immediately. Contrast this with @code{break}, which jumps out
10309 skip the rest of the body of the loop and resume execution with the
10328 @code{for} loop from the previous example with the following @code{while} loop:
10355 statement outside a loop: as if it were a @code{next}
10366 @subsection The @code{next} Statement
10367 @cindex @code{next} statement
10369 The @code{next} statement forces @command{awk} to immediately stop processing
10370 the current record and go on to the next record. This means that no
10374 Contrast this with the effect of the @code{getline} function
10376 @command{awk} to read the next record immediately, but it does not alter the
10378 with a new input record).
10384 rules, then the @code{next} statement is analogous to a @code{continue}
10389 with four fields, and it shouldn't fail when given bad input. To avoid
10397 next
10402 Because of the @code{next} statement,
10410 @c @cindex @code{next}, inside a user-defined function
10411 @cindex @code{BEGIN} pattern, @code{next}/@code{nextfile} statements and
10412 @cindex @code{END} pattern, @code{next}/@code{nextfile} statements and
10413 @cindex POSIX @command{awk}, @code{next}/@code{nextfile} statements and
10414 @cindex @code{next} statement, user-defined functions and
10415 @cindex functions, user-defined, @code{next}/@code{nextfile} statements and
10417 the @code{next} statement is used in a @code{BEGIN} or @code{END} rule.
10420 some other @command{awk} implementations don't allow the @code{next}
10423 Just as with any other @code{next} statement, a @code{next} statement inside a
10424 function body reads the next record and starts processing it with the
10426 If the @code{next} statement causes the end of the input to be reached,
10433 @cindex differences in @command{awk} and @command{gawk}, @code{next}/@code{nextfile} statements
10436 which is similar to the @code{next} statement.
10448 updated to the name of the next @value{DF} listed on the command line,
10450 starts over with the first rule in the program.
10458 Normally, in order to move on to the next @value{DF}, a program
10465 opened with redirections. It is not related to the main processing that
10466 @command{awk} does with the files listed in @code{ARGV}.
10474 @cindex functions, user-defined, @code{next}/@code{nextfile} statements and
10482 function body reads the next record and starts processing it with the
10485 @cindex @code{next file} statement, in @command{gawk}
10486 @cindex @command{gawk}, @code{next file} statement in
10490 words (@samp{next file}) for the @code{nextfile} statement.
10493 inconsistent. When it appeared after @code{next}, @samp{file} was a keyword;
10495 accepted; @samp{next file} generates a syntax error.
10537 called a second time from an @code{END} rule with no argument,
10544 exiting with a nonzero status. An @command{awk} program can do this
10545 using an @code{exit} statement with a nonzero argument, as shown
10600 specific to @command{gawk} are marked with a pound sign@w{ (@samp{#}).}
10648 how to split input with fixed columnar boundaries.
10691 is to simply say @samp{FS = FS}, perhaps with an explanatory comment.
10700 and all regular expression matching are case independent. Thus, regexp
10701 matching with @samp{~} and @samp{!~}, as well as the @code{gensub},
10703 functions, record termination with @code{RS}, and field splitting with
10704 @code{FS}, all ignore case when doing their particular regexp operations.
10714 and regexp operations are always case-sensitive.
10745 printing with the @code{print} statement. It works by being passed
10777 If it is a regexp, records are separated by
10778 matches of the regexp in the input text.
10830 @command{gawk} are marked with a pound sign@w{ (@samp{#}).}
10890 next file is opened.
10997 @code{"FS"} if field splitting with @code{FS} is in effect, or it is
10998 @code{"FIELDWIDTHS"} if field splitting with @code{FIELDWIDTHS} is in effect.
11118 other special command-line options, with their arguments, are also not
11119 entered. This includes variable assignments done with the @option{-v}
11141 Each time @command{awk} reaches the end of an input file, it uses the next
11142 element of @code{ARGV} as the name of the next input file. By storing a
11156 replaced with the null string.
11189 end the @command{awk} options with @option{--} and then supply
11199 into @code{ARGV} for the @command{awk} program to deal with. As soon
11201 options that it might otherwise recognize. The previous example with
11227 The @value{CHAPTER} finishes with a discussion of @command{gawk}'s facility
11238 Thus, you cannot have a variable and an array with the same name in the
11334 position with zero elements before it.
11402 When @command{awk} creates an array (e.g., with the @code{split}
11436 automatically creates that array element, with the null string as its value.
11495 The following program takes a list of lines, each beginning with a line
11502 begin with a number:
11524 When this program is run with the following input:
11547 If a line number is repeated, the last line with a given number overrides
11549 Gaps in the line numbers can be handled with an easy improvement to the
11582 program has previously used, with the variable @var{var} set to that index.
11588 least once) in the input, by storing a one into the array @code{used} with
11685 All the elements of an array may be deleted with a single statement
11752 @code{data[xyz]} subscripts @code{data} with the string value @code{"12.153"}
11770 @i{do something with} array[i]
11786 As with many things in @command{awk}, the majority of the time
11801 A reasonable attempt to do so (with some test
11860 two-dimensional array named @code{grid} is with
11868 concatenates them together, with a separator between them. This creates
11877 concatenated with an @samp{@@} between them, yielding @code{"5@@12"}; thus,
11881 it was stored with a single index or a sequence of indices. The two
11962 (@pxref{Scanning an Array}) with the
11982 an element with index @code{"1\034foo"} exists in @code{array}. (Recall
11983 that the default value of @code{SUBSEP} is the character with code 034.)
11985 iteration with the variable @code{combined} set to @code{"1\034foo"}.
11998 @section Sorting Array Values and Indices with @command{gawk}
12005 The order in which an array is scanned with a @samp{for (i in array)}
12020 @var{do something with} data[i]
12041 @var{do something with} dest[i]
12050 To do that, starting with @command{gawk} 3.1.2, use the
12061 @var{do something with} dest[i]
12080 @var{do something with} data[ind[i]]
12116 to work with values that represent time, do
12141 * Numeric Functions:: Functions that work with numbers, including
12146 * Time Functions:: Functions for dealing with timestamps.
12193 is called with a value of four for its actual parameter.
12205 6, and then 12, and @code{atan2} is called with the two arguments 6
12207 first becomes 10, then 11, and @code{atan2} is called with the
12214 the built-in functions that work with numbers.
12245 This returns the sine of @var{x}, with @var{x} in radians.
12249 This returns the cosine of @var{x}, with @var{x} in radians.
12344 specific to @command{gawk} are marked with a pound sign@w{ (@samp{#}):}
12348 @samp{&} with @code{sub}, @code{gsub}, and
12361 of the sorted values of @var{source} are replaced with sequential
12362 integers starting with one. If the optional array @var{dest} is specified,
12448 @item match(@var{string}, @var{regexp} @r{[}, @var{array}@r{]})
12452 @var{regexp}. It returns the character position, or @dfn{index},
12456 The @var{regexp} argument may be either a regexp constant
12458 In the latter case, the string is treated as a regexp to be matched.
12464 functions that work with regular expressions, such as
12467 @samp{@var{string} ~ @var{regexp}}.
12523 matched by @var{regexp}. If @var{regexp} contains parentheses,
12537 beginning with @command{gawk} 3.1.2,
12555 should be tested for with the @code{in} operator
12570 a regexp describing where to split @var{string} (much as @code{FS} can
12571 be a regexp describing where to split input records). If
12597 As with input field-splitting, when the value of @var{fieldsep} is
12600 Also as with input field-splitting, if @var{fieldsep} is the null string, each
12610 the third argument to be a regexp constant (@code{/abc/}) as well as a
12615 discussion of the difference between using a string constant or a regexp constant,
12622 way to delete an entire array with one statement.
12632 have printed out with the same arguments
12647 begins with a leading @samp{0}, @code{strtonum} assumes that @var{str}
12648 is an octal number. If @var{str} begins with a leading @samp{0x} or
12668 @item sub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]})
12672 leftmost, longest substring matched by the regular expression @var{regexp}.
12674 changed by replacing the matched text with @var{replacement}.
12677 The @var{regexp} argument may be either a regexp constant
12679 In the latter case, the string is treated as a regexp to be matched.
12702 leftmost longest occurrence of @samp{at} with @samp{ith}.
12708 stands for the precise substring that was matched by @var{regexp}. (If
12709 the regexp can match more than one string, then this precise substring
12732 illustrates the ``leftmost, longest'' rule in regexp matching
12739 For example, the following shows how to replace the first @samp{|} on each line with
12768 Finally, if the @var{regexp} is not a regexp constant, it is converted into a
12769 string, and then the value of that string is treated as the regexp to match.
12771 @item gsub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]})
12783 replaces all occurrences of the string @samp{Britain} with @samp{United
12792 @item gensub(@var{regexp}, @var{replacement}, @var{how} @r{[}, @var{target}@r{]}) #
12796 the regular expression @var{regexp}. Unlike @code{sub} and @code{gsub},
12799 beginning with @samp{g} or @samp{G}, then it replaces all matches of
12800 @var{regexp} with @var{replacement}. Otherwise, @var{how} is treated
12801 as a number that indicates which match of @var{regexp} to replace. If
12806 regexp in the replacement text. This is done by using parentheses in
12807 the regexp to mark the components and then specifying @samp{\@var{N}}
12822 As with @code{sub}, you must type two backslashes in order
12828 which match of the regexp should be changed:
12842 If the @var{how} argument is a string that does not begin with @samp{g} or
12847 If @var{regexp} does not match @var{target}, @code{gensub}'s return value
12901 with string concatenation, in the following manner:
12913 This returns a copy of @var{string}, with each uppercase character
12914 in the string replaced with its corresponding lowercase character.
12920 This returns a copy of @var{string}, with each lowercase character
12921 in the string replaced with its corresponding uppercase character.
12927 @subsubsection More About @samp{\} and @samp{&} with @code{sub}, @code{gsub}, and @code{gensub}
12954 @samp{\} and put the next character into the string. Thus, for
12961 the generated text with a single @samp{&}. Any other @samp{\} within
12963 through unchanged. To illustrate with a table:
12965 @c Thanks to Karl Berry for help with the TeX stuff.
13006 The problem with the historical approach is that there is no way to get
13058 in the @var{replacement} string must be preceded with a
13163 @c last comma in next two is part of tertiary
13206 Flush any buffered output associated with @var{filename}, which is either a
13246 a file or pipe that was opened for reading (such as with @code{getline}),
13252 @cindex interacting with other programs
13298 with a user sitting at a keyboard.@footnote{A program is interactive
13320 with this example:
13336 @subheading Advanced Notes: Controlling Output Buffering with @code{system}
13345 buffers is to call @code{system} with a null string as its argument:
13354 interpreter) with the empty command. Therefore, with @command{gawk}, this
13356 with other @command{awk} implementations, it does not necessarily avoid
13358 flush the buffer associated with the standard output and not necessarily
13424 working with timestamps. They are @command{gawk} extensions; they are
13451 minutes in a year with a leap second, which is why the
13457 The origin-zero Gregorian calendar is assumed, with year 0 preceding
13484 log file with the current time of day. In particular, it is easy to
13493 comparisons of dates and times, particularly when dealing with date and
13528 The century. This is the year divided by 100 and truncated to the next
13538 The day of the month, padded with a space if it is only one digit.
13577 with a 12-hour clock.
13609 and the next week is week one.)
13651 (These facilitate compliance with the POSIX @command{date} utility.)
13675 A public-domain C version of @code{strftime} is supplied with @command{gawk}
13685 Single-digit numbers are padded with a space.
13689 Single-digit numbers are padded with a space.
13715 provide an argument to it that begins with a @samp{+}, @command{date}
13832 The next operation is the @dfn{complement}; the complement of 1 is 0 and
13842 right by three bits, you end up with @samp{00010111}.@footnote{This example
13845 fill with 1's. Caveat emptor.}
13848 again with @samp{10111001} and shift it left by three bits, you end up
13849 with @samp{11001000}.
13997 ANDing the mask with the value indicates whether the
14005 Otherwise, at the end, it pads the value with zeros to represent multiples
14117 underscores that doesn't start with a digit.
14126 A function cannot have two parameters with the same name, nor may it
14127 have a parameter with the same name as the function itself.
14190 of the variable @samp{func} with the return value of the function @samp{foo}.
14251 When working with arrays, it is often necessary to delete all the elements
14252 in an array and start over with a new list of elements
14324 example, here is a call to @code{foo} with three arguments (the first
14334 to concatenate a variable with an expression in parentheses. However, it
14433 @cindex portability, @code{next} statement in user-defined functions
14435 error if you use the @code{next} statement
14458 A @code{return} statement with no value expression is assumed at the end of
14485 You call @code{maxelt} with one argument, which is an array name. The local
14561 @chapter Internationalization with @command{gawk}
14619 by a program, either directly or via formatting with @code{printf} or
14643 A table with strings of option names is not (e.g., @command{gawk}'s
14655 collected into a portable object file (@file{guide.po}),
14661 @cindex @code{.po} files
14662 @cindex files, @code{.po}
14666 For each language with a translator, @file{guide.po}
14667 is copied and translations are created and shipped with the application.
14674 Each language's @file{.po} file is converted into a binary
14809 For compatibility with GNU @code{gettext}, the default
14815 String constants marked with a leading underscore
14903 Mark all translatable strings with a leading underscore (@samp{_})
14932 with the @code{bindtextdomain} built-in function:
14956 @cindex @code{.po} files
14957 @cindex files, @code{.po}
14961 be extracted to create the initial @file{.po} file.
14965 @command{gawk}'s @option{--gen-po} command-line option extracts
14966 the messages and is discussed next.
14982 @cindex @code{--gen-po} option
14988 @cindex @code{--gen-po} option
14992 First, use the @option{--gen-po} command-line option to create
14993 the initial @file{.po} file:
14996 $ gawk --gen-po -f guide.awk > guide.po
15000 When run with @option{--gen-po}, @command{gawk} does not execute your
15005 second argument to @code{dcngettext}.@footnote{Starting with @code{gettext}
15006 version 0.11.5, the @command{xgettext} utility that comes with GNU
15067 Positional specifiers can be used with the dynamic field width and
15080 @strong{Note:} When using @samp{*} with a positional specifier, the @samp{*}
15084 @cindex @code{printf} statement, positional specifiers, mixing with regular formats
15086 @cindex positional specifiers, @code{printf} statement, mixing with regular formats
15087 @cindex format specifiers, mixing regular with positional specifiers
15089 and those with positional specifiers in the same string:
15119 if (Test_Guide) # set with -v
15138 as the concatenation of a variable named @code{_} with the string
15208 Run @samp{gawk --gen-po} to create the @file{.po} file:
15211 $ gawk --gen-po -f guide.awk > guide.po
15218 @c file eg/data/guide.po
15234 @strong{Note:} Strings not marked with a leading underscore do not
15235 appear in the @file{guide.po} file.
15244 $ cp guide.po guide-mellow.po
15245 @var{Add translations to} guide-mellow.po @dots{}
15253 @c file eg/data/guide-mellow.po
15267 The next step is to make the directory to hold the binary message object
15277 @cindex @code{.po} files, converting to @code{.mo}
15278 @cindex files, @code{.po}, converting to @code{.mo}
15279 @cindex @code{.mo} files, converting from @code{.po}
15280 @cindex files, @code{.mo}, converting from @code{.po}
15287 @file{.po} file to machine-readable @file{.mo} file.
15293 $ msgfmt guide-mellow.po
15341 @cindex @code{--with-included-gettext} configuration option
15342 @cindex configuration option, @code{--with-included-gettext}
15344 configure @command{gawk} with the @option{--with-included-gettext} option
15377 @value{DOCUMENT}, is described in full detail, along with the basics
15389 * Two-way I/O:: Two-way communications with another process.
15391 * Portal Files:: Using @command{gawk} with BSD portals.
15403 If you run @command{gawk} with the @option{--non-decimal-data} option,
15437 Because it is common to have decimal data with leading zeros, and because
15453 @section Two-Way Communications with Another Process
15485 @cindex advanced features, @command{gawk}, processes, communicating with
15486 @cindex processes, two-way communications with
15490 done with temporary files:
15495 while (@var{not done with data})
15510 to be using a temporary file with the same name.
15516 @cindex @command{csh} utility, @code{|&} operator, comparison with
15517 Starting with @value{PVERSION} 3.1 of @command{gawk}, it is possible to
15519 termed a @dfn{coprocess}, since it runs in parallel with @command{gawk}.
15534 that runs the other program. Output created with @code{print}
15538 As is the case with processes started by @samp{|}, the subprogram
15611 Beginning with @command{gawk} 3.1.2, you may use Pseudo-ttys (ptys) for
15656 by recognizing special @value{FN}s that begin with @samp{/inet/}.
15713 See @cite{TCP/IP Internetworking with @command{gawk}},
15720 @section Using @command{gawk} with BSD Portals
15731 is configured with the @option{--enable-portals} option
15734 files whose pathnames begin with @code{/p} as 4.4 BSD-style portals.
15738 When used with the @samp{|&} operator, @command{gawk} opens the file
15740 then manages creating the process associated with the portal and
15741 the corresponding communications with the portal's process.
15754 Beginning with @value{PVERSION} 3.1 of @command{gawk}, you may produce execution
15756 This is done with a specially compiled version of @command{gawk},
15781 Regular @command{gawk} also accepts this option. When called with just
15915 For user-defined functions, the count next to the @code{function}
15917 The counts next to the statements in the body show how many times
15923 The layout uses ``K&R'' style with tabs.
16010 Along with the regular profile, as shown earlier, the profile
16055 @command{gawk} do with non-option arguments.
16083 There are two ways to run @command{awk}---with an explicit program or with
16100 It is possible to invoke @command{awk} with an empty program:
16126 Options begin with a dash and consist of a single character.
16133 If a particular option with a value is given more than once, it is the
16175 @cindex built-in variables, @code{-v} option, setting with
16177 @cindex variables, built-in, @code{-v} option, setting with
16203 The full list of @command{gawk}-specific options is provided next.
16209 are not treated as options even if they begin with @samp{-}. This
16213 @cindex @code{-} (hyphen), filenames beginning with
16214 @cindex hyphen (@code{-}), filenames beginning with
16215 This is useful if you have @value{FN}s that start with @samp{-},
16217 by the user that could start with @samp{-}.
16268 You would also use this option if you have a large program with a lot of
16271 (This is a particularly easy mistake to make with simple variable
16274 @item -W gen-po
16275 @itemx --gen-po
16276 @cindex @code{--gen-po} option
16332 Use with care.
16344 @c IMPORTANT! Keep this list in sync with the one in node POSIX
16411 When run with @command{gawk}, the profile is just a ``pretty printed'' version
16412 of the program. When run with @command{pgawk}, the profile contains execution
16431 Allows you to mix source code in files with source
16445 with respect to whatever the Free Software Foundation is currently
16452 any other options are flagged as invalid with a warning message but
16553 arguments is made when @command{awk} is about to open the next input file.
16604 on the command line with the @option{-f} option.
16612 file with the specified name.
16630 the command line with a short @value{FN}. Otherwise, the full @value{FN}
16642 @file{.} explicitly in the path or by writing a null entry in the
16643 path. (A null entry is indicated by starting or ending the path with a
16644 colon or by placing two colons next to each other (@samp{::}).) If the
16650 Starting with @value{PVERSION} 3.0, if @env{AWKPATH} is not defined in the
16672 they will @emph{not} be in the next release).
16676 @cindex @code{next file} statement, deprecated
16677 @cindex @code{nextfile} statement, @code{next file} statement and
16682 The use of @samp{next file} (two words) for @code{nextfile} was deprecated
16683 in @command{gawk} 3.0 but still worked. Starting with @value{PVERSION} 3.1, the
16693 They will be removed from the next release of @command{gawk}.
16902 Also, verify that all regexp and string constants used in
16941 private variables that will not conflict with any variables used by
16948 private variables with an underscore (@samp{_}). Users generally don't use
16951 with the user's program.
16969 variable's name with a capital letter---for
17055 function nextfile() @{ _abandon_ = FILENAME; next @}
17056 _abandon_ == FILENAME @{ next @}
17064 then the action part of the rule executes a @code{next} statement to
17065 go on to the next record. (The use of @samp{_} in the variable name is
17069 The use of the @code{next} statement effectively creates a loop that reads
17074 fails, and execution continues with the first rule of the ``real'' program.
17077 and then executes a @code{next} statement to start the
17084 execute @code{next} from within a function body. Some other workaround
17092 or even with just a variable assignment between them,
17114 function nextfile() @{ _abandon_ = FILENAME; next @}
17120 next
17126 equal to the current @value{FN} and then executes a @code{next} statement.
17127 The @code{next} statement reads the next record and increments @code{FNR}
17130 then @command{awk} closes the current @value{DF} and moves on to the next
17132 and @code{FNR} is reset to one. If this next file is the same as
17138 fail (until the next time that @code{nextfile} is called).
17141 and the program executes a @code{next} statement to skip through it.
17144 functionality of @code{nextfile} can be provided with a library file,
17155 next one, which saves a lot of time. This is particularly important in
17176 that a condition or set of conditions is true. Before proceeding with a
17275 There is a small problem with this version of @code{assert}.
17287 with an @code{exit} statement.
17482 used ASCII, but with mark parity, meaning that the leftmost bit in the byte
17539 should also have a reasonable default behavior. It is called with an array
17542 assumption since the array was likely created with @code{split}
17578 then @code{join} joins the strings with no separator between them.
17594 provide the minimum functionality necessary for dealing with the time of day
17600 with preformatted time information. It returns a string with the current
17618 # Populates the array argument time with individual values:
17726 @emph{portably}; this works with any implementation of @command{awk}:
17823 and then start over with it from the top.
17849 # make current file next to get done
17885 it stops with a fatal error. There are times when you
17921 Removing the element from @code{ARGV} with @code{delete}
17933 end of file indication, closes the file, and proceeds on to the next
18040 to disable command-line assignments. However, some simple programming with
18113 string that does not begin with @samp{-} ends the options.
18119 string with a colon. @code{getopt} is also passed the
18123 next option letter that it finds, or @samp{?} if it finds an invalid option.
18142 Notice that when the argument is grouped with its option, the rest of
18251 The function starts out with
18257 The @code{getopt} function first checks that it was indeed called with a string of options
18282 The next thing to check for is the end of the options. A @option{--}
18284 does not begin with a @samp{-}. @code{Optind} is used to step through
18317 grouped together with one @samp{-} (e.g., @option{-abx}), it is necessary
18321 the string of the next character to look at (we skip the @samp{-}, which
18323 obtained with @code{substr}. It is saved in @code{Optopt} for the main
18332 next option character. If @code{_opti} is greater than or equal to the
18334 to the next argument, so @code{Optind} is incremented and @code{_opti} is reset
18359 string is assigned to @code{Optarg}. Otherwise, the next command-line
18379 next element in @code{argv}. If neither condition is true, then only
18380 @code{_opti} is incremented, so that the next option letter can be processed
18381 on the next call to @code{getopt}.
18461 user information associated with the user and group ID numbers. This
18476 The primary function is @code{getpwent}, for ``get password entry.''
18478 @file{/etc/passwd}, which stores user information, along with the
18489 is called, it returns the next entry in the database. When there are
18571 The user's full name, and perhaps other information associated with the
18593 @item Full name @tab The user's full name, and perhaps other information associated with the
18626 @c Answer: return foo[key] returns "" if key not there, no need to check with `in'.
18695 with @code{FIELDWIDTHS} is in effect or not.
18702 is @code{"FIELDWIDTHS"} if field splitting is being done with
18757 The @code{getpwent} function simply steps through the database, one entry at
18792 functions. If this library file is loaded along with a user's program, but
18840 complete information. Therefore, as with the user database, it is necessary
18912 separated with colons and represent the following information:
18978 # group.awk --- functions for dealing with the group file
19074 subtle problem with the code just presented. Suppose that
19075 the first time there were no names. This code adds the names with
19103 looks up the information associated with that group ID:
19136 The @code{getgrent} function steps through the database one entry at a time.
19167 As with the user database routines, each function calls @code{_gr_init} to
19203 These are programs that you are hopefully already familiar with,
19236 program that start with a @samp{-}, and @var{files} are the actual @value{DF}s.
19310 may be separated by commas, and ranges of characters can be separated with
19330 The program begins with a comment describing the options, the library
19431 incorrect---@command{awk} would separate fields with runs of spaces,
19432 tabs, and/or newlines, and we want them to be separated with individual
19438 After dealing with the command-line options, the program verifies that the
19575 If the next field also has data, then the separator character is
19582 next
19607 @c Exercise: Rewrite using split with "".
19658 Use @var{pattern} as the regexp to match. The purpose of the @option{-e}
19659 option is to allow patterns that start with a @samp{-}.
19667 The program begins with a descriptive comment and then a @code{BEGIN} rule
19668 that processes the command-line arguments with @code{getopt}. The @option{-i}
19669 (ignore case) option is particularly easy with @command{gawk}; we just use the
19718 pattern is supplied with @option{-e}, the first nonoption on the
19748 The next set of lines should be uncommented if you are not using
19754 commented out since it is not necessary with @command{gawk}:
19772 is called with a parameter, but that we're not interested in its value):
19810 using the @samp{!} operator. @code{fcount} is incremented with the value of
19813 @code{next} statement just moves on to the next record.
19820 line in this file matched, and we can skip on to the next file with
19822 print the @value{FN}, and then skip to the next file with @code{nextfile}.
19823 Finally, each line is printed, with a leading @value{FN} and colon
19837 next
19893 rule uses backslash continuation, with the open brace on a line by
19935 The code is repetitive. The entry in the user database for the real user ID
20029 This loop works by starting at one, concatenating the value with
20060 1000 lines in it, with the likely exception of the last file. To change the
20062 preceded with a minus; e.g., @samp{-500} for files with 500 lines in them
20122 The next rule does most of the work. @code{tcount} (temporary count) tracks
20127 moves to the next letter in the alphabet and @code{s2} starts over again at
20157 @c Exercise: do this with just awk builtin functions, index("abc..."), substr, etc.
20346 Skip @var{n} characters before comparing lines. Any fields specified with
20367 The program begins with a @code{usage} function and then a brief outline of
20369 The @code{BEGIN} rule deals with the command-line arguments and options. It
20371 treating such an option as the option letter @samp{2} with an argument of
20374 concatenated with the option digit and then the result is added to zero to make
20377 @code{getopt} processes it next time. This code is admittedly a bit
20528 next
20542 next
20802 word on a line (in the variable @code{prev}) for comparison with the first
20803 word on the next line.
20808 The next statement replaces nonalphanumeric and nonwhitespace characters
20809 with spaces, so that punctuation does not affect the comparison either.
20810 The characters are replaced with spaces so that formatting controls
20840 next
20941 The next @value{SECTION} of code turns the alarm time into hours and minutes,
20985 @command{sleep} exited with an OK status (zero), then the program prints the
21033 first list is replaced with the first character in the second list,
21034 the second character in the first list is replaced with the second
21046 prove that character transliteration could be done with a user-level
21051 of standard @command{awk}: dealing with individual characters is very
21177 than 5 lines of data. Each address is separated from the next by a blank
21192 have to print horizontally; @code{line[1]} next to @code{line[6]},
21193 @code{line[2]} next to @code{line[7]}, and so on. Two loops are used to
21198 the row, and @samp{i+j+5} is the entry next to it. The output ends up
21297 Finally, it shows how @command{awk} is used in conjunction with other
21298 utility programs to do a useful task of some complexity with a minimum of
21507 If you want to experiment with these programs, it is tedious to have to type
21523 The Texinfo language is described fully, starting with
21538 Comments start with either @samp{@@c} or @samp{@@comment}.
21596 exits with a zero exit status, signifying OK:
21622 next
21659 ignores it and goes on to the next line.
21673 When the processing of the array is finished, @code{join} is called with the
21684 next
21722 Output done with @samp{>} only opens the file once; it stays open and
21770 Here, @samp{s/old/new/g} tells @command{sed} to look for the regexp
21771 @samp{old} on each input line and globally replace it with the text
21777 arguments: the pattern to look for and the text to replace it with. Any
21829 The program relies on @command{gawk}'s ability to have @code{RS} be a regexp,
21840 doesn't end with text that matches @code{RS}. Using a @code{print}
21860 Exercise, compare the performance of this version with the more
21913 with @samp{@@include} can contain further @samp{@@include} statements.
21939 Literal text, provided with @option{--source} or @option{--source=}. This
21943 Source @value{FN}s, provided with @option{-f}. We use a neat trick and append
21955 Run the expanded program with @command{gawk} and any other original command-line
21968 The next part loops through all the command-line arguments.
21977 This indicates that the next option is specific to @command{gawk}. To make
21980 programming trick. Don't worry about it if you are not familiar with
21987 The @value{FN} is appended to the shell variable @code{program} with an
22122 the @value{FN} is concatenated with the name of each directory in
22125 ahead and try to read it with @code{getline}; this is what @code{pathto}
22165 splitting the path on @samp{:}, null elements are replaced with @code{"."},
22180 The stack is initialized with @code{ARGV[1]}, which will be @file{/dev/stdin}.
22181 The main loop comes next. Input lines are read in succession. Lines that
22182 do not start with @samp{@@include} are printed verbatim.
22183 If the line does start with @samp{@@include}, the @value{FN} is in @code{$2}.
22187 The next thing to check is if the file is included already. The
22250 Run @command{gawk} with the @samp{@@include}-processing program (the
22261 The last step is to call @command{gawk} with the expanded program,
22262 along with the original
22294 Using @samp{@@include} even for the files named with @option{-f} makes building
22299 Not trying to save the line read with @code{getline}
22301 file's accessibility for use with the main program simplifies things
22316 aren't familiar with @command{sh}.
22347 Having a separate file allows @file{default.awk} to change with
22410 with the original @command{awk} implementation in Version 7 Unix.
22416 evolution of the @command{awk} language, with cross-references to other parts
22441 System V Release 3.1 (1987). This @value{SECTION} summarizes the changes, with
22570 The use of regexp constants, such as @code{/foo/}, as expressions, where
22598 with it (@pxref{Typing and Comparison}).
22608 @c IMPORTANT! Keep this list in sync with the one in node Options
22712 The ability to delete all of an array at once with @samp{delete @var{array}}
22740 They can all be disabled with either the @option{--traditional} or
22792 The @code{next file} statement for skipping to the next @value{DF}
22814 The ability to delete all of an array at once with @samp{delete @var{array}}
22818 The ability to use GNU-style long-named options that start with @option{--}
22832 as regexp operations
22850 allowing it to be called with no arguments
22859 The ability for @code{RS} to be a regexp
22863 The @code{next file} statement became @code{nextfile}
22939 for capturing text-matching subexpressions within a regexp
22979 @cindex @code{next file} statement
22980 The support for @samp{next file} as two words was removed completely
22988 The @option{--gen-po} command-line option and the use of a leading
23005 pathnames that begin with @file{/p} as BSD portals
23029 The source code now uses new-style function definitions, with
23030 @command{ansi2knr} to convert the code on systems with old compilers.
23102 making it compatible with ``new'' @command{awk}, and
23118 did the initial ports to MS-DOS with various versions of MSC.
23134 He continues to provide portability checking with DEC Alpha
23305 @command{gawk} is distributed as a @code{tar} file compressed with the
23329 but when retrieving distributions, you should get the version with the highest
23374 releases, with some indication of the time frame for the feature, based
23412 It should be processed with @TeX{} to produce a printed document, and
23413 with @command{makeinfo} to produce an Info or HTML file.
23424 @cite{TCP/IP Internetworking with @command{gawk}}.
23426 It should be processed with @TeX{} to produce a printed document and
23427 with @command{makeinfo} to produce an Info or HTML file.
23431 @cite{TCP/IP Internetworking with @command{gawk}}.
23463 @itemx po/*
23465 @command{gawk}'s internationalization features, while the @file{po} library
23545 (The @command{autoconf} software is described fully starting with
23606 with @file{/p} as BSD portal files when doing two-way I/O with
23619 @cindex @code{--with-included-gettext} configuration option
23620 @cindex @code{--with-included-gettext} configuration option, configuring @command{gawk} with
23621 @cindex configuration option, @code{--with-included-gettext}
23622 @item --with-included-gettext
23623 Use the version of the @code{gettext} library that comes with @command{gawk}.
23639 When used with GCC's automatic dead-code-elimination, this option
23642 with other compilers are likely to vary.
23654 You should also use this option if @option{--with-included-gettext}
23688 do so by not exiting with an error when a library function is not
23765 included with BeOS. The process is basically identical to the Unix process
23810 that various ``DOS extenders'' are often used with programs such as
23832 under the @file{gnu} directory, with executables in @file{gnu/bin},
23884 for @file{ChangeLog}) to the directory with the rest of the @command{gawk}
23885 sources. The @file{Makefile} contains a configuration section with comments and
23886 may need to be edited in order to work with your @command{make} utility.
23897 of the tests work properly with Stewartson's shell along with the
23973 the Makefiles of this package. If you encounter any problems with @command{make}
23984 To compile @command{gawk} with dynamic extension support,
23999 If you build @command{gawk.exe} with one compiler but want to build
24000 an extension library with the other, you need to copy the import
24016 with @command{gawk.exe}. In particular, they won't work if you
24126 appropriate for files with the DOS-style end-of-line.
24211 also a @file{Makefile} for use with the @code{MMS} utility. From the source
24258 a @code{DCL} symbol whose value begins with a dollar sign. For example:
24273 Optionally, the help entry can be loaded into a VMS help library:
24322 to the original shell-style interface (see the help entry for details).
24364 This has been tested with VAX/VMS V6.2, VMS POSIX V2.0, and DEC C V5.2.
24393 a large amount of memory with most @command{awk} programs, and should run on all
24417 port was done with @command{gcc}. You may actually prefer executables
24434 its results with the sample versions and possibly make adjustments.
24437 @samp{atarist}. This basically assumes the TOS environment with @command{gcc}.
24480 from within a program, this facility should be used with care on the ST
24492 When @command{gawk} is compiled with the ST version of @command{gcc} and its
24533 @samp{@}} characters to be escaped with @samp{~} on the command line
24542 records with no ``end-of-line'' character. That is, @samp{-mr 74} tells
24560 If you have problems with @command{gawk} or think that you have found a bug,
24584 You can get this information with the command @samp{gawk --version}.
24608 authoritative if it conflicts with this @value{DOCUMENT}.
24769 Use @code{"-"} instead of @code{"/dev/stdin"} with @command{mawk}.
24777 The ability to delete all of an array at once with @samp{delete @var{array}}
24781 The ability for @code{RS} to be a regexp
24789 The next version of @command{mawk} will support @code{nextfile}.
24797 and links them with a library of functions that provides the core
24855 All of these features can be turned off by invoking @command{gawk} with the
24856 @option{--traditional} option or with the @option{--posix} option.
24858 If @command{gawk} is compiled for debugging with @samp{-DDEBUG}, then there
24908 If that's not possible, continue with the rest of the steps in this list.
24932 An HTML version, suitable for reading with a WWW browser, is
24941 @cite{GNU Coding Standards}, with minor exceptions. The code is formatted
24955 line above the line with the name and arguments of the function.
25023 Along with your new code, please supply new sections and/or chapters
25036 the original @command{gawk} source tree with your version.
25051 Include an entry for the @file{ChangeLog} file with your submission.
25058 isn't possible for me to do that with a minimum of extra work, then I
25086 with the rest of @command{gawk} and the other ports. Avoid gratuitous
25094 with the GPL
25098 A number of the files that come with @command{gawk} are maintained by other
25133 separate subdirectory, with a name that is the same as, or reminiscent
25159 into @command{gawk} and have them coexist happily with other
25182 Beginning with @command{gawk} 3.1, it is possible to add new built-in
25188 Experience with programming in
25192 are very much subject to change in the next @command{gawk} release.
25194 upon the next release.
25219 use when writing extensions. The next @value{SECTION}
25288 This is usually a value created with @code{tmp_string} (see below).
25330 This macro releases the memory associated with a @code{NODE}
25331 allocated with @code{tmp_string} or @code{tmp_number}.
25376 An argument that is supposed to be an array needs to be handled with
25482 array with the appropriate information:
25531 with @code{strftime}
25632 with @code{get_argument}. Note that the first argument is
25639 The result of @code{force_string} has to be freed with @code{free_temp}:
25687 "stat: called with incorrect number of arguments (%d), should be 2",
25707 "stat: called with %d arguments, should be 2",
25883 : thoroughly updated manual. One of the sections deals with planned future
25934 Along with @code{FIELDWIDTHS}, this would speed up the processing of
25961 source code easier to work with:
26239 to work with, and so on.
26298 tell @command{awk} what to do with the data. You do this by describing
26344 Signed values may be negative or positive, with the range of values just
26378 numbers go from 0 to 9, and then ``roll over'' into the next
26404 for C. This work culminated in 1989, with the production of the ANSI
26406 Where it makes sense, POSIX @command{awk} is compatible with 1990 ISO C.
26410 with this standard.
26474 @code{CONVFMT}'s default value is @code{"%.6g"}, which yields a value with
26568 The regexp metacharacters @samp{^} and @samp{$}, which force the match
26630 @command{awk} lets you work with floating-point numbers and strings.
26631 @command{gawk} lets you manipulate bit values with the built-in
26647 generally upwardly compatible with the Bourne shell.
26731 A subordinate program with which two-way communication is possible.
26747 producing a new string. For example, the string @samp{foo} concatenated with
26775 Such areas are marked in this @value{DOCUMENT} with
26822 with library functions available for converting these values into
26838 separated by whitespace (or by a separator regexp that you can
26923 @code{A}--@code{F}, with @samp{A}
26954 some part of the regexp. Interval expressions were not traditionally available
26982 @code{next},
27021 regexp describes the contents of the string, it is said to @dfn{match} it.
27024 Characters used within a regexp that do not stand for themselves.
27029 A string with no characters in it. It is represented explicitly in
27030 @command{awk} programs by placing two double quote characters next to
27032 occurrences of the field separator appear next to each other.
27084 If this isn't clear, refer to the entry for ``recursion.''
27098 Short for @dfn{regular expression}. A regexp is a pattern that denotes a
27099 set of strings, possibly an infinite set. For example, the regexp
27100 @samp{R.*xp} matches any string starting with the letter @samp{R}
27101 and ending with the letters @samp{xp}. In @command{awk}, regexps are
27106 See ``regexp.''
27175 or more at a time. This is in contrast with batch programs, which may
27177 anything, as well as with interactive programs which require input from the
27182 string}. Constant strings are written with double quotes in the
27270 We protect your rights with two steps: (1) copyright the software, and
27306 either verbatim or with modifications and/or translated into another
27324 along with the Program.
27375 with the Program (or with a work based on the Program) on a volume of
27386 Accompany it with the complete corresponding machine-readable
27391 Accompany it with a written offer, valid for at least three
27399 Accompany it with the information you received as to the offer
27402 received the program in object code or executable form with such
27403 an offer, in accord with Subsection b above.)
27413 form) with the major components (compiler, kernel, and so on) of the
27421 compelled to copy the source along with the object code.
27583 along with this program; if not, write to the Free Software
27594 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
27624 consider it more useful to permit linking proprietary applications with the
27651 with or without modifying it, either commercially or noncommercially.
27663 program should come with manuals providing the same freedoms that the
27683 Document or a portion of it, either copied verbatim, or with
27687 of the Document that deals exclusively with the relationship of the
27693 connection with the subject or with related matters, or of legal,
27713 straightforwardly with generic text editors or (for images composed of
27751 The Document may include Warranty Disclaimers next to the notice which
27784 the full title with all words of the title equally prominent and
27786 Copying with changes limited to the covers, as long as they preserve
27797 copy along with each Opaque copy, or state in or with each Opaque copy
27810 them a chance to provide you with an updated version of the Document.
27817 the Modified Version under precisely this License, with the Modified
27833 Version, together with at least five of the principal authors of the
27895 to conflict in title with any Invariant Section.
27931 You may combine the Document with other documents released under this
27939 multiple identical Invariant Sections may be replaced with a single
27940 copy. If there are multiple Invariant Sections with the same name but
27958 License in the various documents with a single copy that is included in
27970 A compilation of the Document or its derivatives with other separate
27992 Replacing Invariant Sections with translations requires special
28051 with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
28058 replace the ``with...Texts.'' line with this:
28062 with the Invariant Sections being @var{list their titles}, with
28063 the Front-Cover Texts being @var{list}, and with the Back-Cover Texts
28113 Use "non-" only with language names or acronyms, or the words bug and option
28137 "on", "that", "the", "to", "with", and "without",
28148 When using @strong, use "Note:" or "Caution:" with colons and
28150 with @quotation ... @end quotation.
28152 But exercise taste with this rule.
28153 Don't show the awk command with a program in quotes when it's
28193 Enhance FIELDWIDTHS with some way to indicate "the rest of the record".
28235 % - Sorting Array Values and Indices with gawk