author Arnold D. Robbins <arnold@skeeve.com> 2010-07-02 15:46:31 +0300
committer Arnold D. Robbins <arnold@skeeve.com> 2010-07-02 15:46:31 +0300
commit 3711eedc1b995eb1926c9ffb902d5d796cacf8d0
tree 5642fdee11499774e0b7401f195931cd3a143d18 /gawk.1
parent ec6415f1ba061b2fb78808b7dba3246745a15398
Now at 2.02.
Diffstat (limited to 'gawk.1')
-rw-r--r-- gawk.1 1181
1 files changed, 1181 insertions, 0 deletions
diff --git a/gawk.1 b/gawk.1
new file mode 100644
index 00000000..67a76c3b
--- /dev/null
+++ b/gawk.1
@@ -0,0 +1,1181 @@
+.TH GAWK 1 "Free Software Foundation"
+.SH NAME
+gawk \- pattern scanning and processing language
+.SH SYNOPSIS
+.B gawk
+.ig
+[
+.B \-d
+] [
+.B \-D
+] [
+.B \-i
+] [
+.B \-v
+]
+..
+[
+.BI \-F\^ fs
+]
+.B \-f
+.I program-file
+[
+.B \-f
+.I program-file
+\&.\^.\^. ] [
+.B \-\^\-
+] file .\^.\^.
+.br
+.B gawk
+.ig
+[
+.B \-d
+] [
+.B \-D
+] [
+.B \-i
+] [
+.B \-v
+]
+..
+[
+.BI \-F\^ fs
+] [
+.B \-\^\-
+]
+.I program-text
+file .\^.\^.
+.SH DESCRIPTION
+.I Gawk
+is the GNU Project's implementation of the AWK programming language.
+It conforms to the definition and description of the language in
+.IR "The AWK Programming Language" ,
+by Aho, Kernighan, and Weinberger,
+with the additional features defined in the System V Release 4 version
+of \s-1UNIX\s+1
+.IR awk .
+.PP
+The command line consists of options to
+.I gawk
+itself, the AWK program text (if not supplied via the
+.B \-f
+option), and values to be made
+available in the
+.B ARGC
+and
+.B ARGV
+pre-defined AWK variables.
+.PP
+The options that
+.I gawk
+accepts are:
+.TP
+.BI \-F fs
+Use
+.I fs
+for the input field separator (the value of the
+.B FS
+predefined
+variable). For compatibility with \s-1UNIX\s+1
+.IR awk ,
+if
+.I fs
+is ``t'', then
+.B FS
+will be set to the tab character.
+.TP
+.BI \-f " program-file"
+Read the AWK program source from the file
+.IR program-file ,
+instead of from the first command line argument.
+.TP
+.B \-\^\-
+Signal the end of options. This is useful to allow further arguments to the
+AWK program itself to start with a ``\-''.
+This is mainly for consistency with the argument parsing convention used
+by most other System V programs.
+.PP
+Any other options are flagged as illegal, but are otherwise ignored.
+(However, see the
+.B "GNU EXTENSIONS"
+section, below.)
+.PP
+An AWK program consists of a sequence of pattern-action statements
+and optional function definitions.
+.RS
+.PP
+\fIpattern\fB { \fIaction statements\fB }\fR
+.br
+\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements\fB }\fR
+.RE
+.PP
+.I Gawk
+first reads the program source from the
+.IR program-file (s)
+if specified, or from the first non-option argument on the command line.
+The
+.B \-f
+option may be used multiple times on the command line.
+.I Gawk
+will read the program text as if all the
+.IR program-file s
+had been concatenated together. This is useful for building libraries
+of AWK functions, without having to include them in each new AWK
+program that uses them. To use a library function in a file from a
+program typed in on the command line, specify
+.B /dev/tty
+as one of the
+.IR program-file s,
+type your program, and end it with a
+.B ^D
+(control-d).
+.PP
+.I Gawk
+compiles the program into an internal form,
+and then proceeds to read
+each file named in the
+.B ARGV
+array.
+If there are no files named on the command line,
+.I gawk
+reads the standard input.
+.PP
+If a ``file'' named on the command line has the form
+.IB var = val
+it is treated as a variable assignment. The variable
+.I var
+will be assigned the value
+.IR val .
+This is most useful for dynamically assigning values to the variables
+AWK uses to control how input is broken into fields and records. It
+is also useful for controlling state if multiple passes are needed over
+a single data file.
+.PP
+For each line in the input,
+.I gawk
+tests to see if it matches any
+.I pattern
+in the AWK program.
+For each pattern that the line matches, the associated
+.I action
+is executed.
+.SH VARIABLES AND FIELDS
+AWK variables are dynamic; they come into existence when they are
+first used. Their values are either floating-point numbers or strings,
+depending upon how they are used. AWK also has single dimension
+arrays; multiply dimensioned arrays may be simulated.
+There are several pre-defined variables that AWK sets as a program
+runs; these will be described as needed and summarized below.
+.PP
+As each input line is read,
+.I gawk
+splits the line into
+.IR fields ,
+using the value of the
+.B FS
+variable as the field separator.
+If
+.B FS
+is a single character, fields are separated by that character.
+Otherwise,
+.B FS
+is expected to be a full regular expression.
+In the special case that
+.B FS
+is a single blank, fields are separated
+by runs of blanks and/or tabs.
+.PP
+Each field in the input line may be referenced by its position,
+.BR $1 ,
+.BR $2 ,
+and so on.
+.B $0
+is the whole line. The value of a field may be assigned to as well.
+Fields need not be referenced by constants:
+.RS
+.PP
+.ft B
+n = 5
+.br
+print $n
+.ft R
+.RE
+.PP
+prints the fifth field in the input line.
+The variable
+.B NF
+is set to the total number of fields in the input line.
+.PP
+References to non-existent fields (i.e., fields after
+.BR $NF )
+produce the null string. However, assigning to a non-existent field
+(e.g.,
+.BR "$(NF+2) = 5" )
+will increase the value of
+.BR NF ,
+create any intervening fields with the null string as their value, and
+cause the value of
+.B $0
+to be recomputed, with the fields being separated by the value of
+.BR OFS .
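+.PP
+A minimal sketch of this behavior follows, assuming an input line of
+.B "a b c"
+(the input and the choice of ``:'' for
+.B OFS
+are illustrative only):
+.RS
+.PP
+.ft B
+.nf
+BEGIN { OFS = ":" }
+{ $(NF+2) = 5 ; print NF ; print $0 }
+.fi
+.ft R
+.RE
+.PP
+Under those assumptions, this should print
+.B 5
+followed by
+.BR a:b:c::5 .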
+.SS Built-in Variables
+.PP
+AWK's built-in variables are:
+.PP
+.RS
+.TP \l'\fBFILENAME\fR'
+.B ARGC
+the number of command line arguments (does not include options to
+.IR gawk ,
+or the program source).
+.TP \l'\fBFILENAME\fR'
+.B ARGV
+array of command line arguments. The array is indexed from
+0 to
+.B ARGC
+\- 1.
+Dynamically changing the contents of
+.B ARGV
+can control the files used for data.
+.TP \l'\fBFILENAME\fR'
+.B ENVIRON
+An array containing the values of the current environment.
+The array is indexed by the environment variables, each element being
+the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be
+.BR /u/arnold ).
+Changing this array does not affect the environment seen by programs which
+.I gawk
+spawns via redirection or the
+.B system
+function.
+.TP \l'\fBFILENAME\fR'
+.B FILENAME
+the name of the current input file.
+If no files are specified on the command line, the value of
+.B FILENAME
+is ``\-''.
+.TP \l'\fBFILENAME\fR'
+.B FNR
+the input record number in the current input file.
+.TP \l'\fBFILENAME\fR'
+.B FS
+the input field separator, a blank by default.
+.TP \l'\fBFILENAME\fR'
+.B NF
+the number of fields in the current input record.
+.TP \l'\fBFILENAME\fR'
+.B NR
+the total number of input records seen so far.
+.TP \l'\fBFILENAME\fR'
+.B OFMT
+the output format for numbers,
+.B %.6g
+by default.
+.TP \l'\fBFILENAME\fR'
+.B OFS
+the output field separator, a blank by default.
+.TP \l'\fBFILENAME\fR'
+.B ORS
+the output record separator, by default a newline.
+.TP \l'\fBFILENAME\fR'
+.B RS
+the input record separator, by default a newline.
+.B RS
+is exceptional in that only the first character of its string
+value is used for separating records. If
+.B RS
+is set to the null string, then records are separated by
+blank lines.
+When
+.B RS
+is set to the null string, then the newline character always acts as
+a field separator, in addition to whatever value
+.B FS
+may have.
+.TP \l'\fBFILENAME\fR'
+.B RSTART
+the index of the first character matched by
+.BR match() ;
+0 if no match.
+.TP \l'\fBFILENAME\fR'
+.B RLENGTH
+the length of the string matched by
+.BR match() ;
+\-1 if no match.
+.TP \l'\fBFILENAME\fR'
+.B SUBSEP
+the character used to separate multiple subscripts in array
+elements, by default \fB"\e034"\fR.
+.RE
+.SS Arrays
+.PP
+Arrays are subscripted with an expression between square brackets
+.RB ( [ " and " ] ).
+If the expression is an expression list
+.RI ( expr ", " expr " ...)"
+then the array subscript is a string consisting of the
+concatenation of the (string) value of each expression,
+separated by the value of the
+.B SUBSEP
+variable.
+This facility is used to simulate multiply dimensioned
+arrays. For example:
+.PP
+.RS
+.ft B
+i = "A" ;\^ j = "B" ;\^ k = "C"
+.br
+x[i,j,k] = "hello, world\en"
+.ft R
+.RE
+.PP
+assigns the string \fB"hello, world\en"\fR to the element of the array
+.B x
+which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in AWK
+are associative, i.e. indexed by string values.
+.PP
+The special operator
+.B in
+may be used in an
+.B if
+or
+.B while
+statement to see if an array has an index consisting of a particular
+value.
+.PP
+.RS
+.ft B
+.nf
+if (val in array)
+ print array[val]
+.fi
+.ft
+.RE
+.PP
+If the array has multiple subscripts, use
+.BR "(i, j) in array" .
+.PP
+The
+.B in
+construct may also be used in a
+.B for
+loop to iterate over all the elements of an array.
+.PP
+An element may be deleted from an array using the
+.B delete
+statement.
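+.PP
+As a rough sketch, the following fragment stores one element under a
+two-part subscript, recovers the parts by splitting on
+.BR SUBSEP ,
+and then deletes the element.
+The names
+.BR x ,
+.BR key ,
+and
+.B part
+are illustrative only.
+.RS
+.PP
+.ft B
+.nf
+BEGIN {
+    x["A","B"] = 1
+    for (key in x) {
+        split(key, part, SUBSEP)
+        print part[1], part[2]
+    }
+    if (("A","B") in x)
+        delete x["A","B"]
+}
+.fi
+.ft R
+.RE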
+.SS Variable Typing
+.PP
+Variables and fields
+may be (floating point) numbers, or strings, or both. How the
+value of a variable is interpreted depends upon its context. If used in
+a numeric expression, it will be treated as a number; if used as a string
+it will be treated as a string.
+.PP
+To force a variable to be treated as a number, add 0 to it; to force it
+to be treated as a string, concatenate it with the null string.
+.PP
+The AWK language defines comparisons as being done numerically if
+possible, otherwise one or both operands are converted to strings and
+a string comparison is performed.
+.PP
+Uninitialized variables have the numeric value 0 and the string value ""
+(the null, or empty, string).
+.SH PATTERNS AND ACTIONS
+AWK is a line-oriented language. The pattern comes first, and then the
+action. Action statements are enclosed in
+.B {
+and
+.BR } .
+Either the pattern may be missing, or the action may be missing, but,
+of course, not both. If the pattern is missing, the action will be
+executed for every single line of input.
+A missing action is equivalent to
+.RS
+.PP
+.B "{ print }"
+.RE
+.PP
+which prints the entire line.
+.PP
+Comments begin with the ``#'' character, and continue until the
+end of the line.
+Blank lines may be used to separate statements.
+Normally, a statement ends with a newline; however, this is not the
+case for lines ending in
+a ``,'', ``{'', ``?'', ``:'', ``&&'', or ``||''.
+Lines ending in
+.B do
+or
+.B else
+also have their statements automatically continued on the following line.
+In other cases, a line can be continued by ending it with a ``\e'',
+in which case the newline will be ignored.
+.PP
+Multiple statements may
+be put on one line by separating them with a ``;''.
+This applies to both the statements within the action part of a
+pattern-action pair (the usual case),
+and to the pattern-action statements themselves.
+.SS Patterns
+AWK patterns may be one of the following:
+.PP
+.RS
+.nf
+.B BEGIN
+.B END
+.BI / "regular expression" /
+.I "relational expression"
+.IB pattern " && " pattern
+.IB pattern " || " pattern
+.IB pattern " ? " pattern " : " pattern
+.BI ( pattern )
+.BI ! " pattern"
+.IB pattern1 ", " pattern2
+.fi
+.RE
+.PP
+.B BEGIN
+and
+.B END
+are two special kinds of patterns which are not tested against
+the input.
+The action parts of all
+.B BEGIN
+patterns are merged as if all the statements had
+been written in a single
+.B BEGIN
+block. They are executed before any
+of the input is read. Similarly, all the
+.B END
+blocks are merged,
+and executed when all the input is exhausted (or when an
+.B exit
+statement is executed).
+.B BEGIN
+and
+.B END
+patterns cannot be combined with other patterns in pattern expressions.
+.B BEGIN
+and
+.B END
+patterns cannot have missing action parts.
+.PP
+For
+.BI / "regular expression" /
+patterns, the associated statement is executed for each input line that matches
+the regular expression.
+Regular expressions are the same as those in
+.IR egrep (1),
+and are summarized below.
+.PP
+A
+.I "relational expression"
+may use any of the operators defined below in the section on actions.
+These generally test whether certain fields match certain regular expressions.
+.PP
+The
+.BR && ,
+.BR || ,
+and
+.B !
+operators are logical AND, logical OR, and logical NOT, respectively, as in C.
+They do short-circuit evaluation, also as in C, and are used for combining
+more primitive pattern expressions. As in most languages, parentheses
+may be used to change the order of evaluation.
+.PP
+The
+.B ?\^:
+operator is like the same operator in C. If the first pattern is true
+then the pattern used for testing is the second pattern, otherwise it is
+the third. Only one of the second and third patterns is evaluated.
+.PP
+The
+.IB pattern1 ", " pattern2
+form of an expression is called a range pattern.
+It matches all input lines starting with a line that matches
+.IR pattern1 ,
+and continuing until a line that matches
+.IR pattern2 ,
+inclusive. It does not combine with any other sort of pattern expression.
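+.PP
+As a rough illustration, assuming input in which some lines begin with the
+hypothetical markers
+.B START
+and
+.BR STOP ,
+a range pattern might be used like this:
+.RS
+.PP
+.ft B
+.nf
+/^START/, /^STOP/ { print NR ": " $0 }
+.fi
+.ft R
+.RE
+.PP
+This sketch prints, with its record number, each line from one matching
+.B /^START/
+through the next one matching
+.BR /^STOP/ ,
+inclusive.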
+.SS Regular Expressions
+Regular expressions are the extended kind found in
+.IR egrep .
+They are composed of characters as follows:
+.RS
+.TP \l'[^abc...]'
+.I c
+matches the non-metacharacter
+.IR c .
+.TP \l'[^abc...]'
+.I \ec
+matches the literal character
+.IR c .
+.TP \l'[^abc...]'
+.B .
+matches any character except newline.
+.TP \l'[^abc...]'
+.B ^
+matches the beginning of a line or a string.
+.TP \l'[^abc...]'
+.B $
+matches the end of a line or a string.
+.TP \l'[^abc...]'
+.BI [ abc... ]
+character class, matches any of the characters
+.IR abc... .
+.TP \l'[^abc...]'
+.BI [^ abc... ]
+negated character class, matches any character except
+.I abc...
+and newline.
+.TP \l'[^abc...]'
+.IB r1 | r2
+alternation: matches either
+.I r1
+or
+.IR r2 .
+.TP \l'[^abc...]'
+.I r1r2
+concatenation: matches
+.IR r1 ,
+and then
+.IR r2 .
+.TP \l'[^abc...]'
+.IB r +
+matches one or more
+.IR r 's.
+.TP \l'[^abc...]'
+.IB r *
+matches zero or more
+.IR r 's.
+.TP \l'[^abc...]'
+.IB r ?
+matches zero or one
+.IR r 's.
+.TP \l'[^abc...]'
+.BI ( r )
+grouping: matches
+.IR r .
+.RE
+.SS Actions
+Action statements are enclosed in braces,
+.B {
+and
+.BR } .
+Action statements consist of the usual assignment, conditional, and looping
+statements found in most languages. The operators, control statements,
+and input/output statements
+available are patterned after those in C.
+.PP
+The operators in AWK, in order of increasing precedence, are
+.PP
+.RS
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B "= += \-= *= /= %= ^="
+Assignment. Both absolute assignment
+.BI ( var " = " value )
+and operator-assignment (the other forms) are supported.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B ?:
+The C conditional expression. This has the form
+.IB expr1 " ? " expr2 " : " expr3\c
+\&. If
+.I expr1
+is true, the value of the expression is
+.IR expr2 ,
+otherwise it is
+.IR expr3 .
+Only one of
+.I expr2
+and
+.I expr3
+is evaluated.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B ||
+logical OR.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B &&
+logical AND.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B "~ !~"
+regular expression match, negated match.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B "< <= > >= != =="
+the regular relational operators.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.I blank
+string concatenation.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B "+ \-"
+addition and subtraction.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B "* / %"
+multiplication, division, and modulus.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B "+ \- !"
+unary plus, unary minus, and logical negation.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B ^
+exponentiation (\fB**\fR may also be used, and \fB**=\fR for
+the assignment operator).
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B "++ \-\^\-"
+increment and decrement, both prefix and postfix.
+.TP \l'\fB= += \-= *= /= %= ^=\fR'
+.B $
+field reference.
+.RE
+.PP
+The control statements are
+as follows:
+.PP
+.RS
+.nf
+\fBif (\fIcondition\fB) \fIstatement\fR [ \fBelse\fI statement \fR]
+\fBwhile (\fIcondition\fB) \fIstatement \fR
+\fBdo \fIstatement \fBwhile (\fIcondition\fB)\fR
+\fBfor (\fIexpr1\fB; \fIexpr2\fB; \fIexpr3\fB) \fIstatement\fR
+\fBfor (\fIvar \fBin\fI array\fB) \fIstatement\fR
+\fBbreak\fR
+\fBcontinue\fR
+\fBdelete \fIarray\^\fB[\^\fIindex\^\fB]\fR
+\fBexit\fR [ \fIexpression\fR ]
+\fB{ \fIstatements \fB}\fR
+.fi
+.RE
+.PP
+The input/output statements are as follows:
+.PP
+.RS
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI close( filename )
+close file (or pipe, see below).
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.B getline
+set
+.B $0
+from next input record; set
+.BR NF ,
+.BR NR ,
+.BR FNR .
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI "getline <" file
+set
+.B $0
+from next record of
+.IR file ;
+set
+.BR NF .
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI getline " var"
+set
+.I var
+from next input record; set
+.BR NR ,
+.BR FNR .
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI getline " var" " <" file
+set
+.I var
+from next record of
+.IR file .
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.B next
+Stop processing the current input record. The next input record
+is read and processing starts over with the first pattern in the
+AWK program. If the end of the input data is reached, the
+.B END
+block(s), if any, are executed.
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.B print
+prints the current record.
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI print " expr-list"
+prints expressions.
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI print " expr-list" " >" file
+prints expressions on
+.IR file .
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI printf " fmt, expr-list"
+format and print.
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI printf " fmt, expr-list" " >" file
+format and print on
+.IR file .
+.TP \l'\fBprintf \fIfmt, expr-list\fR'
+.BI system( cmd-line )
+execute the command
+.IR cmd-line ,
+and return the exit status.
+(This may not be available on
+systems besides \s-1UNIX\s+1 and \s-1GNU\s+1.)
+.RE
+.PP
+Other input/output redirections are also allowed. For
+.B print
+and
+.BR printf ,
+.BI >> file
+appends output to the
+.IR file ,
+while
+.BI | " command"
+writes on a pipe.
+In a similar fashion,
+.IB command " | getline"
+pipes into
+.BR getline .
+.B Getline
+returns 1 on success, 0 on end of file, and \-1 on an error.
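+.PP
+For instance, a loop that counts the lines produced by a command
+might be sketched as follows.
+The command
+.B who
+is only an example; testing for a result greater than 0 ends the loop
+on both end of file and error.
+.RS
+.PP
+.ft B
+.nf
+BEGIN {
+    while (("who" | getline) > 0)
+        n++
+    close("who")
+    print n + 0
+}
+.fi
+.ft R
+.RE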
+.PP
+The AWK versions of the
+.B printf
+and
+.B sprintf
+(see below)
+functions accept the following conversion specification formats:
+.RS
+.TP
+.B %c
+An ASCII character.
+.TP
+.B %d
+A decimal number (the integer part).
+.TP
+.B %e
+A floating point number of the form
+.BR [\-]d.ddddddE[+\^\-]dd .
+.TP
+.B %f
+A floating point number of the form
+.BR [\-]ddd.dddddd .
+.TP
+.B %g
+Use
+.B e
+or
+.B f
+conversion, whichever is shorter, with nonsignificant zeros suppressed.
+.TP
+.B %o
+An unsigned octal number (again, an integer).
+.TP
+.B %s
+A character string.
+.TP
+.B %x
+An unsigned hexadecimal number (an integer).
+.TP
+.B %%
+A single
+.B %
+character; no argument is converted.
+.RE
+.PP
+There are optional, additional parameters that may lie between the
+.B %
+and the control letter:
+.RS
+.TP
+.B \-
+The expression should be left-justified within its field.
+.TP
+.I width
+The field should be padded to this width. If \fIwidth\fR has a leading
+zero, then the field will be padded with zeros.
+Otherwise it is padded with blanks.
+.TP
+.BI . prec
+A number indicating the maximum width of strings, or the number of digits
+to the right of the decimal point.
+.RE
+.PP
+The dynamic
+.I width
+and
+.I prec
+capabilities of the C library
+.B printf
+routines are not supported.
+However, they may be simulated by using
+the AWK concatenation operation to build up
+a format specification dynamically.
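+.PP
+One possible sketch of this technique follows; the variable names
+.B width
+and
+.B fmt
+are illustrative only:
+.RS
+.PP
+.ft B
+.nf
+BEGIN {
+    width = 8
+    fmt = "%" width "s|\en"
+    printf fmt, "abc"
+}
+.fi
+.ft R
+.RE
+.PP
+Here the format string is built as \fB"%8s|\en"\fR,
+so the string is right-justified in a field eight characters wide.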
+.PP
+AWK has the following pre-defined arithmetic functions:
+.PP
+.RS
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI atan2( y , " x" )
+returns the arctangent of
+.I y/x
+in radians.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI cos( expr )
+returns the cosine of \fIexpr\fR, which is in radians.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI exp( expr )
+the exponential function.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI int( expr )
+truncates to integer.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI log( expr )
+the natural logarithm function.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.B rand()
+returns a random number between 0 and 1.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI sin( expr )
+returns the sine of \fIexpr\fR, which is in radians.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI sqrt( expr )
+the square root function.
+.TP \l'\fBsrand(\fIexpr\fB)\fR'
+.BI srand( expr )
+use
+.I expr
+as a new seed for the random number generator. If no
+.I expr
+is provided, the time of day will be used.
+The return value is the previous seed for the random
+number generator.
+.RE
+.PP
+AWK has the following pre-defined string functions:
+.PP
+.RS
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+\fBgsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR
+for each substring matching the regular expression
+.I r
+in the string
+.IR t ,
+substitute the string
+.IR s ,
+and return the number of substitutions.
+If
+.I t
+is not supplied, use
+.BR $0 .
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+.BI index( s , " t" )
+returns the index of the string
+.I t
+in the string
+.IR s ,
+or 0 if
+.I t
+is not present.
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+.BI length( s )
+returns the length of the string
+.IR s .
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+.BI match( s , " r" )
+returns the position in
+.I s
+where the regular expression
+.I r
+occurs, or 0 if
+.I r
+is not present, and sets the values of
+.B RSTART
+and
+.BR RLENGTH .
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+\fBsplit(\fIs\fB, \fIa\fB, \fIr\fB)\fR
+splits the string
+.I s
+into the array
+.I a
+on the regular expression
+.IR r ,
+and returns the number of fields. If
+.I r
+is omitted,
+.B FS
+is used instead.
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+.BI sprintf( fmt , " expr-list" )
+prints
+.I expr-list
+according to
+.IR fmt ,
+and returns the resulting string.
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+\fBsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR
+this is just like
+.BR gsub ,
+but only the first matching substring is replaced.
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR'
+\fBsubstr(\fIs\fB, \fIi\fB, \fIn\fB)\fR
+returns the
+.IR n -character
+substring of
+.I s
+starting at
+.IR i .
+If
+.I n
+is omitted, the rest of
+.I s
+is used.
+.RE
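+.PP
+A brief sketch of how some of these functions fit together follows;
+the string and the variable names
+.B s
+and
+.B n
+are illustrative only:
+.RS
+.PP
+.ft B
+.nf
+BEGIN {
+    s = "hello world"
+    n = gsub(/o/, "0", s)
+    print n, s
+    if (match(s, /w.*d/))
+        print RSTART, RLENGTH
+}
+.fi
+.ft R
+.RE
+.PP
+With that string, this should print
+.B "2 hell0 w0rld"
+and then
+.BR "7 5" .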
+.PP
+String constants in AWK are sequences of characters enclosed
+between double quotes (\fB"\fR). Within strings, certain
+.I "escape sequences"
+are recognized, as in C. These are:
+.PP
+.RS
+.TP \l'\fB\e\fIddd\fR'
+.B \eb
+backspace.
+.TP \l'\fB\e\fIddd\fR'
+.B \ef
+form-feed.
+.TP \l'\fB\e\fIddd\fR'
+.B \en
+new line.
+.TP \l'\fB\e\fIddd\fR'
+.B \er
+carriage return.
+.TP \l'\fB\e\fIddd\fR'
+.B \et
+horizontal tab.
+.TP \l'\fB\e\fIddd\fR'
+.B \ev
+vertical tab.
+.TP \l'\fB\e\fIddd\fR'
+.BI \e ddd
+The character represented by the 1-, 2-, or 3-digit sequence of octal
+digits. E.g. "\e033" is the ASCII ESC (escape) character.
+.RE
+.SH FUNCTIONS
+Functions in AWK are defined as follows:
+.PP
+.RS
+\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements \fB}\fR
+.RE
+.PP
+Functions are executed when called from within the action parts of regular
+pattern-action statements. Actual parameters supplied in the function
+call are used to instantiate the formal parameters declared in the function.
+Arrays are passed by reference; other variables are passed by value.
+.PP
+Since functions were not originally part of the AWK language, the provision
+for local variables is rather clumsy: they are declared as extra parameters
+in the parameter list. The convention is to separate local variables from
+real parameters by extra spaces in the parameter list. For example:
+.PP
+.RS
+.ft B
+.nf
+function f(p, q, a, b) { # a & b are local
+ ..... }
+
+/abc/ { ... ; f(1, 2) ; ... }
+.fi
+.ft R
+.RE
+.PP
+The left parenthesis in a function call is required
+to immediately follow the function name,
+without any intervening white space.
+This is to avoid a syntactic ambiguity with the concatenation operator.
+This restriction does not apply to the built-in functions listed above.
+.PP
+Functions may call each other and may be recursive.
+Function parameters used as local variables are initialized
+to the null string and the number zero upon function invocation.
+.PP
+The word
+.B func
+may be used in place of
+.BR function .
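+.PP
+As a small sketch of recursion with a local variable, consider the
+following.
+The function name
+.B fact
+and its parameters are illustrative only, and the example assumes the
+.B return
+statement of the AWK book.
+.RS
+.PP
+.ft B
+.nf
+function fact(n,   r) {    # r is local
+    if (n <= 1)
+        return 1
+    r = n * fact(n \- 1)
+    return r
+}
+
+BEGIN { print fact(5) }
+.fi
+.ft R
+.RE
+.PP
+This should print
+.BR 120 .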
+.SH EXAMPLES
+.nf
+Print and sort the login names of all users:
+
+.ft B
+ BEGIN { FS = ":" }
+ { print $1 | "sort" }
+
+.ft R
+Count lines in a file:
+
+.ft B
+ { nlines++ }
+ END { print nlines }
+
+.ft R
+Precede each line by its number in the file:
+
+.ft B
+ { print FNR, $0 }
+
+.ft R
+Concatenate and line number (a variation on a theme):
+
+.ft B
+ { print NR, $0 }
+.ft R
+.SH SEE ALSO
+.IR "The AWK Programming Language" ,
+Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger,
+Addison-Wesley, 1988. ISBN 0-201-07981-X.
+.SH SYSTEM V RELEASE 4 COMPATIBILITY
+A primary goal for
+.I gawk
+is compatibility with the latest version of \s-1UNIX\s+1
+.IR awk .
+To this end,
+.I gawk
+incorporates the following user visible
+features which are not described in the AWK book,
+but are part of
+.I awk
+in System V Release 4.
+.PP
+When processing arguments,
+.I gawk
+uses the special option ``\fB\-\^\-\fP'' to signal the end of
+options, and warns about, but otherwise ignores, undefined options.
+.PP
+The AWK book does not define the return value of
+.BR srand() .
+The System V Release 4 version of \s-1UNIX\s+1
+.I awk
+has it return the seed it was using, to allow keeping track
+of random number sequences. Therefore
+.B srand()
+in
+.I gawk
+also returns its current seed.
+.PP
+The use of multiple
+.B \-f
+options is a new feature, as is the
+.B ENVIRON
+array.
+.SH GNU EXTENSIONS
+.I Gawk
+has some extensions to System V
+.IR awk .
+They are described in this section.
+All features described in this section may change at some time in
+the future, or may go away entirely. They can be disabled either by
+compiling
+.I gawk
+with
+.BR \-DSTRICT ,
+or by invoking
+.I gawk
+with the name
+.IR awk .
+You should not write programs that depend upon them.
+.PP
+The environment variable
+.B AWKPATH
+specifies a search path to use when finding source files named with
+the
+.B \-f
+option. If this variable does not exist, the default path is
+\fB".:/usr/lib/awk:/usr/local/lib/awk"\fR.
+If a file name given to the
+.B \-f
+option contains a ``/'' character, no path search is performed.
+.PP
+Two new relational operators are defined,
+.B ~~
+and
+.BR !~~ .
+These perform case-independent regular expression match and no-match
+operations, respectively.
+.PP
+The AWK book does not define the return value of the
+.B close
+function.
+.IR Gawk\^ 's
+.B close
+returns the value from
+.IR fclose (3),
+or
+.IR pclose (3),
+when closing a file or pipe, respectively.
+.PP
+.I Gawk
+accepts the following additional options:
+.ig
+.TP
+.B \-D
+Turn on general debugging and turn on
+.IR yacc (1)
+or
+.IR bison (1)
+debugging output during program parsing.
+This option should only be of interest to the
+.I gawk
+maintainers, and may not even be compiled into
+.IR gawk .
+.TP
+.B \-d
+Turn on general debugging and print the
+.I gawk
+internal tree as the program is executed.
+This option should only be of interest to the
+.I gawk
+maintainers, and may not even be compiled into
+.IR gawk .
+..
+.TP
+.B \-i
+Ignore case when doing regular expression operations.
+This causes
+.B ~
+and
+.B !~
+to behave like the new operators
+.B ~~
+and
+.BR !~~ ,
+described above.
+.TP
+.B \-v
+Print version information for this particular copy of
+.I gawk
+on the error output.
+This is useful mainly for knowing if the current copy of
+.I gawk
+on your system
+is up to date with respect to whatever the Free Software Foundation
+is distributing.
+.SH BUGS
+The
+.B \-F
+option is not necessary given the command line variable assignment feature;
+it remains only for backwards compatibility.
+.SH AUTHORS
+The original version of \s-1UNIX\s+1
+.I awk
+was designed and implemented by Alfred Aho,
+Peter Weinberger, and Brian Kernighan of AT&T Bell Labs. Brian Kernighan
+continues to maintain and enhance it.
+.PP
+Paul Rubin and Jay Fenlason, with John Woods,
+all of the Free Software Foundation, wrote
+.IR gawk ,
+to be compatible with the original version of
+.I awk
+distributed in Seventh Edition \s-1UNIX\s+1.
+David Trueman of Dalhousie University, with contributions
+from Arnold Robbins at Emory University, made
+.I gawk
+compatible with the new version of \s-1UNIX\s+1
+.IR awk .
+.SH ACKNOWLEDGEMENTS
+Brian Kernighan of Bell Labs
+provided valuable assistance during testing and debugging.
+We thank him.