author | Arnold D. Robbins <arnold@skeeve.com> | 2010-07-02 15:46:31 +0300 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2010-07-02 15:46:31 +0300 |
commit | 3711eedc1b995eb1926c9ffb902d5d796cacf8d0 (patch) | |
tree | 5642fdee11499774e0b7401f195931cd3a143d18 /gawk.1 | |
parent | ec6415f1ba061b2fb78808b7dba3246745a15398 (diff) | |
Now at 2.02.
Diffstat (limited to 'gawk.1')
-rw-r--r-- | gawk.1 | 1181 |
1 files changed, 1181 insertions, 0 deletions
@@ -0,0 +1,1181 @@ +.TH GAWK 1 "Free Software Foundation" +.SH NAME +gawk \- pattern scanning and processing language +.SH SYNOPSIS +.B gawk +.ig +[ +.B \-d +] [ +.B \-D +] [ +.B \-i +] [ +.B \-v +] +.. +[ +.BI \-F\^ fs +] +.B \-f +.I program-file +[ +.B \-f +.I program-file +\&.\^.\^. ] [ +.B \-\^\- +] file .\^.\^. +.br +.B gawk +.ig +[ +.B \-d +] [ +.B \-D +] [ +.B \-i +] [ +.B \-v +] +.. +[ +.BI \-F\^ fs +] [ +.B \-\^\- +] +.I program-text +file .\^.\^. +.SH DESCRIPTION +.I Gawk +is the GNU Project's implementation of the AWK programming language. +It conforms to the definition and description of the language in +.IR "The AWK Programming Language" , +by Aho, Kernighan, and Weinberger, +with the additional features defined in the System V Release 4 version +of \s-1UNIX\s+1 +.IR awk . +.PP +The command line consists of options to +.I gawk +itself, the AWK program text (if not supplied via the +.B \-f +option), and values to be made +available in the +.B ARGC +and +.B ARGV +pre-defined AWK variables. +.PP +The options that +.I gawk +accepts are: +.TP +.BI \-F fs +Use +.I fs +for the input field separator (the value of the +.B FS +predefined +variable). For compatibility with \s-1UNIX\s+1 +.IR awk , +if +.I fs +is ``t'', then +.B FS +will be set to the tab character. +.TP +.BI \-f " program-file" +Read the AWK program source from the file +.IR program-file , +instead of from the first command line argument. +.TP +.B \-\^\- +Signal the end of options. This is useful to allow further arguments to the +AWK program itself to start with a ``\-''. +This is mainly for consistency with the argument parsing convention used +by most other System V programs. +.PP +Any other options are flagged as illegal, but are otherwise ignored. +(However, see the +.B "GNU EXTENSIONS" +section, below.) +.PP +An AWK program consists of a sequence of pattern-action statements +and optional function definitions. 
+.RS +.PP +\fIpattern\fB { \fIaction statements\fB }\fR +.br +\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements\fB }\fR +.RE +.PP +.I Gawk +first reads the program source from the +.IR program-file (s) +if specified, or from the first non-option argument on the command line. +The +.B \-f +option may be used multiple times on the command line. +.I Gawk +will read the program text as if all the +.IR program-file s +had been concatenated together. This is useful for building libraries +of AWK functions, without having to include them in each new AWK +program that uses them. To use a library function in a file from a +program typed in on the command line, specify +.B /dev/tty +as one of the +.IR program-file s, +type your program, and end it with a +.B ^D +(control-d). +.PP +.I Gawk +compiles the program into an internal form, +and then proceeds to read +each file named in the +.B ARGV +array. +If there are no files named on the command line, +.I gawk +reads the standard input. +.PP +If a ``file'' named on the command line has the form +.IB var = val +it is treated as a variable assignment. The variable +.I var +will be assigned the value +.IR val . +This is most useful for dynamically assigning values to the variables +AWK uses to control how input is broken into fields and records. It +is also useful for controlling state if multiple passes are needed over +a single data file. +.PP +For each line in the input, +.I gawk +tests to see if it matches any +.I pattern +in the AWK program. +For each pattern that the line matches, the associated +.I action +is executed. +.SH VARIABLES AND FIELDS +AWK variables are dynamic; they come into existence when they are +first used. Their values are either floating-point numbers or strings, +depending upon how they are used. AWK also has single dimension +arrays; multiply dimensioned arrays may be simulated. 
+There are several pre-defined variables that AWK sets as a program +runs; these will be described as needed and summarized below. +.PP +As each input line is read, +.I gawk +splits the line into +.IR fields , +using the value of the +.B FS +variable as the field separator. +If +.B FS +is a single character, fields are separated by that character. +Otherwise, +.B FS +is expected to be a full regular expression. +In the special case that +.B FS +is a single blank, fields are separated +by runs of blanks and/or tabs. +.PP +Each field in the input line may be referenced by its position, +.BR $1 , +.BR $2 , +and so on. +.B $0 +is the whole line. The value of a field may be assigned to as well. +Fields need not be referenced by constants: +.RS +.PP +.ft B +n = 5 +.br +print $n +.ft R +.RE +.PP +prints the fifth field in the input line. +The variable +.B NF +is set to the total number of fields in the input line. +.PP +References to non-existent fields (i.e. fields after +.BR $NF ), +produce the null-string. However, assigning to a non-existent field +(e.g., +.BR "$(NF+2) = 5" ) +will increase the value of +.BR NF , +create any intervening fields with the null string as their value, and +cause the value of +.B $0 +to be recomputed, with the fields being separated by the value of +.BR OFS . +.SS Built-in Variables +.PP +AWK's built-in variables are: +.PP +.RS +.TP \l'\fBFILENAME\fR' +.B ARGC +the number of command line arguments (does not include options to +.IR gawk , +or the program source). +.TP \l'\fBFILENAME\fR' +.B ARGV +array of command line arguments. The array is indexed from +0 to +.B ARGC +\- 1. +Dynamically changing the contents of +.B ARGV +can control the files used for data. +.TP \l'\fBFILENAME\fR' +.B ENVIRON +An array containing the values of the current environment. +The array is indexed by the environment variables, each element being +the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be +.BR /u/arnold ). 
+Changing this array does not affect the environment seen by programs which +.I gawk +spawns via redirection or the +.B system +function. +.TP \l'\fBFILENAME\fR' +.B FILENAME +the name of the current input file. +If no files are specified on the command line, the value of +.B FILENAME +is ``\-''. +.TP \l'\fBFILENAME\fR' +.B FNR +the input record number in the current input file. +.TP \l'\fBFILENAME\fR' +.B FS +the input field separator, a blank by default. +.TP \l'\fBFILENAME\fR' +.B NF +the number of fields in the current input record. +.TP \l'\fBFILENAME\fR' +.B NR +the total number of input records seen so far. +.TP \l'\fBFILENAME\fR' +.B OFMT +the output format for numbers, +.B %.6g +by default. +.TP \l'\fBFILENAME\fR' +.B OFS +the output field separator, a blank by default. +.TP \l'\fBFILENAME\fR' +.B ORS +the output record separator, by default a newline. +.TP \l'\fBFILENAME\fR' +.B RS +the input record separator, by default a newline. +.B RS +is exceptional in that only the first character of its string +value is used for separating records. If +.B RS +is set to the null string, then records are separated by +blank lines. +When +.B RS +is set to the null string, then the newline character always acts as +a field separator, in addition to whatever value +.B FS +may have. +.TP \l'\fBFILENAME\fR' +.B RSTART +the index of the first character matched by +.BR match() ; +0 if no match. +.TP \l'\fBFILENAME\fR' +.B RLENGTH +the length of the string matched by +.BR match() ; +\-1 if no match. +.TP \l'\fBFILENAME\fR' +.B SUBSEP +the character used to separate multiple subscripts in array +elements, by default \fB"\e034"\fR. +.RE +.SS Arrays +.PP +Arrays are subscripted with an expression between square brackets +.RB ( [ " and " ] ). +If the expression is an expression list +.RI ( expr ", " expr " ...)" +then the array subscript is a string consisting of the +concatenation of the (string) value of each expression, +separated by the value of the +.B SUBSEP +variable. 
+This facility is used to simulate multiply dimensioned +arrays. For example: +.PP +.RS +.ft B +i = "A" ;\^ j = "B" ;\^ k = "C" +.br +x[i,j,k] = "hello, world\en" +.ft R +.RE +.PP +assigns the string \fB"hello, world\en"\fR to the element of the array +.B x +which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in AWK +are associative, i.e. indexed by string values. +.PP +The special operator +.B in +may be used in an +.B if +or +.B while +statement to see if an array has an index consisting of a particular +value. +.PP +.RS +.ft B +.nf +if (val in array) + print array[val] +.fi +.ft +.RE +.PP +If the array has multiple subscripts, use +.BR "(i, j) in array" . +.PP +The +.B in +construct may also be used in a +.B for +loop to iterate over all the elements of an array. +.PP +An element may be deleted from an array using the +.B delete +statement. +.SS Variable Typing +.PP +Variables and fields +may be (floating point) numbers, or strings, or both. How the +value of a variable is interpreted depends upon its context. If used in +a numeric expression, it will be treated as a number, if used as a string +it will be treated as a string. +.PP +To force a variable to be treated as a number, add 0 to it; to force it +to be treated as a string, concatenate it with the null string. +.PP +The AWK language defines comparisons as being done numerically if +possible, otherwise one or both operands are converted to strings and +a string comparison is performed. +.PP +Uninitialized variables have the numeric value 0 and the string value "" +(the null, or empty, string). +.SH PATTERNS AND ACTIONS +AWK is a line oriented language. The pattern comes first, and then the +action. Action statements are enclosed in +.B { +and +.BR } . +Either the pattern may be missing, or the action may be missing, but, +of course, not both. If the pattern is missing, the action will be +executed for every single line of input. 
+A missing action is equivalent to +.RS +.PP +.B "{ print }" +.RE +.PP +which prints the entire line. +.PP +Comments begin with the ``#'' character, and continue until the +end of the line. +Blank lines may be used to separate statements. +Normally, a statement ends with a newline, however, this is not the +case for lines ending in +a ``,'', ``{'', ``?'', ``:'', ``&&'', or ``||''. +Lines ending in +.B do +or +.B else +also have their statements automatically continued on the following line. +In other cases, a line can be continued by ending it with a ``\e'', +in which case the newline will be ignored. +.PP +Multiple statements may +be put on one line by separating them with a ``;''. +This applies to both the statements within the action part of a +pattern-action pair (the usual case), +and to the pattern-action statements themselves. +.SS Patterns +AWK patterns may be one of the following: +.PP +.RS +.nf +.B BEGIN +.B END +.BI / "regular expression" / +.I "relational expression" +.IB pattern " && " pattern +.IB pattern " || " pattern +.IB pattern " ? " pattern " : " pattern +.BI ( pattern ) +.BI ! " pattern" +.IB pattern1 ", " pattern2" +.fi +.RE +.PP +.B BEGIN +and +.B END +are two special kinds of patterns which are not tested against +the input. +The action parts of all +.B BEGIN +patterns are merged as if all the statements had +been written in a single +.B BEGIN +block. They are executed before any +of the input is read. Similarly, all the +.B END +blocks are merged, +and executed when all the input is exhausted (or when an +.B exit +statement is executed). +.B BEGIN +and +.B END +patterns cannot be combined with other patterns in pattern expressions. +.B BEGIN +and +.B END +patterns cannot have missing action parts. +.PP +For +.BI / "regular expression" / +patterns, the associated statement is executed for each input line that matches +the regular expression. +Regular expressions are the same as those in +.IR egrep (1), +and are summarized below. 
+.PP +A +.I "relational expression" +may use any of the operators defined below in the section on actions. +These generally test whether certain fields match certain regular expressions. +.PP +The +.BR && , +.BR || , +and +.B ! +operators are logical AND, logical OR, and logical NOT, respectively, as in C. +They do short-circuit evaluation, also as in C, and are used for combining +more primitive pattern expressions. As in most languages, parentheses +may be used to change the order of evaluation. +.PP +The +.B ?\^: +operator is like the same operator in C. If the first pattern is true +then the pattern used for testing is the second pattern, otherwise it is +the third. Only one of the second and third patterns is evaluated. +.PP +The +.IB pattern1 ", " pattern2" +form of an expression is called a range pattern. +It matches all input lines starting with a line that matches +.IR pattern1 , +and continuing until a line that matches +.IR pattern2 , +inclusive. It does not combine with any other sort of pattern expression. +.SS Regular Expressions +Regular expressions are the extended kind found in +.IR egrep . +They are composed of characters as follows: +.RS +.TP \l'[^abc...]' +.I c +matches the non-metacharacter +.IR c . +.TP \l'[^abc...]' +.I \ec +matches the literal character +.IR c . +.TP \l'[^abc...]' +.B . +matches any character except newline. +.TP \l'[^abc...]' +.B ^ +matches the beginning of a line or a string. +.TP \l'[^abc...]' +.B $ +matches the end of a line or a string. +.TP \l'[^abc...]' +.BI [ abc... ] +character class, matches any of the characters +.IR abc... . +.TP \l'[^abc...]' +.BI [^ abc... ] +negated character class, matches any character except +.I abc... +and newline. +.TP \l'[^abc...]' +.IB r1 | r2 +alternation: matches either +.I r1 +or +.IR r2 . +.TP \l'[^abc...]' +.I r1r2 +concatenation: matches +.IR r1 , +and then +.IR r2 . +.TP \l'[^abc...]' +.IB r + +matches one or more +.IR r 's. 
+.TP \l'[^abc...]' +.IB r * +matches zero or more +.IR r 's. +.TP \l'[^abc...]' +.IB r ? +matches zero or one +.IR r 's. +.TP \l'[^abc...]' +.BI ( r ) +grouping: matches +.IR r . +.RE +.SS Actions +Action statements are enclosed in braces, +.B { +and +.BR } . +Action statements consist of the usual assignment, conditional, and looping +statements found in most languages. The operators, control statements, +and input/output statements +available are patterned after those in C. +.PP +The operators in AWK, in order of increasing precedence, are +.PP +.RS +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B "= += \-= *= /= %= ^=" +Assignment. Both absolute assignment +.BI ( var " = " value ) +and operator-assignment (the other forms) are supported. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B ?: +The C conditional expression. This has the form +.IB expr1 " ? " expr2 " : " expr3\c +\&. If +.I expr1 +is true, the value of the expression is +.IR expr2 , +otherwise it is +.IR expr3 . +Only one of +.I expr2 +and +.I expr3 +is evaluated. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B || +logical OR. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B && +logical AND. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B "~ !~" +regular expression match, negated match. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B "< <= > >= != ==" +the regular relational operators. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.I blank +string concatenation. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B "+ \-" +addition and subtraction. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B "* / %" +multiplication, division, and modulus. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B "+ \- !" +unary plus, unary minus, and logical negation. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B ^ +exponentiation (\fB**\fR may also be used, and \fB**=\fR for +the assignment operator). +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B "++ \-\^\-" +increment and decrement, both prefix and postfix. +.TP \l'\fB= += \-= *= /= %= ^=\fR' +.B $ +field reference. 
+.RE +.PP +The control statements are +as follows: +.PP +.RS +.nf +\fBif (\fIcondition\fB) \fIstatement\fR [ \fBelse\fI statement \fR] +\fBwhile (\fIcondition\fB) \fIstatement \fR +\fBdo \fIstatement \fBwhile (\fIcondition\fB)\fR +\fBfor (\fIexpr1\fB; \fIexpr2\fB; \fIexpr3\fB) \fIstatement\fR +\fBfor (\fIvar \fBin\fI array\fB) \fIstatement\fR +\fBbreak\fR +\fBcontinue\fR +\fBdelete \fIarray\^\fB[\^\fIindex\^\fB]\fR +\fBexit\fR [ \fIexpression\fR ] +\fB{ \fIstatements \fB} +.fi +.RE +.PP +The input/output statements are as follows: +.PP +.RS +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI close( filename ) +close file (or pipe, see below). +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.B getline +set +.B $0 +from next input record; set +.BR NF , +.BR NR , +.BR FNR . +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI "getline <" file +set +.B $0 +from next record of +.IR file ; +set +.BR NF . +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI getline " var" +set +.I var +from next input record; set +.BR NF , +.BR FNR . +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI getline " var" " <" file +set +.I var +from next record of +.IR file . +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.B next +Stop processing the current input record. The next input record +is read and processing starts over with the first pattern in the +AWK program. If the end of the input data is reached, the +.B END +block(s), if any, are executed. +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.B print +prints the current record. +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI print " expr-list" +prints expressions. +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI print " expr-list" " >" file +prints expressions on +.IR file . +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI printf " fmt, expr-list" +format and print. +.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI printf " fmt, expr-list" " >" file +format and print on +.IR file . 
+.TP \l'\fBprintf \fIfmt, expr-list\fR' +.BI system( cmd-line ) +execute the command +.IR cmd-line , +and return the exit status. +(This may not be available on +systems besides \s-1UNIX\s+1 and \s-1GNU\s+1.) +.RE +.PP +Other input/output redirections are also allowed. For +.B print +and +.BR printf , +.BI >> file +appends output to the +.IR file , +while +.BI | " command" +writes on a pipe. +In a similar fashion, +.IB command " | getline" +pipes into +.BR getline . +.BR Getline +will return 0 on end of file, and \-1 on an error. +.PP +The AWK versions of the +.B printf +and +.B sprintf +(see below) +functions accept the following conversion specification formats: +.RS +.TP +.B %c +An ASCII character. +.TP +.B %d +A decimal number (the integer part). +.TP +.B %e +A floating point number of the form +.BR [\-]d.ddddddE[+\^\-]dd . +.TP +.B %f +A floating point number of the form +.BR [\-]ddd.dddddd . +.TP +.B %g +Use +.B e +or +.B f +conversion, whichever is shorter, with nonsignificant zeros suppressed. +.TP +.B %o +An unsigned octal number (again, an integer). +.TP +.B %s +A character string. +.TP +.B %x +An unsigned hexadecimal number (an integer). +.TP +.B %% +A single +.B % +character; no argument is converted. +.RE +.PP +There are optional, additional parameters that may lie between the +.B % +and the control letter: +.RS +.TP +.B \- +The expression should be left-justified within its field. +.TP +.I width +The field should be padded to this width. If the number has a leading +zero, then the field will be padded with zeros. +Otherwise it is padded with blanks. +.TP +.BI . prec +A number indicating the maximum width of strings or digits to the right +of the decimal point. +.RE +.PP +The dynamic +.I width +and +.I prec +capabilities of the C library +.B printf +routines are not supported. +However, they may be simulated by using +the AWK concatenation operation to build up +a format specification dynamically. 
+.PP +AWK has the following pre-defined arithmetic functions: +.PP +.RS +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI atan2( y , " x" ) +returns the arctangent of +.I y/x +in radians. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI cos( expr ) +returns the cosine in radians. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI exp( expr ) +the exponential function. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI int( expr ) +truncates to integer. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI log( expr ) +the natural logarithm function. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.B rand() +returns a random number between 0 and 1. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI sin( expr ) +returns the sine in radians. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI sqrt( expr ) +the square root function. +.TP \l'\fBsrand(\fIexpr\fB)\fR' +.BI srand( expr ) +use +.I expr +as a new seed for the random number generator. If no +.I expr +is provided, the time of day will be used. +The return value is the previous seed for the random +number generator. +.RE +.PP +AWK has the following pre-defined string functions: +.PP +.RS +.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +\fBgsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR +for each substring matching the regular expression +.I r +in the string +.IR t , +substitute the string +.IR s , +and return the number of substitutions. +If +.I t +is not supplied, use +.BR $0 . +.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +.BI index( s , " t" ) +returns the index of the string +.I t +in the string +.IR s , +or 0 if +.I t +is not present. +.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +.BI length( s ) +returns the length of the string +.IR s . +.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +.BI match( s , " r" ) +returns the position in +.I s +where the regular expression +.I r +occurs, or 0 if +.I r +is not present, and sets the values of +.B RSTART +and +.BR RLENGTH . 
+.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +\fBsplit(\fIs\fB, \fIa\fB, \fIr\fB)\fR +splits the string +.I s +into the array +.I a +on the regular expression +.IR r , +and returns the number of fields. If +.I r +is omitted, +.B FS +is used instead. +.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +.BI sprintf( fmt , " expr-list" ) +prints +.I expr-list +according to +.IR fmt , +and returns the resulting string. +.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +\fBsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR +this is just like +.BR gsub , +but only the first matching substring is replaced. +.TP \l'\fBsprintf(\fIfmt\fB, \fIexpr-list\fB)\fR' +\fBsubstr(\fIs\fB, \fIi\fB, \fIn\fB)\fR +returns the +.IR n -character +substring of +.I s +starting at +.IR i . +If +.I n +is omitted, the rest of +.I s +is used. +.RE +.PP +String constants in AWK are sequences of characters enclosed +between double quotes (\fB"\fR). Within strings, certain +.I "escape sequences" +are recognized, as in C. These are: +.PP +.RS +.TP \l'\fB\e\fIddd\fR' +.B \eb +backspace. +.TP \l'\fB\e\fIddd\fR' +.B \ef +form-feed. +.TP \l'\fB\e\fIddd\fR' +.B \en +new line. +.TP \l'\fB\e\fIddd\fR' +.B \er +carriage return. +.TP \l'\fB\e\fIddd\fR' +.B \et +horizontal tab. +.TP \l'\fB\e\fIddd\fR' +.B \ev +vertical tab. +.TP \l'\fB\e\fIddd\fR' +.BI \e ddd +The character represented by the 1-, 2-, or 3-digit sequence of octal +digits. E.g. "\e033" is the ASCII ESC (escape) character. +.RE +.SH FUNCTIONS +Functions in AWK are defined as follows: +.PP +.RS +\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements \fB}\fR +.RE +.PP +Functions are executed when called from within the action parts of regular +pattern-action statements. Actual parameters supplied in the function +call are used to instantiate the formal parameters declared in the function. +Arrays are passed by reference, other variables are passed by value. 
+.PP +Since functions were not originally part of the AWK language, the provision +for local variables is rather clumsy: they are declared as extra parameters +in the parameter list. The convention is to separate local variables from +real parameters by extra spaces in the parameter list. For example: +.PP +.RS +.ft B +.nf +function f(p, q, a, b) { # a & b are local + ..... } + +/abc/ { ... ; f(1, 2) ; ... } +.fi +.ft R +.RE +.PP +The left parenthesis in a function call is required +to immediately follow the function name, +without any intervening white space. +This is to avoid a syntactic ambiguity with the concatenation operator. +This restriction does not apply to the built-in functions listed above. +.PP +Functions may call each other and may be recursive. +Function parameters used as local variables are initialized +to the null string and the number zero upon function invocation. +.PP +The word +.B func +may be used in place of +.BR function . +.SH EXAMPLES +.nf +Print and sort the login names of all users: + +.ft B + BEGIN { FS = ":" } + { print $1 | "sort" } + +.ft R +Count lines in a file: + +.ft B + { nlines++ } + END { print nlines } + +.ft R +Precede each line by its number in the file: + +.ft B + { print FNR, $0 } + +.ft R +Concatenate and line number (a variation on a theme): + +.ft B + { print NR, $0 } +.ft R +.SH SEE ALSO +.IR "The AWK Programming Language" , +Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger, +Addison-Wesley, 1988. ISBN 0-201-07981-X. +.SH SYSTEM V RELEASE 4 COMPATIBILITY +A primary goal for +.I gawk +is compatibility with the latest version of \s-1UNIX\s+1 +.IR awk . +To this end, +.I gawk +incorporates the following user visible +features which are not described in the AWK book, +but are part of +.I awk +in System V Release 4. +.PP +When processing arguments, +.I gawk +uses the special option ``\fB\-\^\-\fP'' to signal the end of +arguments, and warns about, but otherwise ignores, undefined options. 
+.PP +The AWK book does not define the return value of +.BR srand() . +The System V Release 4 version of \s-1UNIX\s+1 +.I awk +has it return the seed it was using, to allow keeping track +of random number sequences. Therefore +.B srand() +in +.I gawk +also returns its current seed. +.PP +The use of multiple +.B \-f +options is a new feature, as is the +.B ENVIRON +array. +.SH GNU EXTENSIONS +.I Gawk +has some extensions to System V +.IR awk . +They are described in this section. +All features described in this section may change at some time in +the future, or may go away entirely. They can be disabled either by +compiling +.I gawk +with +.BR \-DSTRICT , +or by invoking +.I gawk +with the name +.IR awk . +You should not write programs that depend upon them. +.PP +The environment variable +.B AWKPATH +specifies a search path to use when finding source files named with +the +.B \-f +option. If this variable does not exist, the default path is +\fB".:/usr/lib/awk:/usr/local/lib/awk"\fR. +If a file name given to the +.B \-f +option contains a ``/'' character, no path search is performed. +.PP +Two new relational operators are defined, +.BR ~~ , +and +.BR !~~ . +These perform case independent regular expression match and no-match +operations, respectively. +.PP +The AWK book does not define the return value of the +.B close +function. +.IR Gawk\^ 's +.B close +returns the value from +.IR fclose (3), +or +.IR pclose (3), +when closing a file or pipe, respectively. +.PP +.I Gawk +accepts the following additional arguments: +.ig +.TP +.B \-D +Turn on general debugging and turn on +.IR yacc (1) +or +.IR bison (1) +debugging output during program parsing. +This option should only be of interest to the +.I gawk +maintainers, and may not even be compiled into +.IR gawk . +.TP +.B \-d +Turn on general debugging and print the +.I gawk +internal tree as the program is executed. 
+This option should only be of interest to the +.I gawk +maintainers, and may not even be compiled into +.IR gawk . +.. +.TP +.B \-i +Ignore case when doing regular expression operations. +This causes +.B ~ +and +.B !~ +to behave like the new operators +.B ~~ +and +.BR !~~ , +described above. +.TP +.B \-v +Print version information for this particular copy of +.I gawk +on the error output. +This is useful mainly for knowing if the current copy of +.I gawk +on your system +is up to date with respect to whatever the Free Software Foundation +is distributing. +.SH BUGS +The +.B \-F +option is not necessary given the command line variable assignment feature; +it remains only for backwards compatibility. +.SH AUTHORS +The original version of \s-1UNIX\s+1 +.I awk +was designed and implemented by Alfred Aho, +Peter Weinberger, and Brian Kernighan of AT&T Bell Labs. Brian Kernighan +continues to maintain and enhance it. +.PP +Paul Rubin and Jay Fenlason, with John Woods, +all of the Free Software Foundation, wrote +.IR gawk , +to be compatible with the original version of +.I awk +distributed in Seventh Edition \s-1UNIX\s+1. +David Trueman of Dalhousie University, with contributions +from Arnold Robbins at Emory University, made +.I gawk +compatible with the new version of \s-1UNIX\s+1 +.IR awk . +.SH ACKNOWLEDGEMENTS +Brian Kernighan of Bell Labs +provided valuable assistance during testing and debugging. +We thank him. |
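
Two behaviors described in the added man page text can be tried out with a short shell sketch: multiple subscripts being joined into one string key with `SUBSEP`, and assignment to a field beyond `$NF` rebuilding `$0` with `OFS`. This is only an illustration, not part of the commit. It assumes a POSIX-compatible `awk` on the `PATH` (not necessarily gawk 2.02), and the function names `subsep_demo` and `field_demo` are made up for the example.

```shell
#!/bin/sh
# Illustrative sketch only; assumes a POSIX-compatible awk is installed.

# subsep_demo (hypothetical name): x[i,j,k] is stored under the single
# string key i SUBSEP j SUBSEP k, which split() can take apart again.
subsep_demo() {
    awk 'BEGIN {
        i = "A"; j = "B"; k = "C"
        x[i, j, k] = "hello, world"
        if ((i, j, k) in x)
            print "found"
        for (key in x) {
            n = split(key, parts, SUBSEP)
            print n, parts[1] parts[2] parts[3]
        }
    }'
}

# field_demo (hypothetical name): assigning to $(NF+2) raises NF, creates
# the intervening empty field, and rebuilds $0 using OFS.
field_demo() {
    printf 'a b\n' | awk 'BEGIN { OFS = "-" } { $(NF+2) = 5; print NF; print $0 }'
}

subsep_demo
field_demo
```

With the input `a b`, `field_demo` should report four fields and a record of `a-b--5`: the empty third field shows up as two adjacent separators. `subsep_demo` should print `found` and then `3 ABC`, since the key splits back into its three components.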