path: root/doc/gawk.1
Diffstat (limited to 'doc/gawk.1')
-rw-r--r-- doc/gawk.1 502
1 files changed, 282 insertions, 220 deletions
diff --git a/doc/gawk.1 b/doc/gawk.1
index 90288db5..34fa7923 100644
--- a/doc/gawk.1
+++ b/doc/gawk.1
@@ -1,6 +1,5 @@
.ds PX \s-1POSIX\s+1
.ds UX \s-1UNIX\s+1
-.ds AN \s-1ANSI\s+1
.ds GN \s-1GNU\s+1
.ds AK \s-1AWK\s+1
.ds EP \fIGAWK: Effective AWK Programming\fP
@@ -14,7 +13,7 @@
. if \w'\(rq' .ds rq "\(rq
. \}
.\}
-.TH GAWK 1 "Dec 07 2012" "Free Software Foundation" "Utility Commands"
+.TH GAWK 1 "Apr 24 2013" "Free Software Foundation" "Utility Commands"
.SH NAME
gawk \- pattern scanning and processing language
.SH SYNOPSIS
@@ -43,7 +42,7 @@ This version in turn is based on the description in
by Aho, Kernighan, and Weinberger.
.I Gawk
provides the additional features found in the current version
-of \*(UX
+of Brian Kernighan's
.I awk
and a number of \*(GN-specific extensions.
.PP
@@ -60,7 +59,7 @@ and
.B ARGV
pre-defined \*(AK variables.
.PP
-When
+When
.I gawk
is invoked with the
.B \-\^\-profile
@@ -107,7 +106,7 @@ next command line argument.
Long options may be abbreviated, as long as the abbreviation
remains unique.
.PP
-Additionally, each long option has a corresponding short
+Additionally, every long option has a corresponding short
option, so that the option's functionality may be used from
within
.B #!
@@ -158,7 +157,7 @@ to the variable
before execution of the program begins.
Such variable values are available to the
.B BEGIN
-block of an \*(AK program.
+rule of an \*(AK program.
.TP
.PD 0
.B \-b
@@ -171,6 +170,7 @@ process strings as multibyte characters.
The
.B "\-\^\-posix"
option overrides this one.
+.bp
.TP
.PD 0
.B \-c
@@ -181,7 +181,7 @@ Run in
.I compatibility
mode. In compatibility mode,
.I gawk
-behaves identically to \*(UX
+behaves identically to Brian Kernighan's
.IR awk ;
none of the \*(GN-specific extensions are recognized.
.\" The use of
@@ -234,7 +234,7 @@ Enable debugging of \*(AK programs.
By default, the debugger reads commands interactively from the terminal.
The optional
.IR file
-argument can be used to specify a file with a list
+argument specifies a file with a list
of commands for the debugger to execute non-interactively.
.TP
.PD 0
@@ -304,8 +304,10 @@ Load an awk source library.
This searches for the library using the
.B AWKPATH
environment variable. If the initial search fails, another attempt will
-be made after appending the ".awk" suffix. The file will be loaded only
-once (i.e. duplicates are eliminated), and the code does not constitute
+be made after appending the
+.B \&.awk
+suffix. The file will be loaded only
+once (i.e., duplicates are eliminated), and the code does not constitute
the main program source.
.TP
.PD 0
@@ -347,7 +349,7 @@ actually invalid are issued. (This is not fully implemented yet.)
Force arbitrary precision arithmetic on numbers. This option has
no effect if
.I gawk
-is not compiled to use the GNU MPFR and MP libraries.
+is not compiled to use the GNU MPFR and MP libraries.
.TP
.PD 0
.B \-n
@@ -415,12 +417,12 @@ elimination for recursive functions. The
maintainer hopes to add additional optimizations over time.
.TP
.PD 0
-\fB\-p\fR[\fIprof_file\fR]
+\fB\-p\fR[\fIprof-file\fR]
.TP
.PD
-\fB\-\^\-profile\fR[\fB=\fIprof_file\fR]
+\fB\-\^\-profile\fR[\fB=\fIprof-file\fR]
Start a profiling session, and send the profiling data to
-.IR prof_file .
+.IR prof-file .
The default is
.BR awkprof.out .
The profile contains execution counts of each statement in the program
@@ -487,7 +489,7 @@ and
.I egrep
consistent with each other.
They are enabled by default, but this option remains for use with
-.BR \-\^-traditional .
+.BR \-\^\-traditional .
.TP
.PD 0
.BI \-S
@@ -500,7 +502,7 @@ in sandbox mode, disabling the
.B system()
function, input redirection with
.BR getline ,
-output redirection with
+output redirection with
.BR print " and " printf ,
and loading dynamic extensions.
Command execution (through pipelines) is also disabled.
@@ -513,7 +515,7 @@ This effectively blocks a script from accessing local resources
.PD
.B \-\^\-lint\-old
Provide warnings about constructs that are
-not portable to the original version of Unix
+not portable to the original version of \*(UX
.IR awk .
.TP
.PD 0
@@ -547,6 +549,10 @@ options are passed on to the \*(AK program in the
.B ARGV
array for processing. This is particularly useful for running \*(AK
programs via the \*(lq#!\*(rq executable interpreter mechanism.
+.PP
+For \*(PX compatibility, the
+.B \-W
+option may be used, followed by the name of a long option.
.SH AWK PROGRAM EXECUTION
.PP
An \*(AK program consists of a sequence of pattern-action statements
@@ -586,13 +592,16 @@ functions with command line programs.
In addition, lines beginning with
.B @include
may be used to include other source files into your program,
-making library use even easier.
+making library use even easier. This is equivalent
+to using the
+.B \-i
+option.
.PP
Lines beginning with
.B @load
may be used to load shared libraries into your program. This is equivalent
to using the
-.B \-l
+.B \-l
option.
.PP
The environment variable
@@ -611,6 +620,17 @@ If a file name given to the
.B \-f
option contains a \*(lq/\*(rq character, no path search is performed.
.PP
+The environment variable
+.B AWKLIBPATH
+specifies a search path to use when finding source files named with
+the
+.B \-l
+option. If this variable does not exist, the default path is
+\fB".:/usr/local/lib/gawk"\fR.
+(The actual directory may vary, depending upon how
+.I gawk
+was built and installed.)
+.PP
.I Gawk
executes \*(AK programs in the following order.
First,
@@ -624,7 +644,7 @@ Then,
.I gawk
executes the code in the
.B BEGIN
-block(s) (if any),
+rule(s) (if any),
and then proceeds to read
each file named in the
.B ARGV
@@ -642,7 +662,7 @@ will be assigned the value
.IR val .
(This happens after any
.B BEGIN
-block(s) have been run.)
+rule(s) have been run.)
Command line variable assignment
is most useful for dynamically assigning values to the variables
\*(AK uses to control how input is broken into fields and records.
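
To illustrate the timing this hunk describes, the sketch below contrasts the -v option with a var=val operand. It assumes a POSIX awk is on PATH (gawk should behave the same way). The variable name `who` and the use of /dev/null as an empty input file are arbitrary choices for the example.

```shell
# -v assigns before the BEGIN rule runs, so BEGIN sees the value.
with_v=$(awk -v who=world 'BEGIN { print "hello, " who }')

# A var=val operand is processed only when awk reaches it in ARGV,
# which happens after BEGIN. BEGIN therefore sees an uninitialized
# variable, while END (run after the empty input) sees the value.
with_operand=$(awk 'BEGIN { printf "[%s]", who } END { printf "[%s]\n", who }' who=world /dev/null)

echo "$with_v"
echo "$with_operand"
```

The first command prints the greeting. The second prints an empty pair of brackets from BEGIN followed by the assigned value from END.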
@@ -673,16 +693,17 @@ For each record in the input,
tests to see if it matches any
.I pattern
in the \*(AK program.
-For each pattern that the record matches, the associated
-.I action
-is executed.
+For each pattern that the record matches,
+.I gawk
+executes the associated
+.IR action .
The patterns are tested in the order they occur in the program.
.PP
Finally, after all the input is exhausted,
.I gawk
executes the code in the
.B END
-block(s) (if any).
+rule(s) (if any).
.SS Command Line Directories
.PP
According to POSIX, files named on the
@@ -710,6 +731,10 @@ first used. Their values are either floating-point numbers or strings,
or both,
depending upon how they are used. \*(AK also has one dimensional
arrays; arrays with multiple dimensions may be simulated.
+.I Gawk
+provides true arrays of arrays; see
+.BR Arrays ,
+below.
Several pre-defined variables are set as a program
runs; these are described as needed and summarized below.
.SS Records
@@ -799,7 +824,7 @@ or
overrides the use of
.BR FPAT .
.PP
-Each field in the input record may be referenced by its position,
+Each field in the input record may be referenced by its position:
.BR $1 ,
.BR $2 ,
and so on.
@@ -821,14 +846,14 @@ The variable
.B NF
is set to the total number of fields in the input record.
.PP
-References to non-existent fields (i.e. fields after
+References to non-existent fields (i.e., fields after
.BR $NF )
produce the null-string. However, assigning to a non-existent field
(e.g.,
.BR "$(NF+2) = 5" )
increases the value of
.BR NF ,
-creates any intervening fields with the null string as their value, and
+creates any intervening fields with the null string as their values, and
causes the value of
.B $0
to be recomputed, with the fields being separated by the value of
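
The effect of assigning to a non-existent field can be sketched as follows. This assumes a POSIX awk on PATH; setting OFS to a dash is an illustrative choice that makes the empty intervening field visible in the rebuilt record.

```shell
# Input has three fields; assigning to $(NF+2) creates $4 (empty) and $5,
# raises NF to 5, and rebuilds $0 with OFS between the fields.
nf_demo=$(echo 'a b c' | awk -v OFS=- '{ $(NF+2) = 5; print NF; print }')
echo "$nf_demo"
```

The doubled dash in the second output line marks the null-string field that was created between the old last field and the new one.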
@@ -891,7 +916,7 @@ The conversion format for numbers, \fB"%.6g"\fR, by default.
An array containing the values of the current environment.
The array is indexed by the environment variables, each element being
the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be
-.BR /home/arnold ).
+\fB"/home/arnold"\fR).
Changing this array does not affect the environment seen by programs which
.I gawk
spawns via redirection or the
@@ -931,7 +956,7 @@ However,
.B FILENAME
is undefined inside the
.B BEGIN
-block
+rule
(unless set by
.BR getline ).
.TP
@@ -958,13 +983,13 @@ The input field separator, a space by default. See
above.
.TP
.B FUNCTAB
-An array whose indices are the names of all the user-defined
+An array whose indices and corresponding values
+are the names of all the user-defined
or extension functions in the program.
.BR NOTE :
-The array values cannot currently be used.
-Also, you may not use the
+You may not use the
.B delete
-statment with the
+statement with the
.B FUNCTAB
array.
.TP
@@ -1063,7 +1088,7 @@ The following elements are guaranteed to be available:
.RS
.TP \w'\fBPROCINFO["version"]\fR'u+1n
\fBPROCINFO["egid"]\fP
-the value of the
+The value of the
.IR getegid (2)
system call.
.TP
@@ -1072,7 +1097,7 @@ The default time format string for
.BR strftime() .
.TP
\fBPROCINFO["euid"]\fP
-the value of the
+The value of the
.IR geteuid (2)
system call.
.TP
@@ -1089,7 +1114,13 @@ is in effect.
.TP
\fBPROCINFO["identifiers"]\fP
A subarray, indexed by the names of all identifiers used in the
-text of the AWK program. For each identifier, the value of the element is one of the following:
+text of the AWK program.
+The values indicate what
+.I gawk
+knows about the identifiers after it has finished parsing the program; they are
+.I not
+updated while the program runs.
+For each identifier, the value of the element is one of the following:
.RS
.TP
\fB"array"\fR
@@ -1110,28 +1141,23 @@ doesn't know yet).
\fB"user"\fR
The identifier is a user-defined function.
.RE
-The values indicate what
-.I gawk
-knows about the identifiers after it has finished parsing the program; they are
-.I not
-updated while the program runs.
.TP
\fBPROCINFO["gid"]\fP
-the value of the
+The value of the
.IR getgid (2)
system call.
.TP
\fBPROCINFO["pgrpid"]\fP
-the process group ID of the current process.
+The process group ID of the current process.
.TP
\fBPROCINFO["pid"]\fP
-the process ID of the current process.
+The process ID of the current process.
.TP
\fBPROCINFO["ppid"]\fP
-the parent process ID of the current process.
+The parent process ID of the current process.
.TP
\fBPROCINFO["uid"]\fP
-the value of the
+The value of the
.IR getuid (2)
system call.
.TP
@@ -1157,11 +1183,11 @@ and
\fB"@unsorted"\fR.
The value can also be the name of any comparison function defined
as follows:
-.PP
-.RS
+.sp
+.in +5m
\fBfunction cmp_func(i1, v1, i2, v2)\fR
-.RE
-.PP
+.in -5m
+.sp
where
.I i1
and
@@ -1176,7 +1202,7 @@ It should return a number less than, equal to, or greater than 0,
depending on how the elements of the array are to be ordered.
.TP
\fBPROCINFO["input", "READ_TIMEOUT"]\fP
-specifies the timeout in milliseconds for reading data from
+The timeout in milliseconds for reading data from
.IR input ,
where
.I input
@@ -1184,22 +1210,38 @@ is a redirection string or a filename. A value of zero or
less than zero means no timeout.
.TP
\fBPROCINFO["mpfr_version"]\fP
-the version of the GNU MPFR library used for arbitrary precision
+The version of the GNU MPFR library used for arbitrary precision
number support in
.IR gawk .
+This entry is not present if MPFR support is not compiled into
+.IR gawk .
.TP
\fBPROCINFO["gmp_version"]\fP
-the version of the GNU MP library used for arbitrary precision
+The version of the GNU MP library used for arbitrary precision
number support in
.IR gawk .
+This entry is not present if MPFR support is not compiled into
+.IR gawk .
.TP
\fBPROCINFO["prec_max"]\fP
-the maximum precision supported by the GNU MPFR library for
+The maximum precision supported by the GNU MPFR library for
arbitrary precision floating-point numbers.
+This entry is not present if MPFR support is not compiled into
+.IR gawk .
.TP
\fBPROCINFO["prec_min"]\fP
-the minimum precision allowed by the GNU MPFR library for
+The minimum precision allowed by the GNU MPFR library for
arbitrary precision floating-point numbers.
+This entry is not present if MPFR support is not compiled into
+.IR gawk .
+.TP
+\fBPROCINFO["api_major"]\fP
+The major version of the extension API.
+This entry is not present if loading dynamic extensions is not available.
+.TP
+\fBPROCINFO["api_minor"]\fP
+The minor version of the extension API.
+This entry is not present if loading dynamic extensions is not available.
.TP
\fBPROCINFO["version"]\fP
the version of
@@ -1248,15 +1290,17 @@ elements, by default \fB"\e034"\fR.
An array whose indices are the names of all currently defined
global variables and arrays in the program. The array may be used
for indirect access to read or write the value of a variable:
-.PP
-.RS
+.sp
.ft B
+.nf
+.in +5m
foo = 5
SYMTAB["foo"] = 4
print foo # prints 4
+.fi
.ft R
-.RE
-.PP
+.in -5m
+.sp
The
.B isarray()
function may be used to test if an element in
@@ -1264,7 +1308,7 @@ function may be used to test if an element in
is an array.
You may not use the
.B delete
-statment with the
+statement with the
.B SYMTAB
array.
.TP
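
A minimal sketch of the indirect access shown in the SYMTAB hunks, wrapped in shell so it can be run directly. It assumes a gawk that provides SYMTAB (4.1 or later) is on PATH. The guard and the shell variable `symtab_out` are illustrative and not part of gawk.

```shell
# Skip the demo when gawk is not installed.
if command -v gawk >/dev/null 2>&1; then
    symtab_out=$(gawk 'BEGIN {
        foo = 5
        SYMTAB["foo"] = 4    # indirect write through the symbol table
        print foo            # the ordinary variable reflects the write
    }')
else
    symtab_out="skipped"
fi
echo "$symtab_out"
```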
@@ -1296,7 +1340,7 @@ x[i, j, k] = "hello, world\en"
assigns the string \fB"hello, world\en"\fR to the element of the array
.B x
which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in \*(AK
-are associative, i.e. indexed by string values.
+are associative, i.e., indexed by string values.
.PP
The special operator
.B in
@@ -1333,6 +1377,7 @@ just by specifying the array name without a subscript.
supports true multidimensional arrays. It does not require that
such arrays be ``rectangular'' as in C or C++.
For example:
+.sp
.RS
.ft B
.nf
@@ -1342,6 +1387,18 @@ a[2][2] = 7
.fi
.ft
.RE
+.PP
+.BR NOTE :
+You may need to tell
+.I gawk
+that an array element is really a subarray in order to use it where
+.I gawk
+expects an array (such as in the second argument to
+.BR split() ).
+You can do this by creating an element in the subarray and then
+deleting it with the
+.B delete
+statement.
.SS Variable Typing And Conversion
.PP
Variables and fields
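
The create-then-delete idiom from the NOTE added in this hunk might look like the sketch below. It assumes a gawk with arrays of arrays (4.0 or later) on PATH. The array name `a`, the subscript "parts", and the shell variable `subarray_out` are hypothetical names chosen for the example.

```shell
# Skip the demo when gawk is not installed.
if command -v gawk >/dev/null 2>&1; then
    subarray_out=$(gawk 'BEGIN {
        a["parts"][1] = ""       # creating an element makes a["parts"] a subarray
        delete a["parts"][1]     # remove it again; the empty subarray remains
        n = split("x:y:z", a["parts"], ":")
        print n, a["parts"][2]
    }')
else
    subarray_out="skipped"
fi
echo "$subarray_out"
```

Without the first two statements, a["parts"] would not yet be known to be an array when split() receives it.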
@@ -1353,6 +1410,9 @@ it will be treated as a string.
To force a variable to be treated as a number, add 0 to it; to force it
to be treated as a string, concatenate it with the null string.
.PP
+Uninitialized variables have the numeric value 0 and the string value ""
+(the null, or empty, string).
+.PP
When a string must be converted to a number, the conversion is accomplished
using
.IR strtod (3).
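
The dual value of an uninitialized variable, as stated in the paragraph this hunk moves, can be checked with a short sketch. It assumes a POSIX awk on PATH; `x` is simply a variable that is never assigned.

```shell
# x is never assigned, so it compares equal to both 0 and "".
uninit=$(awk 'BEGIN {
    is_zero  = (x == 0)     # numeric comparison
    is_empty = (x == "")    # string comparison
    print is_zero, is_empty, length(x), x + 0
}')
echo "$uninit"
```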
@@ -1383,7 +1443,7 @@ has a string value of \fB"12"\fR and not \fB"12.00"\fR.
.BR NOTE :
When operating in POSIX mode (such as with the
.B \-\^\-posix
-command line option),
+option),
beware that locale settings may interfere with the way
decimal numbers are treated: the decimal separator of the numbers you
are feeding to
@@ -1420,9 +1480,6 @@ The basic idea is that
.IR "user input" ,
and only user input, that looks numeric,
should be treated that way.
-.PP
-Uninitialized variables have the numeric value 0 and the string value ""
-(the null, or empty, string).
.SS Octal and Hexadecimal Constants
You may use C-style octal and hexadecimal constants in your AWK
program source code.
@@ -1448,28 +1505,28 @@ A literal backslash.
The \*(lqalert\*(rq character; usually the \s-1ASCII\s+1 \s-1BEL\s+1 character.
.TP
.B \eb
-backspace.
+Backspace.
.TP
.B \ef
-form-feed.
+Form-feed.
.TP
.B \en
-newline.
+Newline.
.TP
.B \er
-carriage return.
+Carriage return.
.TP
.B \et
-horizontal tab.
+Horizontal tab.
.TP
.B \ev
-vertical tab.
+Vertical tab.
.TP
.BI \ex "\^hex digits"
The character represented by the string of hexadecimal digits following
the
.BR \ex .
-As in \*(AN C, all following hexadecimal digits are considered part of
+As in ISO C, all following hexadecimal digits are considered part of
the escape sequence.
(This feature should tell us something about language design by committee.)
E.g., \fB"\ex1B"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
@@ -1568,10 +1625,10 @@ The action parts of all
patterns are merged as if all the statements had
been written in a single
.B BEGIN
-block. They are executed before any
+rule. They are executed before any
of the input is read. Similarly, all the
.B END
-blocks are merged,
+rules are merged,
and executed when all the input is exhausted (or when an
.B exit
statement is executed).
@@ -1594,7 +1651,7 @@ Inside the
.B BEGINFILE
rule, the value of
.B ERRNO
-will be the empty string if the file could be opened successfully.
+will be the empty string if the file was opened successfully.
Otherwise, there is some problem with the file and the code should
use
.B nextfile
@@ -1646,58 +1703,59 @@ Regular expressions are the extended kind found in
They are composed of characters as follows:
.TP "\w'\fB[^\fIabc.\|.\|.\fB]\fR'u+2n"
.I c
-matches the non-metacharacter
+Matches the non-metacharacter
.IR c .
.TP
.I \ec
-matches the literal character
+Matches the literal character
.IR c .
.TP
.B .
-matches any character
+Matches any character
.I including
newline.
.TP
.B ^
-matches the beginning of a string.
+Matches the beginning of a string.
.TP
.B $
-matches the end of a string.
+Matches the end of a string.
.TP
.BI [ abc.\|.\|. ]
-character list, matches any of the characters
+A character list: matches any of the characters
.IR abc.\|.\|. .
+You may include a range of characters by separating them with a dash.
.TP
\fB[^\fIabc.\|.\|.\fB]\fR
-negated character list, matches any character except
+A negated character list: matches any character except
.IR abc.\|.\|. .
.TP
.IB r1 | r2
-alternation: matches either
+Alternation: matches either
.I r1
or
.IR r2 .
.TP
.I r1r2
-concatenation: matches
+Concatenation: matches
.IR r1 ,
and then
.IR r2 .
.TP
.IB r\^ +
-matches one or more
+Matches one or more
.IR r\^ "'s."
.TP
.IB r *
-matches zero or more
+Matches zero or more
.IR r\^ "'s."
.TP
.IB r\^ ?
-matches zero or one
+Matches zero or one
.IR r\^ "'s."
.TP
.BI ( r )
-grouping: matches
+Grouping: matches
.IR r .
.TP
.PD 0
@@ -1728,37 +1786,38 @@ is repeated at least
times.
.TP
.B \ey
-matches the empty string at either the beginning or the
+Matches the empty string at either the beginning or the
end of a word.
.TP
.B \eB
-matches the empty string within a word.
+Matches the empty string within a word.
.TP
.B \e<
-matches the empty string at the beginning of a word.
+Matches the empty string at the beginning of a word.
.TP
.B \e>
-matches the empty string at the end of a word.
+Matches the empty string at the end of a word.
.TP
.B \es
-matches any whitespace character.
+Matches any whitespace character.
.TP
.B \eS
-matches any nonwhitespace character.
+Matches any nonwhitespace character.
.TP
.B \ew
-matches any word-constituent character (letter, digit, or underscore).
+Matches any word-constituent character (letter, digit, or underscore).
.TP
.B \eW
-matches any character that is not word-constituent.
+Matches any character that is not word-constituent.
.TP
.B \e`
-matches the empty string at the beginning of a buffer (string).
+Matches the empty string at the beginning of a buffer (string).
.TP
.B \e'
-matches the empty string at the end of a buffer.
+Matches the empty string at the end of a buffer.
.PP
-The escape sequences that are valid in string constants (see below)
+The escape sequences that are valid in string constants (see
+.BR "String Constants" )
are also valid in regular expressions.
.PP
.I "Character classes"
@@ -1907,7 +1966,7 @@ interprets characters in regular expressions.
No options
In the default case,
.I gawk
-provide all the facilities of
+provides all the facilities of
\*(PX regular expressions and the \*(GN regular expression operators described above.
.TP
.B \-\^\-posix
@@ -1918,7 +1977,7 @@ matches a literal
.BR w ).
.TP
.B \-\^\-traditional
-Traditional Unix
+Traditional \*(UX
.I awk
regular expressions are matched. The \*(GN operators
are not special, and interval expressions are not available.
@@ -1940,7 +1999,7 @@ and input/output statements
available are patterned after those in C.
.SS Operators
.PP
-The operators in \*(AK, in order of decreasing precedence, are
+The operators in \*(AK, in order of decreasing precedence, are:
.PP
.TP "\w'\fB*= /= %= ^=\fR'u+1n"
.BR ( \&.\|.\|. )
@@ -1992,7 +2051,7 @@ Only use one on the right-hand side. The expression
has the same meaning as \fB(($0 ~ /foo/) ~ \fIexp\fB)\fR.
This is usually
.I not
-what was intended.
+what you want.
.TP
.B in
Array membership.
@@ -2068,7 +2127,8 @@ Set
from next input record; set
.BR NF ,
.BR NR ,
-.BR FNR .
+.BR FNR ,
+.BR RT .
.TP
.BI "getline <" file
Set
@@ -2076,20 +2136,23 @@ Set
from next record of
.IR file ;
set
-.BR NF .
+.BR NF ,
+.BR RT .
.TP
.BI getline " var"
Set
.I var
from next input record; set
.BR NR ,
-.BR FNR .
+.BR FNR ,
+.BR RT .
.TP
.BI getline " var" " <" file
Set
.I var
from next record of
-.IR file .
+.IR file ,
+.BR RT .
.TP
\fIcommand\fB | getline \fR[\fIvar\fR]
Run
@@ -2098,7 +2161,8 @@ piping the output either into
.B $0
or
.IR var ,
-as above.
+as above, and
+.BR RT .
.TP
\fIcommand\fB |& getline \fR[\fIvar\fR]
Run
@@ -2108,7 +2172,8 @@ piping the output either into
.B $0
or
.IR var ,
-as above.
+as above, and
+.BR RT .
Co-processes are a
.I gawk
extension.
@@ -2120,9 +2185,12 @@ below.)
.B next
Stop processing the current input record. The next input record
is read and processing starts over with the first pattern in the
-\*(AK program. If the end of the input data is reached, the
+\*(AK program.
+Upon reaching the end of the input data,
+.I gawk
+executes any
.B END
-block(s), if any, are executed.
+rule(s).
.TP
.B "nextfile"
Stop processing the current input file. The next input record read
@@ -2133,33 +2201,32 @@ and
are updated,
.B FNR
is reset to 1, and processing starts over with the first pattern in the
-\*(AK program. If the end of the input data is reached, the
+\*(AK program.
+Upon reaching the end of the input data,
+.I gawk
+executes any
.B END
-block(s), if any, are executed.
+rule(s).
.TP
.B print
Print the current record.
-The output record is terminated with the value of the
-.B ORS
-variable.
+The output record is terminated with the value of
+.BR ORS .
.TP
.BI print " expr-list"
Print expressions.
-Each expression is separated by the value of the
-.B OFS
-variable.
-The output record is terminated with the value of the
-.B ORS
-variable.
+Each expression is separated by the value of
+.BR OFS .
+The output record is terminated with the value of
+.BR ORS .
.TP
.BI print " expr-list" " >" file
Print expressions on
.IR file .
-Each expression is separated by the value of the
-.B OFS
-variable. The output record is terminated with the value of the
-.B ORS
-variable.
+Each expression is separated by the value of
+.BR OFS .
+The output record is terminated with the value of
+.BR ORS .
.TP
.BI printf " fmt, expr-list"
Format and print.
@@ -2207,10 +2274,10 @@ The
command returns 1 on success, 0 on end of file, and \-1 on an error.
Upon an error,
.B ERRNO
-contains a string describing the problem.
+is set to a string describing the problem.
.PP
.BR NOTE :
-Failure in opening a two-way socket will result in a non-fatal error being
+Failure in opening a two-way socket results in a non-fatal error being
returned to the calling function. If using a pipe, co-process, or socket to
.BR getline ,
or from
@@ -2247,7 +2314,7 @@ A decimal number (the integer part).
.TP
.BR %e , " %E"
A floating point number of the form
-.BR [\-]d.dddddde[+\^\-]dd .
+[\fB\-\fP]\fId\fB.\fIdddddd\^\fBe\fR[\fB+\-\fR]\fIdd\fR.
The
.B %E
format uses
@@ -2257,7 +2324,7 @@ instead of
.TP
.BR %f , " %F"
A floating point number of the form
-.BR [\-]ddd.dddddd .
+[\fB\-\fP]\fIddd\fB.\fIdddddd\fR.
If the system library supports it,
.B %F
is available as well. This is like
@@ -2378,9 +2445,9 @@ value to be printed.
.TP
.I width
The field should be padded to this width. The field is normally padded
-with spaces. If the
+with spaces. With the
.B 0
-flag has been used, it is padded with zeroes.
+flag, it is padded with zeroes.
.TP
.BI \&. prec
A number that specifies the precision to use when printing.
@@ -2415,15 +2482,15 @@ The dynamic
.I width
and
.I prec
-capabilities of the \*(AN C
+capabilities of the ISO C
.B printf()
routines are supported.
A
.B *
in place of either the
-.B width
+.I width
or
-.B prec
+.I prec
specifications causes their values to be taken from
the argument list to
.B printf
@@ -2454,6 +2521,9 @@ parent process (usually the shell).
These file names may also be used on the command line to name data files.
The filenames are:
.TP "\w'\fB/dev/stdout\fR'u+1n"
+.B \-
+The standard input.
+.TP
.B /dev/stdin
The standard input.
.TP
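
As a small illustration of the entry added in this hunk, the sketch below names the standard input with a dash operand. It assumes a POSIX awk on PATH; the sample input lines are made up.

```shell
# The "-" operand tells awk to read the standard input as a data file.
dash_out=$(printf 'one\ntwo\n' | awk '{ print NR ": " $0 }' -)
echo "$dash_out"
```

The same operand can be mixed with ordinary file names on the command line, in which case the standard input is read at that position in the argument list.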
@@ -2560,7 +2630,8 @@ Return the sine of
which is in radians.
.TP
.BI sqrt( expr )
-The square root function.
+Return the square root of
+.IR expr .
.TP
\&\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR
Use
@@ -2568,7 +2639,7 @@ Use
as the new seed for the random number generator. If no
.I expr
is provided, use the time of day.
-The return value is the previous seed for the random
+Return the previous seed for the random
number generator.
.SS String Functions
.PP
@@ -2593,7 +2664,7 @@ with sequential
integers starting with 1. If the optional
destination array
.I d
-is specified, then
+is specified,
first duplicate
.I s
into
@@ -2789,11 +2860,11 @@ Element values are the portions of
that matched
.IR r .
The value of
-.I seps[i]
+.BI seps[ i ]
is the separator that appeared in
front of
-.IR a[i+1] .
-If
+.BI a[ i +1]\fR.
+\&\fRIf
.I r
is omitted,
.B FPAT
@@ -2826,33 +2897,33 @@ The arrays
and
.I seps
are cleared first.
-.I seps[i]
+.BI seps[ i ]
is the field separator matched by
.I r
between
-.I a[i]
+.BI a[ i ]
and
-.IR a[i+1] .
-If
+.BI a[ i +1]\fR.
+\&\fRIf
.I r
is a single space, then leading whitespace in
.I s
goes into the extra array element
-.I seps[0]
+.B seps[0]
and trailing whitespace goes into the extra array element
-.IR seps[n] ,
+.BI seps[ n ]\fR,
where
.I n
-is the return value of
-.IR "split(s, a, r, seps)" .
+is the return value of
+.BI split( s ", " a ", " r ", " seps )\fR.
Splitting behaves identically to field splitting, described above.
.TP
.BI sprintf( fmt , " expr-list" )
-Prints
+Print
.I expr-list
according to
.IR fmt ,
-and returns the resulting string.
+and return the resulting string.
.TP
.BI strtonum( str )
Examine
@@ -2863,10 +2934,8 @@ If
begins
with a leading
.BR 0 ,
-.B strtonum()
-assumes that
-.I str
-is an octal number.
+treat it
+as an octal number.
If
.I str
begins
@@ -2874,11 +2943,9 @@ with a leading
.B 0x
or
.BR 0X ,
-.B strtonum()
-assumes that
-.I str
-is a hexadecimal number.
-Otherwise, decimal is assumed.
+treat it
+as a hexadecimal number.
+Otherwise, assume it is a decimal number.
.TP
\fBsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR
Just like
@@ -2991,7 +3058,7 @@ The default format is available in
.BR PROCINFO["strftime"] .
See the specification for the
.B strftime()
-function in \*(AN C for the format conversions that are
+function in ISO C for the format conversions that are
guaranteed to be available.
.TP
.B systime()
@@ -3053,7 +3120,7 @@ For full details, see \*(EP.
Specify the directory where
.I gawk
looks for the
-.B \&.mo
+.B \&.gmo
files, in case they
will not or cannot be placed in the ``standard'' locations
(e.g., during testing).
@@ -3097,7 +3164,7 @@ You must also supply a text domain. Use
.B TEXTDOMAIN
if you want to use the current domain.
.TP
-\fBdcngettext(\fIstring1 \fR, \fIstring2 \fR, \fInumber \fR[\fB, \fIdomain \fR[\fB, \fIcategory\fR]]\fB)\fR
+\fBdcngettext(\fIstring1\fB, \fIstring2\fB, \fInumber \fR[\fB, \fIdomain \fR[\fB, \fIcategory\fR]]\fB)\fR
Return the plural form used for
.I number
of the translation of
@@ -3207,7 +3274,8 @@ Calling an undefined function at run time is a fatal error.
The word
.B func
may be used in place of
-.BR function .
+.BR function ,
+although this is deprecated.
.SH DYNAMICALLY LOADING NEW FUNCTIONS
You can dynamically add new built-in functions to the running
.I gawk
@@ -3269,16 +3337,16 @@ action to assign a value to the
.B TEXTDOMAIN
variable to set the text domain to a name associated with your program:
.sp
-.RS
+.in +5m
.ft B
BEGIN { TEXTDOMAIN = "myprog" }
.ft R
-.RE
+.in -5m
.sp
This allows
.I gawk
to find the
-.B \&.mo
+.B \&.gmo
file associated with your program.
Without this step,
.I gawk
@@ -3301,12 +3369,12 @@ functions in your program, as appropriate.
Run
.B "gawk \-\^\-gen\-pot \-f myprog.awk > myprog.pot"
to generate a
-.B \&.po
+.B \&.pot
file for your program.
.TP
5.
Provide appropriate translations, and build and install the corresponding
-.B \&.mo
+.B \&.gmo
files.
.PP
The internationalization features are described in full detail in \*(EP.
@@ -3314,13 +3382,13 @@ The internationalization features are described in full detail in \*(EP.
A primary goal for
.I gawk
is compatibility with the \*(PX standard, as well as with the
-latest version of \*(UX
+latest version of Brian Kernighan's
.IR awk .
To this end,
.I gawk
incorporates the following user visible
features which are not described in the \*(AK book,
-but are part of the Bell Laboratories version of
+but are part of Brian Kernighan's version of
.IR awk ,
and are in the \*(PX standard.
.PP
@@ -3328,19 +3396,20 @@ The book indicates that command line variable assignment happens when
.I awk
would otherwise open the argument as a file, which is after the
.B BEGIN
-block is executed. However, in earlier implementations, when such an
+rule is executed. However, in earlier implementations, when such an
assignment appeared before any file names, the assignment would happen
.I before
the
.B BEGIN
-block was run. Applications came to depend on this \*(lqfeature.\*(rq
+rule was run. Applications came to depend on this \*(lqfeature.\*(rq
When
.I awk
was changed to match its documentation, the
.B \-v
option for assigning variables before program execution was added to
accommodate applications that depended upon the old behavior.
-(This feature was agreed upon by both the Bell Laboratories and the \*(GN developers.)
+(This feature was agreed upon by both the Bell Laboratories
+and the \*(GN developers.)
.PP
When processing arguments,
.I gawk
@@ -3378,7 +3447,7 @@ and fed back into the Bell Laboratories version); the
.B tolower()
and
.B toupper()
-built-in functions (from the Bell Laboratories version); and the \*(AN C conversion specifications in
+built-in functions (from the Bell Laboratories version); and the ISO C conversion specifications in
.B printf
(done first in the Bell Laboratories version).
.SH HISTORICAL FEATURES
@@ -3413,7 +3482,7 @@ issues a warning about its use if
is specified on the command line.
.SH GNU EXTENSIONS
.I Gawk
-has a number of extensions to \*(PX
+has a too-large number of extensions to \*(PX
.IR awk .
They are described in this section. All the extensions described here
can be disabled by
@@ -3441,12 +3510,19 @@ environment variable is not special.
.\" POSIX and language recognition issues
.TP
\(bu
-There is no facility for doing file inclusion
+There is no facility for doing file inclusion
.RI ( gawk 's
.B @include
mechanism).
.TP
\(bu
+There is no facility for dynamically adding new functions
+written in C
+.RI ( gawk 's
+.B @load
+mechanism).
+.TP
+\(bu
The
.B \ex
escape sequence.
@@ -3550,16 +3626,17 @@ and
The ability to pass an array to
.BR length() .
.\" New keywords or changes to keywords
-.TP
-\(bu
-The use of
-.BI delete " array"
-to delete the entire contents of an array.
-.TP
-\(bu
-The use of
-.B "nextfile"
-to abandon processing of the current input file.
+.\" (As of 2012, these are in POSIX)
+.\" .TP
+.\" \(bu
+.\" The use of
+.\" .BI delete " array"
+.\" to delete the entire contents of an array.
+.\" .TP
+.\" \(bu
+.\" The use of
+.\" .B "nextfile"
+.\" to abandon processing of the current input file.
.\" New functions
.TP
\(bu
@@ -3587,12 +3664,6 @@ functions.
.TP
\(bu
Localizable strings.
-.\" Extending gawk
-.TP
-\(bu
-Adding new built-in functions dynamically with the
-.B extension()
-function.
.PP
The \*(AK book does not define the return value of the
.B close()
@@ -3661,15 +3732,15 @@ The
environment variable can be used to provide a list of directories that
.I gawk
searches when looking for files named via the
-.B \-f
-,
-.B \-\^\-file
-,
+.BR \-f ,
+.BR \-\^\-file ,
.B \-i
and
.B \-\^\-include
options. If the initial search fails, the path is searched again after
-appending ".awk" to the filename.
+appending
+.B \&.awk
+to the filename.
.PP
The
.B AWKLIBPATH
@@ -3687,10 +3758,11 @@ environment variable can be used to specify a timeout
in milliseconds for reading input from a terminal, pipe
or two-way communication including sockets.
.PP
-For socket communication, two special environment variables can be used to control the number of retries
-.RB ( GAWK_SOCK_RETRIES ),
-and the interval between retries
-.RB ( GAWK_MSEC_SLEEP ).
+For connection to a remote host via socket,
+.B GAWK_SOCK_RETRIES
+controls the number of retries, and
+.B GAWK_MSEC_SLEEP
+controls the interval between retries.
The interval is in milliseconds. On systems that do not support
.IR usleep (3),
the value is rounded up to an integral number of seconds.
@@ -3759,22 +3831,12 @@ compatible with the new version of \*(UX
.IR awk .
Arnold Robbins is the current maintainer.
.PP
-The initial DOS port was done by Conrad Kwok and Scott Garfinkle.
-Scott Deifik maintains the port to MS-DOS using DJGPP.
-Eli Zaretskii maintains the port to MS-Windows using MinGW.
-Pat Rankin did the
-port to VMS, and Michal Jaegermann did the port to the Atari ST.
-The port to OS/2 was done by Kai Uwe Rommel, with contributions and
-help from Darrel Hankerson.
-Andreas Buening now maintains the OS/2 port.
-The late Fred Fish supplied support for the Amiga,
-and Martin Brown provided the BeOS port.
-Stephen Davies provided the original Tandem port, and
-Matthew Woehlke provided changes for Tandem's POSIX-compliant systems.
-Dave Pitts provided the port to z/OS.
+See \*(EP for a full list of the contributors to
+.I gawk
+and its documentation.
.PP
See the
-.I README
+.B README
file in the
.I gawk
distribution for up-to-date information about maintainers
@@ -3892,13 +3954,13 @@ Run an external command for particular lines of data:
.ft R
.fi
.SH ACKNOWLEDGEMENTS
-Brian Kernighan of Bell Laboratories
+Brian Kernighan
provided valuable assistance during testing and debugging.
We thank him.
.SH COPYING PERMISSIONS
Copyright \(co 1989, 1991, 1992, 1993, 1994, 1995, 1996,
1997, 1998, 1999, 2001, 2002, 2003, 2004, 2005, 2007, 2009,
-2010, 2011, 2012
+2010, 2011, 2012, 2013
Free Software Foundation, Inc.
.PP
Permission is granted to make and distribute verbatim copies of