Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- doc/gawk.texi 340
1 files changed, 167 insertions, 173 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index f7b04d7d..fd411745 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -708,7 +708,7 @@ particular records in a file and perform operations upon them.
record.
* Nextfile Statement:: Stop processing the current file.
* Exit Statement:: Stop execution of @command{awk}.
-* Built-in Variables:: Summarizes the built-in variables.
+* Built-in Variables:: Summarizes the predefined variables.
* User-modified:: Built-in variables that you change to
control @command{awk}.
* Auto-set:: Built-in variables where @command{awk}
@@ -906,7 +906,6 @@ particular records in a file and perform operations upon them.
* Extension API Description:: A full description of the API.
* Extension API Functions Introduction:: Introduction to the API functions.
* General Data Types:: The data types.
-* Requesting Values:: How to get a value.
* Memory Allocation Functions:: Functions for allocating memory.
* Constructor Functions:: Functions for creating values.
* Registration Functions:: Functions to register things with
@@ -919,6 +918,7 @@ particular records in a file and perform operations upon them.
* Two-way processors:: Registering a two-way processor.
* Printing Messages:: Functions for printing messages.
* Updating @code{ERRNO}:: Functions for updating @code{ERRNO}.
+* Requesting Values:: How to get a value.
* Accessing Parameters:: Functions for accessing parameters.
* Symbol Table Access:: Functions for accessing global
variables.
@@ -957,9 +957,9 @@ particular records in a file and perform operations upon them.
processor.
* Extension Sample Read write array:: Serializing an array to a file.
* Extension Sample Readfile:: Reading an entire file into a string.
-* Extension Sample API Tests:: Tests for the API.
* Extension Sample Time:: An interface to @code{gettimeofday()}
and @code{sleep()}.
+* Extension Sample API Tests:: Tests for the API.
* gawkextlib:: The @code{gawkextlib} project.
* Extension summary:: Extension summary.
* Extension Exercises:: Exercises.
@@ -1570,7 +1570,7 @@ for getting most things done in a program.
@ref{Patterns and Actions},
describes how to write patterns for matching records, actions for
-doing something when a record is matched, and the built-in variables
+doing something when a record is matched, and the predefined variables
@command{awk} and @command{gawk} use.
@ref{Arrays},
@@ -3656,8 +3656,8 @@ The @option{-v} option can only set one variable, but it can be used
more than once, setting another variable each time, like this:
@samp{awk @w{-v foo=1} @w{-v bar=2} @dots{}}.
-@cindex built-in variables, @code{-v} option@comma{} setting with
-@cindex variables, built-in, @code{-v} option@comma{} setting with
+@cindex predefined variables, @code{-v} option@comma{} setting with
+@cindex variables, predefined, @code{-v} option@comma{} setting with
@quotation CAUTION
Using @option{-v} to set the values of the built-in
variables may lead to surprising results. @command{awk} will reset the
@@ -6142,7 +6142,7 @@ standard input (by default, this is the keyboard, but often it is a pipe from an
command) or from files whose names you specify on the @command{awk}
command line. If you specify input files, @command{awk} reads them
in order, processing all the data from one before going on to the next.
-The name of the current input file can be found in the built-in variable
+The name of the current input file can be found in the predefined variable
@code{FILENAME}
(@pxref{Built-in Variables}).
@@ -6190,9 +6190,9 @@ used with it do not have to be named on the @command{awk} command line
@cindex @code{FNR} variable
@command{awk} divides the input for your program into records and fields.
It keeps track of the number of records that have been read so far from
-the current input file. This value is stored in a built-in variable
+the current input file. This value is stored in a predefined variable
called @code{FNR} which is reset to zero every time a new file is started.
-Another built-in variable, @code{NR}, records the total number of input
+Another predefined variable, @code{NR}, records the total number of input
records read so far from all @value{DF}s. It starts at zero, but is
never automatically reset to zero.
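The difference between @code{FNR} and @code{NR} described in this hunk can be seen with a small run. The following is a minimal sketch, assuming a POSIX @command{awk} on the search path; the file names @file{one.txt} and @file{two.txt} are hypothetical and exist only for this illustration:

```shell
#!/bin/sh
# Create two small input files (hypothetical names) in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\ne\n' > "$tmpdir/two.txt"

# Print the per-file record number (FNR) next to the running total (NR).
counts=$(cd "$tmpdir" && awk '{ print FILENAME, FNR, NR }' one.txt two.txt)
printf '%s\n' "$counts"

# Clean up the temporary files.
rm -rf "$tmpdir"
```

On the first record of @file{two.txt}, @code{FNR} starts over at 1 while @code{NR} continues at 3.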
@@ -6210,7 +6210,7 @@ Records are separated by a character called the @dfn{record separator}.
By default, the record separator is the newline character.
This is why records are, by default, single lines.
A different character can be used for the record separator by
-assigning the character to the built-in variable @code{RS}.
+assigning the character to the predefined variable @code{RS}.
@cindex newlines, as record separators
@cindex @code{RS} variable
@@ -6596,7 +6596,7 @@ field.
@cindex @code{NF} variable
@cindex fields, number of
-@code{NF} is a built-in variable whose value is the number of fields
+@code{NF} is a predefined variable whose value is the number of fields
in the current record. @command{awk} automatically updates the value
of @code{NF} each time it reads a record. No matter how many fields
there are, the last field in a record can be represented by @code{$NF}.
@@ -6954,7 +6954,7 @@ is split into three fields: @samp{m}, @samp{@bullet{}g}, and
Note the leading spaces in the values of the second and third fields.
@cindex troubleshooting, @command{awk} uses @code{FS} not @code{IFS}
-The field separator is represented by the built-in variable @code{FS}.
+The field separator is represented by the predefined variable @code{FS}.
Shell programmers take note: @command{awk} does @emph{not} use the
name @code{IFS} that is used by the POSIX-compliant shells (such as
the Unix Bourne shell, @command{sh}, or Bash).
@@ -7199,7 +7199,7 @@ an uppercase @samp{F} instead of a lowercase @samp{f}. The latter
option (@option{-f}) specifies a file containing an @command{awk} program.
The value used for the argument to @option{-F} is processed in exactly the
-same way as assignments to the built-in variable @code{FS}.
+same way as assignments to the predefined variable @code{FS}.
Any special characters in the field separator must be escaped
appropriately. For example, to use a @samp{\} as the field separator
on the command line, you would have to type:
@@ -8185,7 +8185,7 @@ from the file
@var{file}, and put it in the variable @var{var}. As above, @var{file}
is a string-valued expression that specifies the file from which to read.
-In this version of @code{getline}, none of the built-in variables are
+In this version of @code{getline}, none of the predefined variables are
changed and the record is not split into fields. The only variable
changed is @var{var}.@footnote{This is not quite true. @code{RT} could
be changed if @code{RS} is a regular expression.}
@@ -8347,7 +8347,7 @@ BEGIN @{
@}
@end example
-In this version of @code{getline}, none of the built-in variables are
+In this version of @code{getline}, none of the predefined variables are
changed and the record is not split into fields. However, @code{RT} is set.
@ifinfo
@@ -8409,7 +8409,7 @@ When you use @samp{@var{command} |& getline @var{var}}, the output from
the coprocess @var{command} is sent through a two-way pipe to @code{getline}
and into the variable @var{var}.
-In this version of @code{getline}, none of the built-in variables are
+In this version of @code{getline}, none of the predefined variables are
changed and the record is not split into fields. The only variable
changed is @var{var}.
However, @code{RT} is set.
@@ -8512,9 +8512,9 @@ know that there is a string value to be assigned.
@ref{table-getline-variants}
summarizes the eight variants of @code{getline},
-listing which built-in variables are set by each one,
+listing which predefined variables are set by each one,
and whether the variant is standard or a @command{gawk} extension.
-Note: for each variant, @command{gawk} sets the @code{RT} built-in variable.
+Note: for each variant, @command{gawk} sets the @code{RT} predefined variable.
@float Table,table-getline-variants
@caption{@code{getline} Variants and What They Set}
@@ -8974,7 +8974,7 @@ of items separated by commas. In the output, the items are normally
separated by single spaces. However, this doesn't need to be the case;
a single space is simply the default. Any string of
characters may be used as the @dfn{output field separator} by setting the
-built-in variable @code{OFS}. The initial value of this variable
+predefined variable @code{OFS}. The initial value of this variable
is the string @w{@code{" "}}---that is, a single space.
The output from an entire @code{print} statement is called an
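As a short illustration of the @code{OFS} behavior this hunk describes, the sketch below prints the same three items with the default separator and with a custom one. It assumes a POSIX @command{awk} on the search path; the shell variable names are hypothetical:

```shell
#!/bin/sh
# With the default OFS (a single space), comma-separated items are
# joined by one space in the output.
default=$(awk 'BEGIN { print "a", "b", "c" }')

# Setting OFS before printing changes the string placed between items.
joined=$(awk 'BEGIN { OFS = "-"; print "a", "b", "c" }')

printf '%s\n%s\n' "$default" "$joined"
```

The first line printed is @samp{a b c} and the second is @samp{a-b-c}.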
@@ -9050,7 +9050,7 @@ more fully in
@cindexawkfunc{sprintf}
@cindex @code{OFMT} variable
@cindex output, format specifier@comma{} @code{OFMT}
-The built-in variable @code{OFMT} contains the format specification
+The predefined variable @code{OFMT} contains the format specification
that @code{print} uses with @code{sprintf()} when it wants to convert a
number to a string for printing.
The default value of @code{OFMT} is @code{"%.6g"}.
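The effect of @code{OFMT} on @code{print} can be sketched as follows. This assumes a POSIX @command{awk} on the search path; the value and the shell variable name are chosen only for illustration:

```shell
#!/bin/sh
# Print the same non-integer number twice: first with the default
# OFMT ("%.6g"), then after changing OFMT to two decimal places.
formatted=$(awk 'BEGIN {
    x = 3.14159265
    print x            # converted with the default "%.6g"
    OFMT = "%.2f"
    print x            # converted with the new format
}')
printf '%s\n' "$formatted"
```

The first value appears as @samp{3.14159} and the second as @samp{3.14}. Integral values are not affected by @code{OFMT}; they print as integers.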
@@ -10227,7 +10227,7 @@ retval = close(command) # syntax error in many Unix awks
The return value is @minus{}1 if the argument names something
that was never opened with a redirection, or if there is
a system problem closing the file or process.
-In these cases, @command{gawk} sets the built-in variable
+In these cases, @command{gawk} sets the predefined variable
@code{ERRNO} to a string describing the problem.
In @command{gawk},
@@ -10283,7 +10283,7 @@ retval = close(command) # syntax error in many Unix awks
The return value is @minus{}1 if the argument names something
that was never opened with a redirection, or if there is
a system problem closing the file or process.
-In these cases, @command{gawk} sets the built-in variable
+In these cases, @command{gawk} sets the predefined variable
@code{ERRNO} to a string describing the problem.
In @command{gawk},
@@ -10776,10 +10776,10 @@ array parameters. @xref{String Functions}.
@cindex variables, initializing
A few variables have special built-in meanings, such as @code{FS} (the
field separator), and @code{NF} (the number of fields in the current input
-record). @xref{Built-in Variables}, for a list of the built-in variables.
-These built-in variables can be used and assigned just like all other
+record). @xref{Built-in Variables}, for a list of the predefined variables.
+These predefined variables can be used and assigned just like all other
variables, but their values are also used or changed automatically by
-@command{awk}. All built-in variables' names are entirely uppercase.
+@command{awk}. All predefined variables' names are entirely uppercase.
Variables in @command{awk} can be assigned either numeric or string values.
The kind of value a variable holds can change over the life of a program.
@@ -10905,7 +10905,7 @@ Strings that can't be interpreted as valid numbers convert to zero.
@cindex @code{CONVFMT} variable
The exact manner in which numbers are converted into strings is controlled
-by the @command{awk} built-in variable @code{CONVFMT} (@pxref{Built-in Variables}).
+by the @command{awk} predefined variable @code{CONVFMT} (@pxref{Built-in Variables}).
Numbers are converted using the @code{sprintf()} function
with @code{CONVFMT} as the format
specifier
@@ -12936,7 +12936,7 @@ program, and occasionally the format for data read as input.
As you have already seen, each @command{awk} statement consists of
a pattern with an associated action. This @value{CHAPTER} describes how
you build patterns and actions, what kinds of things you can do within
-actions, and @command{awk}'s built-in variables.
+actions, and @command{awk}'s predefined variables.
The pattern-action rules and the statements available for use
within actions form the core of @command{awk} programming.
@@ -12951,7 +12951,7 @@ building something useful.
* Action Overview:: What goes into an action.
* Statements:: Describes the various control statements in
detail.
-* Built-in Variables:: Summarizes the built-in variables.
+* Built-in Variables:: Summarizes the predefined variables.
* Pattern Action Summary:: Patterns and Actions summary.
@end menu
@@ -14360,11 +14360,11 @@ results across different operating systems.
@c ENDOFRANGE accs
@node Built-in Variables
-@section Built-in Variables
+@section Predefined Variables
@c STARTOFRANGE bvar
-@cindex built-in variables
+@cindex predefined variables
@c STARTOFRANGE varb
-@cindex variables, built-in
+@cindex variables, predefined
Most @command{awk} variables are available to use for your own
purposes; they never change unless your program assigns values to
@@ -14375,8 +14375,8 @@ to tell @command{awk} how to do certain things. Others are set
automatically by @command{awk}, so that they carry information from the
internal workings of @command{awk} to your program.
-@cindex @command{gawk}, built-in variables and
-This @value{SECTION} documents all of @command{gawk}'s built-in variables,
+@cindex @command{gawk}, predefined variables and
+This @value{SECTION} documents all of @command{gawk}'s predefined variables,
most of which are also documented in the @value{CHAPTER}s describing
their areas of activity.
@@ -14391,7 +14391,7 @@ their areas of activity.
@node User-modified
@subsection Built-in Variables That Control @command{awk}
@c STARTOFRANGE bvaru
-@cindex built-in variables, user-modifiable
+@cindex predefined variables, user-modifiable
@c STARTOFRANGE nmbv
@cindex user-modifiable variables
@@ -14628,9 +14628,9 @@ The default value of @code{TEXTDOMAIN} is @code{"messages"}.
@subsection Built-in Variables That Convey Information
@c STARTOFRANGE bvconi
-@cindex built-in variables, conveying information
+@cindex predefined variables, conveying information
@c STARTOFRANGE vbconi
-@cindex variables, built-in, conveying information
+@cindex variables, predefined, conveying information
The following is an alphabetical list of variables that @command{awk}
sets automatically on certain occasions in order to provide
information to your program.
@@ -15305,7 +15305,7 @@ immediately. You may pass an optional numeric value to be used
as @command{awk}'s exit status.
@item
-Some built-in variables provide control over @command{awk}, mainly for I/O.
+Some predefined variables provide control over @command{awk}, mainly for I/O.
Other variables convey information from @command{awk} to your program.
@item
@@ -16099,7 +16099,7 @@ An important aspect to remember about arrays is that @emph{array subscripts
are always strings}. When a numeric value is used as a subscript,
it is converted to a string value before being used for subscripting
(@pxref{Conversion}).
-This means that the value of the built-in variable @code{CONVFMT} can
+This means that the value of the predefined variable @code{CONVFMT} can
affect how your program accesses elements of an array. For example:
@example
@@ -17283,8 +17283,8 @@ for @code{match()}, the order is the same as for the @samp{~} operator:
@cindex @code{RSTART} variable, @code{match()} function and
@cindex @code{RLENGTH} variable, @code{match()} function and
@cindex @code{match()} function, @code{RSTART}/@code{RLENGTH} variables
-The @code{match()} function sets the built-in variable @code{RSTART} to
-the index. It also sets the built-in variable @code{RLENGTH} to the
+The @code{match()} function sets the predefined variable @code{RSTART} to
+the index. It also sets the predefined variable @code{RLENGTH} to the
length in characters of the matched substring. If no match is found,
@code{RSTART} is set to zero, and @code{RLENGTH} to @minus{}1.
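The values that @code{match()} leaves in @code{RSTART} and @code{RLENGTH} can be checked with a minimal sketch like the one below. It assumes a POSIX @command{awk} on the search path; the test string and the shell variable names are hypothetical:

```shell
#!/bin/sh
# A successful match: "oo" starts at index 2 of "foobar" and is 2
# characters long.
found=$(awk 'BEGIN { match("foobar", /o+/); print RSTART, RLENGTH }')

# A failed match: RSTART becomes 0 and RLENGTH becomes -1.
missing=$(awk 'BEGIN { match("foobar", /z/); print RSTART, RLENGTH }')

printf '%s\n%s\n' "$found" "$missing"
```

The first line printed is @samp{2 2} and the second is @samp{0 -1}.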
@@ -19273,7 +19273,7 @@ the call.
A function cannot have two parameters with the same name, nor may it
have a parameter with the same name as the function itself.
In addition, according to the POSIX standard, function parameters
-cannot have the same name as one of the special built-in variables
+cannot have the same name as one of the special predefined variables
(@pxref{Built-in Variables}). Not all versions of @command{awk} enforce
this restriction.
@@ -20521,7 +20521,7 @@ example, @code{getopt()}'s @code{Opterr} and @code{Optind} variables
(@pxref{Getopt Function}).
The leading capital letter indicates that it is global, while the fact that
the variable name is not all capital letters indicates that the variable is
-not one of @command{awk}'s built-in variables, such as @code{FS}.
+not one of @command{awk}'s predefined variables, such as @code{FS}.
@cindex @option{--dump-variables} option, using for library functions
It is also important that @emph{all} variables in library
@@ -23435,7 +23435,7 @@ and the file transition library program
The program begins with a descriptive comment and then a @code{BEGIN} rule
that processes the command-line arguments with @code{getopt()}. The @option{-i}
(ignore case) option is particularly easy with @command{gawk}; we just use the
-@code{IGNORECASE} built-in variable
+@code{IGNORECASE} predefined variable
(@pxref{Built-in Variables}):
@cindex @code{egrep.awk} program
@@ -30287,7 +30287,7 @@ results. With the @option{-M} command-line option,
all floating-point arithmetic operators and numeric functions
can yield results to any desired precision level supported by MPFR.
-Two built-in variables, @code{PREC} and @code{ROUNDMODE},
+Two predefined variables, @code{PREC} and @code{ROUNDMODE},
provide control over the working precision and the rounding mode.
The precision and the rounding mode are set globally for every operation
to follow.
@@ -30563,7 +30563,7 @@ $ @kbd{gawk -f pi2.awk}
the precision or accuracy of individual numbers. Performing an arithmetic
operation or calling a built-in function rounds the result to the current
working precision. The default working precision is 53 bits, which you can
-modify using the built-in variable @code{PREC}. You can also set the
+modify using the predefined variable @code{PREC}. You can also set the
value to one of the predefined case-insensitive strings
shown in @ref{table-predefined-precision-strings},
to emulate an IEEE 754 binary format.
@@ -31264,13 +31264,13 @@ This (rather large) @value{SECTION} describes the API in detail.
@menu
* Extension API Functions Introduction:: Introduction to the API functions.
* General Data Types:: The data types.
-* Requesting Values:: How to get a value.
* Memory Allocation Functions:: Functions for allocating memory.
* Constructor Functions:: Functions for creating values.
* Registration Functions:: Functions to register things with
@command{gawk}.
* Printing Messages:: Functions for printing messages.
* Updating @code{ERRNO}:: Functions for updating @code{ERRNO}.
+* Requesting Values:: How to get a value.
* Accessing Parameters:: Functions for accessing parameters.
* Symbol Table Access:: Functions for accessing global
variables.
@@ -31289,6 +31289,9 @@ API function pointers are provided for the following kinds of operations:
@itemize @value{BULLET}
@item
+Allocating, reallocating, and releasing memory.
+
+@item
Registration functions. You may register:
@itemize @value{MINUS}
@item
@@ -31321,9 +31324,6 @@ Symbol table access: retrieving a global variable, creating one,
or changing one.
@item
-Allocating, reallocating, and releasing memory.
-
-@item
Creating and releasing cached values; this provides an
efficient way to use values for multiple variables and
can be a big performance win.
@@ -32534,7 +32534,7 @@ Return false if the value cannot be retrieved.
@item awk_bool_t sym_update_scalar(awk_scalar_t cookie, awk_value_t *value);
Update the value associated with a scalar cookie. Return false if
the new value is not of type @code{AWK_STRING} or @code{AWK_NUMBER}.
-Here too, the built-in variables may not be updated.
+Here too, the predefined variables may not be updated.
@end table
It is not obvious at first glance how to work with scalar cookies or
@@ -33399,7 +33399,7 @@ This variable is true if @command{gawk} was invoked with @option{--traditional}
@end table
The value of @code{do_lint} can change if @command{awk} code
-modifies the @code{LINT} built-in variable (@pxref{Built-in Variables}).
+modifies the @code{LINT} predefined variable (@pxref{Built-in Variables}).
The others should not change during execution.
@node Extension API Boilerplate
@@ -33974,7 +33974,16 @@ for success:
@}
@end example
-Finally, here is the @code{do_stat()} function. It starts with
+The third argument to @code{stat()} was not discussed previously. This
+argument is optional. If present, it causes @code{do_stat()} to use
+the @code{stat()} system call instead of the @code{lstat()} system
+call. This is done by using a function pointer: @code{statfunc}.
+@code{statfunc} is initialized to point to @code{lstat()} (instead
+of @code{stat()}) to get the file information, in case the file is a
+symbolic link. However, if there were three arguments, @code{statfunc}
+is set to point to @code{stat()}, instead.
+
+Here is the @code{do_stat()} function. It starts with
variable declarations and argument checking:
@ignore
@@ -34005,16 +34014,10 @@ do_stat(int nargs, awk_value_t *result)
@}
@end example
-The third argument to @code{stat()} was not discussed previously. This argument
-is optional. If present, it causes @code{stat()} to use the @code{stat()}
-system call instead of the @code{lstat()} system call.
-
Then comes the actual work. First, the function gets the arguments.
-Next, it gets the information for the file.
-The code use @code{lstat()} (instead of @code{stat()})
-to get the file information,
-in case the file is a symbolic link.
-If there's an error, it sets @code{ERRNO} and returns:
+Next, it gets the information for the file. If the called function
+(@code{lstat()} or @code{stat()}) returns an error, the code sets
+@code{ERRNO} and returns:
@example
/* file is first arg, array to hold results is second */
@@ -34043,7 +34046,7 @@ If there's an error, it sets @code{ERRNO} and returns:
@end example
The tedious work is done by @code{fill_stat_array()}, shown
-earlier. When done, return the result from @code{fill_stat_array()}:
+earlier. When done, the function returns the result from @code{fill_stat_array()}:
@example
ret = fill_stat_array(name, array, & sbuf);
@@ -34106,7 +34109,7 @@ of the @file{gawkapi.h} header file,
the following steps@footnote{In practice, you would probably want to
use the GNU Autotools---Automake, Autoconf, Libtool, and @command{gettext}---to
configure and build your libraries. Instructions for doing so are beyond
-the scope of this @value{DOCUMENT}. @xref{gawkextlib}, for WWW links to
+the scope of this @value{DOCUMENT}. @xref{gawkextlib}, for Internet links to
the tools.} create a GNU/Linux shared library:
@example
@@ -34134,14 +34137,14 @@ BEGIN @{
for (i in data)
printf "data[\"%s\"] = %s\n", i, data[i]
print "testff.awk modified:",
- strftime("%m %d %y %H:%M:%S", data["mtime"])
+ strftime("%m %d %Y %H:%M:%S", data["mtime"])
print "\nInfo for JUNK"
ret = stat("JUNK", data)
print "ret =", ret
for (i in data)
printf "data[\"%s\"] = %s\n", i, data[i]
- print "JUNK modified:", strftime("%m %d %y %H:%M:%S", data["mtime"])
+ print "JUNK modified:", strftime("%m %d %Y %H:%M:%S", data["mtime"])
@}
@end example
@@ -34155,25 +34158,26 @@ $ @kbd{AWKLIBPATH=$PWD gawk -f testff.awk}
@print{} Info for testff.awk
@print{} ret = 0
@print{} data["blksize"] = 4096
-@print{} data["mtime"] = 1350838628
+@print{} data["devbsize"] = 512
+@print{} data["mtime"] = 1412004710
@print{} data["mode"] = 33204
@print{} data["type"] = file
@print{} data["dev"] = 2053
@print{} data["gid"] = 1000
-@print{} data["ino"] = 1719496
-@print{} data["ctime"] = 1350838628
+@print{} data["ino"] = 10358899
+@print{} data["ctime"] = 1412004710
@print{} data["blocks"] = 8
@print{} data["nlink"] = 1
@print{} data["name"] = testff.awk
-@print{} data["atime"] = 1350838632
+@print{} data["atime"] = 1412004716
@print{} data["pmode"] = -rw-rw-r--
-@print{} data["size"] = 662
+@print{} data["size"] = 666
@print{} data["uid"] = 1000
-@print{} testff.awk modified: 10 21 12 18:57:08
-@print{}
+@print{} testff.awk modified: 09 29 2014 18:31:50
+@print{}
@print{} Info for JUNK
@print{} ret = -1
-@print{} JUNK modified: 01 01 70 02:00:00
+@print{} JUNK modified: 01 01 1970 02:00:00
@end example
@node Extension Samples
@@ -34198,9 +34202,9 @@ Others mainly provide example code that shows how to use the extension API.
* Extension Sample Rev2way:: Reversing data sample two-way processor.
* Extension Sample Read write array:: Serializing an array to a file.
* Extension Sample Readfile:: Reading an entire file into a string.
-* Extension Sample API Tests:: Tests for the API.
* Extension Sample Time:: An interface to @code{gettimeofday()}
and @code{sleep()}.
+* Extension Sample API Tests:: Tests for the API.
@end menu
@node Extension Sample File Functions
@@ -34210,7 +34214,7 @@ The @code{filefuncs} extension provides three different functions, as follows:
The usage is:
@table @asis
-@item @@load "filefuncs"
+@item @code{@@load "filefuncs"}
This is how you load the extension.
@cindex @code{chdir()} extension function
@@ -34273,7 +34277,7 @@ Not all systems support all file types. @tab All
@itemx @code{result = fts(pathlist, flags, filedata)}
Walk the file trees provided in @code{pathlist} and fill in the
@code{filedata} array as described below. @code{flags} is the bitwise
-OR of several predefined constant values, also described below.
+OR of several predefined values, also described below.
Return zero if there were no errors, otherwise return @minus{}1.
@end table
@@ -34318,10 +34322,10 @@ Immediately follow a symbolic link named in @code{pathlist},
whether or not @code{FTS_LOGICAL} is set.
@item FTS_SEEDOT
-By default, the @code{fts()} routines do not return entries for @file{.} (dot)
-and @file{..} (dot-dot). This option causes entries for dot-dot to also
-be included. (The extension always includes an entry for dot,
-see below.)
+By default, the C library @code{fts()} routines do not return entries for
+@file{.} (dot) and @file{..} (dot-dot). This option causes entries for
+dot-dot to also be included. (The extension always includes an entry
+for dot, see below.)
@item FTS_XDEV
During a traversal, do not cross onto a different mounted filesystem.
@@ -34375,8 +34379,8 @@ Otherwise it returns @minus{}1.
@quotation NOTE
The @code{fts()} extension does not exactly mimic the
interface of the C library @code{fts()} routines, choosing instead to
-provide an interface that is based on associative arrays, which should
-be more comfortable to use from an @command{awk} program. This includes the
+provide an interface that is based on associative arrays, which is
+more comfortable to use from an @command{awk} program. This includes the
lack of a comparison function, since @command{gawk} already provides
powerful array sorting facilities. While an @code{fts_read()}-like
interface could have been provided, this felt less natural than simply
@@ -34384,7 +34388,8 @@ creating a multidimensional array to represent the file hierarchy and
its information.
@end quotation
-See @file{test/fts.awk} in the @command{gawk} distribution for an example.
+See @file{test/fts.awk} in the @command{gawk} distribution for an example
+use of the @code{fts()} extension function.
@node Extension Sample Fnmatch
@subsection Interface To @code{fnmatch()}
@@ -34592,7 +34597,7 @@ indicating the type of the file. The letters are file types are shown
in @ref{table-readdir-file-types}.
@float Table,table-readdir-file-types
-@caption{File Types Returned By @code{readdir()}}
+@caption{File Types Returned By The @code{readdir} Extension}
@multitable @columnfractions .1 .9
@headitem Letter @tab File Type
@item @code{b} @tab Block device
@@ -34684,6 +34689,9 @@ The @code{rwarray} extension adds two functions,
named @code{writea()} and @code{reada()}, as follows:
@table @code
+@item @@load "rwarray"
+This is how you load the extension.
+
@cindex @code{writea()} extension function
@item ret = writea(file, array)
This function takes a string argument, which is the name of the file
@@ -34759,17 +34767,6 @@ if (contents == "" && ERRNO != "") @{
@}
@end example
-@node Extension Sample API Tests
-@subsection API Tests
-@cindex @code{testext} extension
-
-The @code{testext} extension exercises parts of the extension API that
-are not tested by the other samples. The @file{extension/testext.c}
-file contains both the C code for the extension and @command{awk}
-test code inside C comments that run the tests. The testing framework
-extracts the @command{awk} code and runs the tests. See the source file
-for more information.
-
@node Extension Sample Time
@subsection Extension Time Functions
@@ -34800,6 +34797,17 @@ Implementation details: depending on platform availability, this function
tries to use @code{nanosleep()} or @code{select()} to implement the delay.
@end table
+@node Extension Sample API Tests
+@subsection API Tests
+@cindex @code{testext} extension
+
+The @code{testext} extension exercises parts of the extension API that
+are not tested by the other samples. The @file{extension/testext.c}
+file contains both the C code for the extension and @command{awk}
+test code inside C comments that run the tests. The testing framework
+extracts the @command{awk} code and runs the tests. See the source file
+for more information.
+
@node gawkextlib
@section The @code{gawkextlib} Project
@cindex @code{gawkextlib}
@@ -34815,8 +34823,7 @@ As of this writing, there are five extensions:
@itemize @value{BULLET}
@item
-XML parser extension, using the @uref{http://expat.sourceforge.net, Expat}
-XML parsing library.
+GD graphics library extension.
@item
PDF extension.
@@ -34825,17 +34832,14 @@ PDF extension.
PostgreSQL extension.
@item
-GD graphics library extension.
-
-@item
MPFR library extension.
This provides access to a number of MPFR functions which @command{gawk}'s
native MPFR support does not.
-@end itemize
-The @code{time} extension described earlier (@pxref{Extension Sample
-Time}) was originally from this project but has been moved in to the
-main @command{gawk} distribution.
+@item
+XML parser extension, using the @uref{http://expat.sourceforge.net, Expat}
+XML parsing library.
+@end itemize
@cindex @command{git} utility
You can check out the code for the @code{gawkextlib} project
@@ -34926,6 +34930,9 @@ API function pointers are provided for the following kinds of operations:
@itemize @value{BULLET}
@item
+Allocating, reallocating, and releasing memory.
+
+@item
Registration functions. You may register
extension functions,
exit callbacks,
@@ -34949,9 +34956,6 @@ Symbol table access: retrieving a global variable, creating one,
or changing one.
@item
-Allocating, reallocating, and releasing memory.
-
-@item
Creating and releasing cached values; this provides an
efficient way to use values for multiple variables and
can be a big performance win.
@@ -34983,7 +34987,7 @@ treated as read-only by the extension.
@item
@emph{All} memory passed from an extension to @command{gawk} must come from
the API's memory allocation functions. @command{gawk} takes responsibility for
-the memory and will release it when appropriate.
+the memory and releases it when appropriate.
@item
The API provides information about the running version of @command{gawk} so
@@ -35000,7 +35004,7 @@ The @command{gawk} distribution includes a number of small but useful
sample extensions. The @code{gawkextlib} project includes several more,
larger, extensions. If you wish to write an extension and contribute it
to the community of @command{gawk} users, the @code{gawkextlib} project
-should be the place to do so.
+is the place to do so.
@end itemize
@@ -35082,7 +35086,7 @@ which follows the POSIX specification. Many long-time @command{awk}
users learned @command{awk} programming with the original @command{awk}
implementation in Version 7 Unix. (This implementation was the basis for
@command{awk} in Berkeley Unix, through 4.3-Reno. Subsequent versions
-of Berkeley Unix, and some systems derived from 4.4BSD-Lite, used various
+of Berkeley Unix, and, for a while, some systems derived from 4.4BSD-Lite, used various
versions of @command{gawk} for their @command{awk}.) This @value{CHAPTER}
briefly describes the evolution of the @command{awk} language, with
cross-references to other parts of the @value{DOCUMENT} where you can
@@ -35155,7 +35159,7 @@ The built-in functions @code{close()} and @code{system()}
@item
The @code{ARGC}, @code{ARGV}, @code{FNR}, @code{RLENGTH}, @code{RSTART},
-and @code{SUBSEP} built-in variables (@pxref{Built-in Variables}).
+and @code{SUBSEP} predefined variables (@pxref{Built-in Variables}).
@item
Assignable @code{$0} (@pxref{Changing Fields}).
@@ -35186,14 +35190,11 @@ of @code{FS}.
@item
Dynamic regexps as operands of the @samp{~} and @samp{!~} operators
-(@pxref{Regexp Usage}).
+(@pxref{Computed Regexps}).
@item
The escape sequences @samp{\b}, @samp{\f}, and @samp{\r}
(@pxref{Escape Sequences}).
-(Some vendors have updated their old versions of @command{awk} to
-recognize @samp{\b}, @samp{\f}, and @samp{\r}, but this is not
-something you can rely on.)
@item
Redirection of input for the @code{getline} function
@@ -35232,7 +35233,7 @@ The @option{-v} option for assigning variables before program execution begins
@c GNU, Bell Laboratories & MKS together
@item
-The @option{--} option for terminating command-line options.
+The @option{--} signal for terminating command-line options.
@item
The @samp{\a}, @samp{\v}, and @samp{\x} escape sequences
@@ -35255,7 +35256,7 @@ A cleaner specification for the @code{%c} format-control letter in the
@item
The ability to dynamically pass the field width and precision (@code{"%*.*d"})
-in the argument list of the @code{printf} function
+in the argument list of @code{printf} and @code{sprintf()}
(@pxref{Control Letters}).
@item
@@ -35290,8 +35291,8 @@ The concept of a numeric string and tighter comparison rules to go
with it (@pxref{Typing and Comparison}).
@item
-The use of built-in variables as function parameter names is forbidden
-(@pxref{Definition Syntax}.
+The use of predefined variables as function parameter names is forbidden
+(@pxref{Definition Syntax}).
@item
More complete documentation of many of the previously undocumented
@@ -35386,7 +35387,7 @@ in the current version of @command{gawk}.
@itemize @value{BULLET}
@item
-Additional built-in variables:
+Additional predefined variables:
@itemize @value{MINUS}
@item
@@ -35470,14 +35471,6 @@ The @code{BEGINFILE} and @code{ENDFILE} special patterns.
(@pxref{BEGINFILE/ENDFILE}).
@item
-The ability to delete all of an array at once with @samp{delete @var{array}}
-(@pxref{Delete}).
-
-@item
-The @code{nextfile} statement
-(@pxref{Nextfile Statement}).
-
-@item
The @code{switch} statement
(@pxref{Switch Statement}).
@end itemize
@@ -35492,7 +35485,7 @@ of a two-way pipe to a coprocess
(@pxref{Two-way I/O}).
@item
-POSIX compliance for @code{gsub()} and @code{sub()}.
+POSIX compliance for @code{gsub()} and @code{sub()} with @option{--posix}.
@item
The @code{length()} function accepts an array argument
@@ -35520,6 +35513,20 @@ Additional functions only in @command{gawk}:
@itemize @value{MINUS}
@item
+The @code{gensub()}, @code{patsplit()}, and @code{strtonum()} functions
+for more powerful text manipulation
+(@pxref{String Functions}).
+
+@item
+The @code{asort()} and @code{asorti()} functions for sorting arrays
+(@pxref{Array Sorting}).
+
+@item
+The @code{mktime()}, @code{systime()}, and @code{strftime()}
+functions for working with timestamps
+(@pxref{Time Functions}).
+
+@item
The
@code{and()},
@code{compl()},
@@ -35533,30 +35540,15 @@ functions for bit manipulation
@c In 4.1, and(), or() and xor() grew the ability to take > 2 arguments
@item
-The @code{asort()} and @code{asorti()} functions for sorting arrays
-(@pxref{Array Sorting}).
+The @code{isarray()} function to check if a variable is an array or not
+(@pxref{Type Functions}).
@item
The @code{bindtextdomain()}, @code{dcgettext()} and @code{dcngettext()}
functions for internationalization
(@pxref{Programmer i18n}).
-
-@item
-The @code{fflush()} function from BWK @command{awk}
-(@pxref{I/O Functions}).
-
-@item
-The @code{gensub()}, @code{patsplit()}, and @code{strtonum()} functions
-for more powerful text manipulation
-(@pxref{String Functions}).
-
-@item
-The @code{mktime()}, @code{systime()}, and @code{strftime()}
-functions for working with timestamps
-(@pxref{Time Functions}).
@end itemize
-
@item
Changes and/or additions in the command-line options:
@@ -35679,7 +35671,7 @@ GCC for VAX and Alpha has not been tested for a while.
@item
Support for the following obsolete systems was removed from the code
-and the documentation for @command{gawk} @value{PVERSION} 4.1:
+for @command{gawk} @value{PVERSION} 4.1:
@c nested table
@itemize @value{MINUS}
@@ -36316,33 +36308,29 @@ The dynamic extension interface was completely redone
@cindex extensions, Brian Kernighan's @command{awk}
@cindex extensions, @command{mawk}
-This @value{SECTION} summarizes the common extensions supported
+The following table summarizes the common extensions supported
by @command{gawk}, Brian Kernighan's @command{awk}, and @command{mawk},
the three most widely-used freely available versions of @command{awk}
(@pxref{Other Versions}).
-@multitable {@file{/dev/stderr} special file} {BWK Awk} {Mawk} {GNU Awk}
-@headitem Feature @tab BWK Awk @tab Mawk @tab GNU Awk
-@item @samp{\x} Escape sequence @tab X @tab X @tab X
-@item @code{FS} as null string @tab X @tab X @tab X
-@item @file{/dev/stdin} special file @tab X @tab X @tab X
-@item @file{/dev/stdout} special file @tab X @tab X @tab X
-@item @file{/dev/stderr} special file @tab X @tab X @tab X
-@item @code{delete} without subscript @tab X @tab X @tab X
-@item @code{fflush()} function @tab X @tab X @tab X
-@item @code{length()} of an array @tab X @tab X @tab X
-@item @code{nextfile} statement @tab X @tab X @tab X
-@item @code{**} and @code{**=} operators @tab X @tab @tab X
-@item @code{func} keyword @tab X @tab @tab X
-@item @code{BINMODE} variable @tab @tab X @tab X
-@item @code{RS} as regexp @tab @tab X @tab X
-@item Time related functions @tab @tab X @tab X
+@multitable {@file{/dev/stderr} special file} {BWK Awk} {Mawk} {GNU Awk} {Now standard}
+@headitem Feature @tab BWK Awk @tab Mawk @tab GNU Awk @tab Now standard
+@item @samp{\x} Escape sequence @tab X @tab X @tab X @tab
+@item @code{FS} as null string @tab X @tab X @tab X @tab
+@item @file{/dev/stdin} special file @tab X @tab X @tab X @tab
+@item @file{/dev/stdout} special file @tab X @tab X @tab X @tab
+@item @file{/dev/stderr} special file @tab X @tab X @tab X @tab
+@item @code{delete} without subscript @tab X @tab X @tab X @tab X
+@item @code{fflush()} function @tab X @tab X @tab X @tab X
+@item @code{length()} of an array @tab X @tab X @tab X @tab
+@item @code{nextfile} statement @tab X @tab X @tab X @tab X
+@item @code{**} and @code{**=} operators @tab X @tab @tab X @tab
+@item @code{func} keyword @tab X @tab @tab X @tab
+@item @code{BINMODE} variable @tab @tab X @tab X @tab
+@item @code{RS} as regexp @tab @tab X @tab X @tab
+@item Time related functions @tab @tab X @tab X @tab
@end multitable
-(Technically speaking, as of late 2012, @code{fflush()}, @samp{delete @var{array}},
-and @code{nextfile} are no longer extensions, since they have been added
-to POSIX.)
-
@node Ranges and Locales
@appendixsec Regexp Ranges and Locales: A Long Sad Story
@@ -36379,6 +36367,7 @@ In the @code{"C"} and @code{"POSIX"} locales, a range expression like
But outside those locales, the ordering was defined to be based on
@dfn{collation order}.
+What does that mean?
In many locales, @samp{A} and @samp{a} are both less than @samp{B}.
In other words, these locales sort characters in dictionary order,
and @samp{[a-dx-z]} is typically not equivalent to @samp{[abcdxyz]};
@@ -36386,7 +36375,7 @@ instead it might be equivalent to @samp{[ABCXYabcdxyz]}, for example.
This point needs to be emphasized: Much literature teaches that you should
use @samp{[a-z]} to match a lowercase character. But on systems with
-non-ASCII locales, this also matched all of the uppercase characters
+non-ASCII locales, this also matches all of the uppercase characters
except @samp{A} or @samp{Z}! This was a continuous cause of confusion, even well
into the twenty-first century.
@@ -36692,6 +36681,11 @@ The development of the extension API first released with
Arnold Robbins and Andrew Schorr, with notable contributions from
the rest of the development team.
+@item
+@cindex Malmberg, John E.
+John Malmberg contributed significant improvements to the
+OpenVMS port and the related documentation.
+
@item
@cindex Colombo, Antonio
Antonio Giovanni Colombo rewrote a number of examples in the early
@@ -39215,7 +39209,7 @@ Pat Rankin suggested the solution that was adopted.
@appendixsubsec Other Design Decisions
As an arbitrary design decision, extensions can read the values of
-built-in variables and arrays (such as @code{ARGV} and @code{FS}), but cannot
+predefined variables and arrays (such as @code{ARGV} and @code{FS}), but cannot
change them, with the exception of @code{PROCINFO}.
The reason for this is to prevent an extension function from affecting
@@ -39956,11 +39950,11 @@ See ``Free Documentation License.''
@item Field
When @command{awk} reads an input record, it splits the record into pieces
separated by whitespace (or by a separator regexp that you can
-change by setting the built-in variable @code{FS}). Such pieces are
+change by setting the predefined variable @code{FS}). Such pieces are
called fields. If the pieces are of fixed length, you can use the built-in
variable @code{FIELDWIDTHS} to describe their lengths.
If you wish to specify the contents of fields instead of the field
-separator, you can use the built-in variable @code{FPAT} to do so.
+separator, you can use the predefined variable @code{FPAT} to do so.
(@xref{Field Separators},
@ref{Constant Size},
and
@@ -39979,7 +39973,7 @@ See also ``Double Precision'' and ``Single Precision.''
Format strings control the appearance of output in the
@code{strftime()} and @code{sprintf()} functions, and in the
@code{printf} statement as well. Also, data conversions from numbers to strings
-are controlled by the format strings contained in the built-in variables
+are controlled by the format strings contained in the predefined variables
@code{CONVFMT} and @code{OFMT}. (@xref{Control Letters}.)
@item Free Documentation License