path: root/doc/gawktexi.in
Diffstat (limited to 'doc/gawktexi.in')
-rw-r--r-- doc/gawktexi.in 941
1 files changed, 537 insertions, 404 deletions
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index d0356991..aac8c2af 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -115,13 +115,6 @@
@end macro
@end ifnothtml
-@set FN file name
-@set FFN File Name
-@set DF data file
-@set DDF Data File
-@set PVERSION version
-@set CTL Ctrl
-
@ignore
Some comments on the layout for TeX.
1. Use at least texinfo.tex 2000-09-06.09
@@ -196,6 +189,7 @@ supports it in developing GNU and promoting software freedom.''
@c during editing and review.
@setchapternewpage odd
+@shorttitlepage GNU Awk
@titlepage
@title @value{TITLE}
@subtitle @value{SUBTITLE}
@@ -405,7 +399,7 @@ particular records in a file and perform operations upon them.
* Field Splitting Summary:: Some final points and a summary table.
* Constant Size:: Reading constant width data.
* Splitting By Content:: Defining Fields By Content
-* Multiple Line:: Reading multi-line records.
+* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program
control using the @code{getline}
function.
@@ -556,9 +550,9 @@ particular records in a file and perform operations upon them.
@command{awk}.
* Uninitialized Subscripts:: Using Uninitialized variables as
subscripts.
-* Multi-dimensional:: Emulating multidimensional arrays in
+* Multidimensional:: Emulating multidimensional arrays in
@command{awk}.
-* Multi-scanning:: Scanning multidimensional arrays.
+* Multiscanning:: Scanning multidimensional arrays.
* Arrays of Arrays:: True multidimensional arrays.
* Built-in:: Summarizes the built-in functions.
* Calling Built-in:: How to call built-in functions.
@@ -610,6 +604,8 @@ particular records in a file and perform operations upon them.
* Join Function:: A function to join an array into a
string.
* Getlocaltime Function:: A function to get formatted times.
+* Readfile Function:: A function to read an entire file at
+ once.
* Data File Management:: Functions for managing command-line
data files.
* Filetrans Function:: A function for handling data file
@@ -1155,17 +1151,17 @@ wrote the bulk of
@cite{TCP/IP Internetworking with @command{gawk}}
(a separate document, available as part of the @command{gawk} distribution).
His code finally became part of the main @command{gawk} distribution
-with @command{gawk} @value{PVERSION} 3.1.
+with @command{gawk} version 3.1.
John Haque rewrote the @command{gawk} internals, in the process providing
an @command{awk}-level debugger. This version became available as
-@command{gawk} @value{PVERSION} 4.0, in 2011.
+@command{gawk} version 4.0, in 2011.
@xref{Contributors},
for a complete list of those who made important contributions to @command{gawk}.
@node Names
-@section A Rose by Any Other Name
+@unnumberedsec A Rose by Any Other Name
@cindex @command{awk}, new vs.@: old
The @command{awk} language has evolved over the years. Full details are
@@ -1201,7 +1197,7 @@ we simply use the term @command{awk}. When referring to a feature that is
specific to the GNU implementation, we use the term @command{gawk}.
@node This Manual
-@section Using This Book
+@unnumberedsec Using This Book
@cindex @command{awk}, terms describing
The term @command{awk} refers to a particular program as well as to the language you
@@ -1374,7 +1370,7 @@ present the licenses that cover the @command{gawk} source code
and this @value{DOCUMENT}, respectively.
@node Conventions
-@section Typographical Conventions
+@unnumberedsec Typographical Conventions
@cindex Texinfo
This @value{DOCUMENT} is written in @uref{http://www.gnu.org/software/texinfo/, Texinfo},
@@ -1413,23 +1409,23 @@ emphasized @emph{like this}, and if a point needs to be made
strongly, it is done @strong{like this}. The first occurrence of
a new term is usually its @dfn{definition} and appears in the same
font as the previous occurrence of ``definition'' in this sentence.
-Finally, @value{FN}s are indicated like this: @file{/path/to/ourfile}.
+Finally, file names are indicated like this: @file{/path/to/ourfile}.
@end ifnotinfo
Characters that you type at the keyboard look @kbd{like this}. In particular,
there are special characters called ``control characters.'' These are
characters that you type by holding down both the @kbd{CONTROL} key and
-another key, at the same time. For example, a @kbd{@value{CTL}-d} is typed
+another key, at the same time. For example, a @kbd{Ctrl-d} is typed
by first pressing and holding the @kbd{CONTROL} key, next
pressing the @kbd{d} key and finally releasing both keys.
@c fakenode --- for prepinfo
-@subsubheading Dark Corners
+@unnumberedsubsec Dark Corners
@cindex Kernighan, Brian
@quotation
@i{Dark corners are basically fractal --- no matter how much
-you illuminate, there's always a smaller but darker one.}@*
-Brian Kernighan
+you illuminate, there's always a smaller but darker one.}
+@author Brian Kernighan
@end quotation
@cindex d.c., See dark corner
@@ -1564,7 +1560,7 @@ of @cite{GAWK: The GNU Awk User's Guide}.
Edition @value{EDITION} maintains the basic structure of Edition 1.0,
but with significant additional material, reflecting the host of new features
-in @command{gawk} @value{PVERSION} @value{VERSION}.
+in @command{gawk} version @value{VERSION}.
Of particular note is
@ref{Array Sorting},
@ref{Bitwise Functions},
@@ -2000,9 +1996,9 @@ awk '@var{program}'
@noindent
@command{awk} applies the @var{program} to the @dfn{standard input},
which usually means whatever you type on the terminal. This continues
-until you indicate end-of-file by typing @kbd{@value{CTL}-d}.
+until you indicate end-of-file by typing @kbd{Ctrl-d}.
(On other operating systems, the end-of-file character may be different.
-For example, on OS/2, it is @kbd{@value{CTL}-z}.)
+For example, on OS/2, it is @kbd{Ctrl-z}.)
@cindex files, input, See input files
@cindex input files, running @command{awk} without
@@ -2048,7 +2044,7 @@ $ @kbd{awk '@{ print @}'}
@print{} Four score and seven years ago, ...
@kbd{What, me worry?}
@print{} What, me worry?
-@kbd{@value{CTL}-d}
+@kbd{Ctrl-d}
@end example
@node Long
@@ -2069,7 +2065,7 @@ awk -f @var{source-file} @var{input-file1} @var{input-file2} @dots{}
@cindex command line, options
@cindex options, command-line
The @option{-f} instructs the @command{awk} utility to get the @command{awk} program
-from the file @var{source-file}. Any @value{FN} can be used for
+from the file @var{source-file}. Any file name can be used for
@var{source-file}. For example, you could put the program:
@example
@@ -2094,8 +2090,8 @@ awk "BEGIN @{ print \"Don't Panic!\" @}"
@noindent
This was explained earlier
(@pxref{Read Terminal}).
-Note that you don't usually need single quotes around the @value{FN} that you
-specify with @option{-f}, because most @value{FN}s don't contain any of the shell's
+Note that you don't usually need single quotes around the file name that you
+specify with @option{-f}, because most file names don't contain any of the shell's
special characters. Notice that in @file{advice}, the @command{awk}
program did not have single quotes around it. The quotes are only needed
for programs that are provided on the @command{awk} command line.
@@ -2105,7 +2101,7 @@ for programs that are provided on the @command{awk} command line.
@c STARTOFRANGE qs2x
@cindex @code{'} (single quote)
If you want to clearly identify your @command{awk} program files as such,
-you can add the extension @file{.awk} to the @value{FN}. This doesn't
+you can add the extension @file{.awk} to the file name. This doesn't
affect the execution of the @command{awk} program but it does make
``housekeeping'' easier.
@@ -2132,13 +2128,13 @@ BEGIN @{ print "Don't Panic!" @}
After making this file executable (with the @command{chmod} utility),
simply type @samp{advice}
at the shell and the system arranges to run @command{awk}@footnote{The
-line beginning with @samp{#!} lists the full @value{FN} of an interpreter
+line beginning with @samp{#!} lists the full file name of an interpreter
to run and an optional initial command-line argument to pass to that
interpreter. The operating system then runs the interpreter with the given
argument and the full argument list of the executed program. The first argument
-in the list is the full @value{FN} of the @command{awk} program.
+in the list is the full file name of the @command{awk} program.
The rest of the
-argument list contains either options to @command{awk}, or @value{DF}s,
+argument list contains either options to @command{awk}, or data files,
or both. Note that on many systems @command{awk} may be found in
@file{/usr/bin} instead of in @file{/bin}. Caveat Emptor.} as if you had
typed @samp{awk -f advice}:
@@ -2349,7 +2345,7 @@ awk -F"" '@var{program}' @var{files} # wrong!
@noindent
In the second case, @command{awk} will attempt to use the text of the program
-as the value of @code{FS}, and the first @value{FN} as the text of the program!
+as the value of @code{FS}, and the first file name as the text of the program!
This results in syntax errors at best, and confusing behavior at worst.
@end itemize
@@ -2464,19 +2460,19 @@ gawk "@{ print \"\042\" $0 \"\042\" @}" @var{file}
@node Sample Data Files
-@section @value{DDF}s for the Examples
+@section Data Files for the Examples
@c For gawk >= 4.0, update these data files. No-one has such slow modems!
@cindex input files, examples
@cindex @code{BBS-list} file
Many of the examples in this @value{DOCUMENT} take their input from two sample
-@value{DF}s. The first, @file{BBS-list}, represents a list of
+data files. The first, @file{BBS-list}, represents a list of
computer bulletin board systems together with information about those systems.
-The second @value{DF}, called @file{inventory-shipped}, contains
+The second data file, called @file{inventory-shipped}, contains
information about monthly shipments. In both files,
each line is considered to be one @dfn{record}.
-In the @value{DF} @file{BBS-list}, each record contains the name of a computer
+In the data file @file{BBS-list}, each record contains the name of a computer
bulletin board, its phone number, the board's baud rate(s), and a code for
the number of hours it is operational. An @samp{A} in the last column
means the board operates 24 hours a day. A @samp{B} in the last
@@ -2506,7 +2502,7 @@ sabafoo 555-2127 1200/300 C
@end example
@cindex @code{inventory-shipped} file
-The @value{DF} @file{inventory-shipped} represents
+The data file @file{inventory-shipped} represents
information about shipments during the year.
Each record contains the month, the number
of green crates shipped, the number of red boxes shipped, the number of
@@ -2550,8 +2546,8 @@ learn in this @value{DOCUMENT}.
@cindex Texinfo
If you are using the stand-alone version of Info,
see @ref{Extract Program},
-for an @command{awk} program that extracts these @value{DF}s from
-@file{gawk.texi}, the Texinfo source file for this Info file.
+for an @command{awk} program that extracts these data files from
+@file{gawk.texi}, the (generated) Texinfo source file for this Info file.
@end ifinfo
@node Very Simple
@@ -2613,9 +2609,9 @@ collection of useful, short programs to get you started. Some of these
programs contain constructs that haven't been covered yet. (The description
of the program will give you a good idea of what is going on, but please
read the rest of the @value{DOCUMENT} to become an @command{awk} expert!)
-Most of the examples use a @value{DF} named @file{data}. This is just a
+Most of the examples use a data file named @file{data}. This is just a
placeholder; if you use these programs yourself, substitute
-your own @value{FN}s for @file{data}.
+your own file names for @file{data}.
For future reference, note that there is often more than
one way to do things in @command{awk}. At some point, you may want
to look back at these examples and see if
@@ -2705,7 +2701,7 @@ awk 'END @{ print NR @}' data
@end example
@item
-Print the even-numbered lines in the @value{DF}:
+Print the even-numbered lines in the data file:
@example
awk 'NR % 2 == 0' data
@@ -2747,7 +2743,7 @@ This program prints every line that contains the string
@samp{12} @emph{or} the string @samp{21}. If a line contains both
strings, it is printed twice, once by each rule.
-This is what happens if we run this program on our two sample @value{DF}s,
+This is what happens if we run this program on our two sample data files,
@file{BBS-list} and @file{inventory-shipped}:
@example
@@ -2813,7 +2809,7 @@ the file. The fourth field identifies the group of the file.
The fifth field contains the size of the file in bytes. The
sixth, seventh, and eighth fields contain the month, day, and time,
respectively, that the file was last modified. Finally, the ninth field
-contains the @value{FN}.@footnote{The @samp{LC_ALL=C} is
+contains the file name.@footnote{The @samp{LC_ALL=C} is
needed to produce this traditional-style output from @command{ls}.}
@c @cindex automatic initialization
@@ -3222,8 +3218,8 @@ conventions.
@cindex @code{-} (hyphen), filenames beginning with
@cindex hyphen (@code{-}), filenames beginning with
-This is useful if you have @value{FN}s that start with @samp{-},
-or in shell scripts, if you have @value{FN}s that will be specified
+This is useful if you have file names that start with @samp{-},
+or in shell scripts, if you have file names that will be specified
by the user that could start with @samp{-}.
It is also useful for passing options on to the @command{awk}
program; see @ref{Getopt Function}.
@@ -3441,7 +3437,7 @@ when parsing numeric input data (@pxref{Locales}).
Enable pretty-printing of @command{awk} programs.
By default, output program is created in a file named @file{awkprof.out}.
The optional @var{file} argument allows you to specify a different
-@value{FN} for the output.
+file name for the output.
No space is allowed between the @option{-o} and @var{file}, if
@var{file} is supplied.
@@ -3462,7 +3458,7 @@ Enable profiling of @command{awk} programs
(@pxref{Profiling}).
By default, profiles are created in a file named @file{awkprof.out}.
The optional @var{file} argument allows you to specify a different
-@value{FN} for the profile file.
+file name for the profile file.
No space is allowed between the @option{-p} and @var{file}, if
@var{file} is supplied.
@@ -3590,7 +3586,7 @@ function names must be unique.)
With standard @command{awk}, library functions can still be used, even
if the program is entered at the terminal,
by specifying @samp{-f /dev/tty}. After typing your program,
-type @kbd{@value{CTL}-d} (the end-of-file character) to terminate it.
+type @kbd{Ctrl-d} (the end-of-file character) to terminate it.
(You may also use @samp{-f -} to read program source from the standard
input but then you will not be able to also use the standard input as a
source of data.)
@@ -3672,9 +3668,9 @@ sets the variable @code{ARGIND} to the index in @code{ARGV} of the
current element.
@cindex input files, variable assignments and
-The distinction between @value{FN} arguments and variable-assignment
+The distinction between file name arguments and variable-assignment
arguments is made when @command{awk} is about to open the next input file.
-At that point in execution, it checks the @value{FN} to see whether
+At that point in execution, it checks the file name to see whether
it is really a variable assignment; if so, @command{awk} sets the variable
instead of reading a file.
@@ -3691,7 +3687,7 @@ sequences (@pxref{Escape Sequences}).
@value{DARKCORNER}
In some earlier implementations of @command{awk}, when a variable assignment
-occurred before any @value{FN}s, the assignment would happen @emph{before}
+occurred before any file names, the assignment would happen @emph{before}
the @code{BEGIN} rule was executed. @command{awk}'s behavior was thus
inconsistent; some command-line assignments were available inside the
@code{BEGIN} rule, while others were not. Unfortunately,
@@ -3702,8 +3698,8 @@ upon the old behavior.
The variable assignment feature is most useful for assigning to variables
such as @code{RS}, @code{OFS}, and @code{ORS}, which control input and
-output formats before scanning the @value{DF}s. It is also useful for
-controlling state if multiple passes are needed over a @value{DF}. For
+output formats before scanning the data files. It is also useful for
+controlling state if multiple passes are needed over a data file. For
example:
@cindex files, multiple passes over
@@ -3739,13 +3735,13 @@ You may also use @code{"-"} to name standard input when reading
files with @code{getline} (@pxref{Getline/File}).
In addition, @command{gawk} allows you to specify the special
-@value{FN} @file{/dev/stdin}, both on the command line and
+file name @file{/dev/stdin}, both on the command line and
with @code{getline}.
Some other versions of @command{awk} also support this, but it
is not standard.
(Some operating systems provide a @file{/dev/stdin} file
in the file system, however, @command{gawk} always processes
-this @value{FN} itself.)
+this file name itself.)
@node Environment Variables
@section The Environment Variables @command{gawk} Uses
@@ -3775,7 +3771,7 @@ on the command-line with the @option{-f} option.
In most @command{awk}
implementations, you must supply a precise path name for each program
file, unless the file is in the current directory.
-But in @command{gawk}, if the @value{FN} supplied to the @option{-f}
+But in @command{gawk}, if the file name supplied to the @option{-f}
or @option{-i} options
does not contain a @samp{/}, then @command{gawk} searches a list of
directories (called the @dfn{search path}), one by one, looking for a
@@ -3795,7 +3791,7 @@ though.}
The search path feature is particularly useful for building libraries
of useful @command{awk} functions. The library files can be placed in a
standard directory in the default path and then specified on
-the command line with a short @value{FN}. Otherwise, the full @value{FN}
+the command line with a short file name. Otherwise, the full file name
would have to be typed for each file.
By using the @option{-i} option, or the @option{--source} and @option{-f} options, your command-line
@@ -3889,10 +3885,6 @@ for use by the @command{gawk} developers for testing and tuning.
They are subject to change. The variables are:
@table @env
-@item AVG_CHAIN_MAX
-The average number of items @command{gawk} will maintain on a
-hash chain for managing arrays.
-
@item AWK_HASH
If this variable exists with a value of @samp{gst}, @command{gawk}
will switch to using the hash function from GNU Smalltalk for
@@ -3905,6 +3897,13 @@ files one line at a time, instead of reading in blocks. This exists
for debugging problems on filesystems on non-POSIX operating systems
where I/O is performed in records, not in blocks.
+@item GAWK_MSG_SRC
+If this variable exists, @command{gawk} includes the source file
+name and line number from which warning and/or fatal messages
+are generated. Its purpose is to help isolate the source of a
+message, since there can be multiple places which produce the
+same warning or error message.
+
@item GAWK_NO_DFA
If this variable exists, @command{gawk} does not use the DFA regexp matcher
for ``does it match'' kinds of tests. This can cause @command{gawk}
@@ -3917,6 +3916,14 @@ coordinate with each other.)
This specifies the amount by which @command{gawk} should grow its
internal evaluation stack, when needed.
+@item INT_CHAIN_MAX
+The average number of items @command{gawk} will maintain on a
+hash chain for managing arrays indexed by integers.
+
+@item STR_CHAIN_MAX
+The average number of items @command{gawk} will maintain on a
+hash chain for managing arrays indexed by strings.
+
@item TIDYMEM
If this variable exists, @command{gawk} uses the @code{mtrace()} library
calls from GNU LIBC to help track down possible memory leaks.
@@ -3995,7 +4002,7 @@ use @samp{@@include} followed by the name of the file to be included,
enclosed in double quotes.
@quotation NOTE
-Keep in mind that this is a language construct and the @value{FN} cannot
+Keep in mind that this is a language construct and the file name cannot
be a string variable, but rather just a literal string in double quotes.
@end quotation
@@ -4020,7 +4027,7 @@ $ @kbd{gawk -f test3}
@print{} This is file test3.
@end example
-The @value{FN} can, of course, be a pathname. For example:
+The file name can, of course, be a pathname. For example:
@example
@@include "../io_funcs"
@@ -4118,7 +4125,7 @@ they will @emph{not} be in the next release).
@cindex @code{PROCINFO} array
The process-related special files @file{/dev/pid}, @file{/dev/ppid},
@file{/dev/pgrpid}, and @file{/dev/user} were deprecated in @command{gawk}
-3.1, but still worked. As of @value{PVERSION} 4.0, they are no longer
+3.1, but still worked. As of version 4.0, they are no longer
interpreted specially by @command{gawk}. (Use @code{PROCINFO} instead;
see @ref{Auto-set}.)
@@ -4137,8 +4144,8 @@ in case some option becomes obsolete in a future version of @command{gawk}.
@cindex Jedi knights
@cindex Knights, jedi
@quotation
-@i{Use the Source, Luke!}@*
-Obi-Wan
+@i{Use the Source, Luke!}
+@author Obi-Wan
@end quotation
This @value{SECTION} intentionally left
@@ -4374,39 +4381,39 @@ A literal backslash, @samp{\}.
@cindex @code{\} (backslash), @code{\a} escape sequence
@cindex backslash (@code{\}), @code{\a} escape sequence
@item \a
-The ``alert'' character, @kbd{@value{CTL}-g}, ASCII code 7 (BEL).
+The ``alert'' character, @kbd{Ctrl-g}, ASCII code 7 (BEL).
(This usually makes some sort of audible noise.)
@cindex @code{\} (backslash), @code{\b} escape sequence
@cindex backslash (@code{\}), @code{\b} escape sequence
@item \b
-Backspace, @kbd{@value{CTL}-h}, ASCII code 8 (BS).
+Backspace, @kbd{Ctrl-h}, ASCII code 8 (BS).
@cindex @code{\} (backslash), @code{\f} escape sequence
@cindex backslash (@code{\}), @code{\f} escape sequence
@item \f
-Formfeed, @kbd{@value{CTL}-l}, ASCII code 12 (FF).
+Formfeed, @kbd{Ctrl-l}, ASCII code 12 (FF).
@cindex @code{\} (backslash), @code{\n} escape sequence
@cindex backslash (@code{\}), @code{\n} escape sequence
@item \n
-Newline, @kbd{@value{CTL}-j}, ASCII code 10 (LF).
+Newline, @kbd{Ctrl-j}, ASCII code 10 (LF).
@cindex @code{\} (backslash), @code{\r} escape sequence
@cindex backslash (@code{\}), @code{\r} escape sequence
@item \r
-Carriage return, @kbd{@value{CTL}-m}, ASCII code 13 (CR).
+Carriage return, @kbd{Ctrl-m}, ASCII code 13 (CR).
@cindex @code{\} (backslash), @code{\t} escape sequence
@cindex backslash (@code{\}), @code{\t} escape sequence
@item \t
-Horizontal TAB, @kbd{@value{CTL}-i}, ASCII code 9 (HT).
+Horizontal TAB, @kbd{Ctrl-i}, ASCII code 9 (HT).
@c @cindex @command{awk} language, V.4 version
@cindex @code{\} (backslash), @code{\v} escape sequence
@cindex backslash (@code{\}), @code{\v} escape sequence
@item \v
-Vertical tab, @kbd{@value{CTL}-k}, ASCII code 11 (VT).
+Vertical tab, @kbd{Ctrl-k}, ASCII code 11 (VT).
@cindex @code{\} (backslash), @code{\}@var{nnn} escape sequence
@cindex backslash (@code{\}), @code{\}@var{nnn} escape sequence
@@ -4738,7 +4745,7 @@ constants,
@command{gawk} did @emph{not} match interval expressions
in regexps.
-However, beginning with @value{PVERSION} 4.0,
+However, beginning with version 4.0,
@command{gawk} does match interval expressions by default.
This is because compatibility with POSIX has become more
important to most @command{gawk} users than compatibility with
@@ -5329,7 +5336,7 @@ But a newline in a regexp constant works with no problem:
$ @kbd{awk '$0 ~ /[ \t\n]/'}
@kbd{here is a sample line}
@print{} here is a sample line
-@kbd{@value{CTL}-d}
+@kbd{Ctrl-d}
@end example
@command{gawk} does not have this problem, and it isn't likely to
@@ -5379,7 +5386,7 @@ used with it do not have to be named on the @command{awk} command line
* Field Separators:: The field separator and how to change it.
* Constant Size:: Reading constant width data.
* Splitting By Content:: Defining Fields By Content
-* Multiple Line:: Reading multi-line records.
+* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program control
using the @code{getline} function.
* Read Timeout:: Reading input with a timeout.
@@ -5404,7 +5411,7 @@ so far
from the current input file. This value is stored in a
built-in variable called @code{FNR}. It is reset to zero when a new
file is started. Another built-in variable, @code{NR}, records the total
-number of input records read so far from all @value{DF}s. It starts at zero,
+number of input records read so far from all data files. It starts at zero,
but is never automatically reset to zero.
@cindex separators, for records
@@ -5478,7 +5485,7 @@ $ @kbd{awk 'BEGIN @{ RS = "/" @}}
@noindent
Note that the entry for the @samp{camelot} BBS is not split.
-In the original @value{DF}
+In the original data file
(@pxref{Sample Data Files}),
the line looks like this:
@@ -5491,7 +5498,7 @@ It has one baud rate only, so there are no slashes in the record,
unlike the others which have two or more baud rates.
In fact, this record is treated as part of the record
for the @samp{core} BBS; the newline separating them in the output
-is the original newline in the @value{DF}, not the one added by
+is the original newline in the data file, not the one added by
@command{awk} when it printed the record!
@cindex record separators, changing
@@ -5627,8 +5634,8 @@ In compatibility mode, only the first character of the value of
@code{RS} is used to determine the end of the record.
@sidebar @code{RS = "\0"} Is Not Portable
-@cindex portability, @value{DF}s as single record
-There are times when you might want to treat an entire @value{DF} as a
+@cindex portability, data files as single record
+There are times when you might want to treat an entire data file as a
single record. The only way to make this happen is to give @code{RS}
a value that you know doesn't occur in the input file. This is hard
to do in a general way, such that a program always works for arbitrary
@@ -6810,7 +6817,7 @@ appear in a row, they are considered one record separator.
@cindex dark corner, multiline records
There is an important difference between @samp{RS = ""} and
@samp{RS = "\n\n+"}. In the first case, leading newlines in the input
-@value{DF} are ignored, and if a file ends without extra blank lines
+data file are ignored, and if a file ends without extra blank lines
after the last record, the final newline is removed from the record.
In the second case, this special processing is not done.
@value{DARKCORNER}
@@ -6845,7 +6852,7 @@ Another way to separate fields is to
put each field on a separate line: to do this, just set the
variable @code{FS} to the string @code{"\n"}. (This single
character separator matches a single newline.)
-A practical example of a @value{DF} organized this way might be a mailing
+A practical example of a data file organized this way might be a mailing
list, where each entry is separated by blank lines. Consider a mailing
list in a file named @file{addresses}, which looks like this:
@@ -6910,7 +6917,7 @@ value of
@table @code
@item RS == "\n"
Records are separated by the newline character (@samp{\n}). In effect,
-every line in the @value{DF} is a separate record, including blank lines.
+every line in the data file is a separate record, including blank lines.
This is the default.
@item RS == @var{any single character}
@@ -7116,7 +7123,7 @@ the value of @code{NF} do not change.
@cindex operators, input/output
Use @samp{getline < @var{file}} to read the next record from @var{file}.
Here @var{file} is a string-valued expression that
-specifies the @value{FN}. @samp{< @var{file}} is called a @dfn{redirection}
+specifies the file name. @samp{< @var{file}} is called a @dfn{redirection}
because it directs input to come from a different place.
For example, the following
program reads its input record from the file @file{secondary.input} when it
@@ -7202,8 +7209,8 @@ that does handle nested @samp{@@include} statements.
@c From private email, dated October 2, 1988. Used by permission, March 2013.
@quotation
@i{Omniscience has much to recommend it.
-Failing that, attention to details would be useful.}@*
-Brian Kernighan
+Failing that, attention to details would be useful.}
+@author Brian Kernighan
@end quotation
@cindex @code{|} (vertical bar), @code{|} operator (I/O)
@@ -7423,10 +7430,10 @@ system permits.
@item
An interesting side effect occurs if you use @code{getline} without a
redirection inside a @code{BEGIN} rule. Because an unredirected @code{getline}
-reads from the command-line @value{DF}s, the first @code{getline} command
+reads from the command-line data files, the first @code{getline} command
causes @command{awk} to set the value of @code{FILENAME}. Normally,
@code{FILENAME} does not have a value inside @code{BEGIN} rules, because you
-have not yet started to process the command-line @value{DF}s.
+have not yet started to process the command-line data files.
@value{DARKCORNER}
(@xref{BEGIN/END},
also @pxref{Auto-set}.)
@@ -7648,7 +7655,7 @@ For printing with specifications, you need the @code{printf} statement
@cindex @code{printf} statement
Besides basic and formatted printing, this @value{CHAPTER}
also covers I/O redirections to files and pipes, introduces
-the special @value{FN}s that @command{gawk} processes internally,
+the special file names that @command{gawk} processes internally,
and discusses the @code{close()} built-in function.
@menu
@@ -8449,9 +8456,9 @@ but they work identically for @code{printf}:
@cindex operators, input/output
@item print @var{items} > @var{output-file}
This redirection prints the items into the output file named
-@var{output-file}. The @value{FN} @var{output-file} can be any
+@var{output-file}. The file name @var{output-file} can be any
expression. Its value is changed to a string and then used as a
-@value{FN} (@pxref{Expressions}).
+file name (@pxref{Expressions}).
When this type of redirection is used, the @var{output-file} is erased
before the first output is written to it. Subsequent writes to the same
@@ -8617,7 +8624,7 @@ open as many pipelines as the underlying operating system permits.
A particularly powerful way to use redirection is to build command lines
and pipe them into the shell, @command{sh}. For example, suppose you
-have a list of files brought over from a system where all the @value{FN}s
+have a list of files brought over from a system where all the file names
are stored in uppercase, and you wish to rename them to have names in
all lowercase. The following program is both simple and efficient:
@@ -8639,12 +8646,12 @@ It then sends the list to the shell for execution.
@c ENDOFRANGE reout
@node Special Files
-@section Special @value{FFN}s in @command{gawk}
+@section Special File Names in @command{gawk}
@c STARTOFRANGE gfn
-@cindex @command{gawk}, @value{FN}s in
+@cindex @command{gawk}, file names in
-@command{gawk} provides a number of special @value{FN}s that it interprets
-internally. These @value{FN}s provide access to standard file descriptors
+@command{gawk} provides a number of special file names that it interprets
+internally. These file names provide access to standard file descriptors
and TCP/IP networking.
@menu
@@ -8708,12 +8715,12 @@ that happens, writing to the screen is not correct. In fact, if
terminal at all.
Then opening @file{/dev/tty} fails.
-@command{gawk} provides special @value{FN}s for accessing the three standard
+@command{gawk} provides special file names for accessing the three standard
streams. @value{COMMONEXT}. It also provides syntax for accessing
-any other inherited open files. If the @value{FN} matches
+any other inherited open files. If the file name matches
one of these special names when @command{gawk} redirects input or output,
-then it directly uses the stream that the @value{FN} stands for.
-These special @value{FN}s work for all operating systems that @command{gawk}
+then it directly uses the stream that the file name stands for.
+These special file names work for all operating systems that @command{gawk}
has been ported to, not just those that are POSIX-compliant:
@cindex common extensions, @code{/dev/stdin} special file
@@ -8722,7 +8729,7 @@ has been ported to, not just those that are POSIX-compliant:
@cindex extensions, common@comma{} @code{/dev/stdin} special file
@cindex extensions, common@comma{} @code{/dev/stdout} special file
@cindex extensions, common@comma{} @code{/dev/stderr} special file
-@cindex @value{FN}s, standard streams in @command{gawk}
+@cindex file names, standard streams in @command{gawk}
@cindex @code{/dev/@dots{}} special files (@command{gawk})
@cindex files, @code{/dev/@dots{}} special files
@cindex @code{/dev/fd/@var{N}} special files
@@ -8743,7 +8750,7 @@ the shell). Unless special pains are taken in the shell from which
@command{gawk} is invoked, only descriptors 0, 1, and 2 are available.
@end table
-The @value{FN}s @file{/dev/stdin}, @file{/dev/stdout}, and @file{/dev/stderr}
+The file names @file{/dev/stdin}, @file{/dev/stdout}, and @file{/dev/stderr}
are aliases for @file{/dev/fd/0}, @file{/dev/fd/1}, and @file{/dev/fd/2},
respectively. However, they are more self-explanatory.
The proper way to write an error message in a @command{gawk} program
@@ -8753,14 +8760,14 @@ is to use @file{/dev/stderr}, like this:
print "Serious error detected!" > "/dev/stderr"
@end example
-@cindex troubleshooting, quotes with @value{FN}s
-Note the use of quotes around the @value{FN}.
+@cindex troubleshooting, quotes with file names
+Note the use of quotes around the file name.
Like any other redirection, the value must be a string.
It is a common error to omit the quotes, which leads
to confusing results.
@c Exercise: What does it do? :-)
-Finally, using the @code{close()} function on a @value{FN} of the
+Finally, using the @code{close()} function on a file name of the
form @code{"/dev/fd/@var{N}"}, for file descriptor numbers
above two, does actually close the given file descriptor.
@@ -8776,7 +8783,7 @@ versions of @command{awk}.
@command{gawk} programs
can open a two-way
TCP/IP connection, acting as either a client or a server.
-This is done using a special @value{FN} of the form:
+This is done using a special file name of the form:
@example
@file{/@var{net-type}/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}
@@ -8786,7 +8793,7 @@ The @var{net-type} is one of @samp{inet}, @samp{inet4} or @samp{inet6}.
The @var{protocol} is one of @samp{tcp} or @samp{udp},
and the other fields represent the other essential pieces of information
for making a networking connection.
-These @value{FN}s are used with the @samp{|&} operator for communicating
+These file names are used with the @samp{|&} operator for communicating
with a coprocess
(@pxref{Two-way I/O}).
This is an advanced feature, mentioned here only for completeness.
@@ -8794,21 +8801,21 @@ Full discussion is delayed until
@ref{TCP/IP Networking}.
@node Special Caveats
-@subsection Special @value{FFN} Caveats
+@subsection Special File Name Caveats
Here is a list of things to bear in mind when using the
-special @value{FN}s that @command{gawk} provides:
+special file names that @command{gawk} provides:
@itemize @bullet
-@cindex compatibility mode (@command{gawk}), @value{FN}s
-@cindex @value{FN}s, in compatibility mode
+@cindex compatibility mode (@command{gawk}), file names
+@cindex file names, in compatibility mode
@item
-Recognition of these special @value{FN}s is disabled if @command{gawk} is in
+Recognition of these special file names is disabled if @command{gawk} is in
compatibility mode (@pxref{Options}).
@item
@command{gawk} @emph{always}
-interprets these special @value{FN}s.
+interprets these special file names.
For example, using @samp{/dev/fd/4}
for output actually writes on file descriptor 4, and not on a new
file descriptor that is @code{dup()}'ed from file descriptor 4. Most of
@@ -8831,7 +8838,7 @@ Doing so results in unpredictable behavior.
@cindex coprocesses, closing
@cindex @code{getline} command, coprocesses@comma{} using from
-If the same @value{FN} or the same shell command is used with @code{getline}
+If the same file name or the same shell command is used with @code{getline}
more than once during the execution of an @command{awk} program
(@pxref{Getline}),
the file is opened (or the command is executed) the first time only.
@@ -8840,7 +8847,7 @@ The next time the same file or command is used with @code{getline},
another record is read from it, and so on.
Similarly, when a file or pipe is opened for output, @command{awk} remembers
-the @value{FN} or command associated with it, and subsequent
+the file name or command associated with it, and subsequent
writes to the same file or command are appended to the previous writes.
The file or pipe stays open until @command{awk} exits.
@@ -8882,7 +8889,7 @@ file or command, or the next @code{print} or @code{printf} to that
file or command, reopens the file or reruns the command.
Because the expression that you use to close a file or pipeline must
exactly match the expression used to open the file or run the command,
-it is good practice to use a variable to store the @value{FN} or command.
+it is good practice to use a variable to store the file name or command.
The previous example becomes the following:
@example
@@ -8931,7 +8938,7 @@ a separate message.
@cindex portability, @code{close()} function and
If you use more files than the system allows you to have open,
@command{gawk} attempts to multiplex the available open files among
-your @value{DF}s. @command{gawk}'s ability to do this depends upon the
+your data files. @command{gawk}'s ability to do this depends upon the
facilities of your operating system, so it may not always work. It is
therefore both good practice and good portability advice to always
use @code{close()} on your files when you are done with them.
@@ -9445,7 +9452,7 @@ as in the following:
@noindent
the variable is set at the very beginning, even before the
@code{BEGIN} rules execute. The @option{-v} option and its assignment
-must precede all the @value{FN} arguments, as well as the program text.
+must precede all the file name arguments, as well as the program text.
(@xref{Options}, for more information about
the @option{-v} option.)
Otherwise, the variable assignment is performed at a time determined by
@@ -9527,7 +9534,7 @@ with @code{CONVFMT} as the format
specifier
(@pxref{String Functions}).
-@code{CONVFMT}'s default value is @code{"%.6g"}, which prints a value with
+@code{CONVFMT}'s default value is @code{"%.6g"}, which creates a value with
at most six significant digits. For some applications, you might want to
change it to specify more precision.
On most modern machines,
@@ -9618,7 +9625,7 @@ point, so the default behavior was restored to use a period as the
decimal point character. You can use the @option{--use-lc-numeric}
option (@pxref{Options}) to force @command{gawk} to use the locale's
decimal point character. (@command{gawk} also uses the locale's decimal
-point character when in POSIX mode, either via @w{@option{--posix}}, or the
+point character when in POSIX mode, either via @option{--posix}, or the
@env{POSIXLY_CORRECT} environment variable, as shown previously.)
@ref{table-locale-affects} describes the cases in which the locale's decimal
@@ -9776,8 +9783,8 @@ For maximum portability, do not use the @samp{**} operator.
@subsection String Concatenation
@cindex Kernighan, Brian
@quotation
-@i{It seemed like a good idea at the time.}@*
-Brian Kernighan
+@i{It seemed like a good idea at the time.}
+@author Brian Kernighan
@end quotation
@cindex string operators
@@ -10248,8 +10255,8 @@ like @samp{@var{lvalue}++}, but instead of adding, it subtracts.)
@cindex Marx, Groucho
@quotation
@i{Doctor, doctor! It hurts when I do this!@*
-So don't do that!}@*
-Groucho Marx
+So don't do that!}
+@author Groucho Marx
@end quotation
@noindent
@@ -10346,8 +10353,8 @@ the string constant @code{"0"} is actually true, because it is non-null.
@node Typing and Comparison
@subsection Variable Typing and Comparison Expressions
@quotation
-@i{The Guide is definitive. Reality is frequently inaccurate.}@*
-The Hitchhiker's Guide to the Galaxy
+@i{The Guide is definitive. Reality is frequently inaccurate.}
+@author The Hitchhiker's Guide to the Galaxy
@end quotation
@c STARTOFRANGE comex
@@ -11005,7 +11012,7 @@ $ @kbd{awk '@{ print "The square root of", $1, "is", sqrt($1) @}'}
@print{} The square root of 3 is 1.73205
@kbd{5}
@print{} The square root of 5 is 2.23607
-@kbd{@value{CTL}-d}
+@kbd{Ctrl-d}
@end example
A function can also have side effects, such as assigning
@@ -12527,11 +12534,11 @@ The @code{nextfile} statement
is similar to the @code{next} statement.
However, instead of abandoning processing of the current record, the
@code{nextfile} statement instructs @command{awk} to stop processing the
-current @value{DF}.
+current data file.
Upon execution of the @code{nextfile} statement,
@code{FILENAME} is
-updated to the name of the next @value{DF} listed on the command line,
+updated to the name of the next data file listed on the command line,
@code{FNR} is reset to one,
and processing
starts over with the first rule in the program.
@@ -12540,10 +12547,10 @@ then the code in any @code{END} rules is executed. An exception to this is
when @code{nextfile} is invoked during execution of any statement in an
@code{END} rule; in this case, it causes the program to stop immediately. @xref{BEGIN/END}.
-The @code{nextfile} statement is useful when there are many @value{DF}s
+The @code{nextfile} statement is useful when there are many data files
to process but it isn't necessary to process every record in every file.
Without @code{nextfile},
-in order to move on to the next @value{DF}, a program
+in order to move on to the next data file, a program
would have to continue scanning the unwanted records. The @code{nextfile}
statement accomplishes this much more efficiently.
@@ -12781,7 +12788,7 @@ exclusively on the value of @code{FS}.
@item FS
This is the input field separator
(@pxref{Field Separators}).
-The value is a single-character string or a multi-character regular
+The value is a single-character string or a multicharacter regular
expression that matches the separations between fields in an input
record. If the value is the null string (@code{""}), then each
character in the record becomes a separate field.
@@ -12927,7 +12934,7 @@ This is the subscript separator. It has the default value of
@code{"\034"} and is used to separate the parts of the indices of a
multidimensional array. Thus, the expression @code{@w{foo["A", "B"]}}
really accesses @code{foo["A\034B"]}
-(@pxref{Multi-dimensional}).
+(@pxref{Multidimensional}).
@cindex @command{gawk}, @code{TEXTDOMAIN} variable in
@cindex @code{TEXTDOMAIN} variable
@@ -13010,17 +13017,17 @@ about how @command{awk} uses these variables.
@cindex differences in @command{awk} and @command{gawk}, @code{ARGIND} variable
@item ARGIND #
The index in @code{ARGV} of the current file being processed.
-Every time @command{gawk} opens a new @value{DF} for processing, it sets
-@code{ARGIND} to the index in @code{ARGV} of the @value{FN}.
+Every time @command{gawk} opens a new data file for processing, it sets
+@code{ARGIND} to the index in @code{ARGV} of the file name.
When @command{gawk} is processing the input files,
@samp{FILENAME == ARGV[ARGIND]} is always true.
@cindex files, processing@comma{} @code{ARGIND} variable and
This variable is useful in file processing; it allows you to tell how far
-along you are in the list of @value{DF}s as well as to distinguish between
-successive instances of the same @value{FN} on the command line.
+along you are in the list of data files as well as to distinguish between
+successive instances of the same file name on the command line.
-@cindex @value{FN}s, distinguishing
+@cindex file names, distinguishing
While you can change the value of @code{ARGIND} within your @command{awk}
program, @command{gawk} automatically sets it to a new value when the
next file is opened.
@@ -13037,10 +13044,18 @@ it is not special.
An associative array containing the values of the environment. The array
indices are the environment variable names; the elements are the values of
the particular environment variables. For example,
-@code{ENVIRON["HOME"]} might be @file{/home/arnold}. Changing this array
-does not affect the environment passed on to any programs that
-@command{awk} may spawn via redirection or the @code{system()} function.
-@c (In a future version of @command{gawk}, it may do so.)
+@code{ENVIRON["HOME"]} might be @file{/home/arnold}.
+
+For POSIX @command{awk}, changing this array does not affect the
+environment passed on to any programs that @command{awk} may spawn via
+redirection or the @code{system()} function.
+
+However, beginning with version 4.2, if not in POSIX
+compatibility mode, @command{gawk} does update its own environment when
+@code{ENVIRON} is changed, thus changing the environment seen by programs
+that it creates. You should therefore be especially careful if you
+modify @code{ENVIRON["PATH"]}, which is the search path for finding
+executable programs.
Some operating systems may not have environment variables.
On such systems, the @code{ENVIRON} array is empty (except for
@@ -13082,14 +13097,14 @@ it is not special.
@cindex dark corner, @code{FILENAME} variable
@item FILENAME
The name of the file that @command{awk} is currently reading.
-When no @value{DF}s are listed on the command line, @command{awk} reads
+When no data files are listed on the command line, @command{awk} reads
from the standard input and @code{FILENAME} is set to @code{"-"}.
@code{FILENAME} is changed each time a new file is read
(@pxref{Reading Files}).
Inside a @code{BEGIN} rule, the value of @code{FILENAME} is
@code{""}, since there are no input files being processed
yet.@footnote{Some early implementations of Unix @command{awk} initialized
-@code{FILENAME} to @code{"-"}, even if there were @value{DF}s to be
+@code{FILENAME} to @code{"-"}, even if there were data files to be
processed. This behavior was incorrect and should not be relied
upon in your programs.}
@value{DARKCORNER}
@@ -13129,8 +13144,12 @@ current record. @xref{Changing Fields}.
@item FUNCTAB #
An array whose indices and corresponding values are the names of all
the user-defined or extension functions in the program.
-@strong{NOTE}: You may not use the @code{delete} statement with the
-@code{FUNCTAB} array.
+
+@quotation NOTE
+Attempting to use the @code{delete} statement with the @code{FUNCTAB}
+array will cause a fatal error. Any attempt to assign to an element of
+the @code{FUNCTAB} array will also cause a fatal error.
+@end quotation
@cindex @code{NR} variable
@item NR
@@ -13462,11 +13481,11 @@ additional files to be read.
If the value of @code{ARGC} is decreased, that eliminates input files
from the end of the list. By recording the old value of @code{ARGC}
elsewhere, a program can treat the eliminated arguments as
-something other than @value{FN}s.
+something other than file names.
To eliminate a file from the middle of the list, store the null string
(@code{""}) into @code{ARGV} in place of the file's name. As a
-special feature, @command{awk} ignores @value{FN}s that have been
+special feature, @command{awk} ignores file names that have been
replaced with the null string.
Another option is to
use the @code{delete} statement to remove elements from
@@ -13561,7 +13580,7 @@ same @command{awk} program.
* Numeric Array Subscripts:: How to use numbers as subscripts in
@command{awk}.
* Uninitialized Subscripts:: Using Uninitialized variables as subscripts.
-* Multi-dimensional:: Emulating multidimensional arrays in
+* Multidimensional:: Emulating multidimensional arrays in
@command{awk}.
* Arrays of Arrays:: True multidimensional arrays.
@end menu
@@ -13591,8 +13610,8 @@ an array.
@cindex Wall, Larry
@quotation
@i{Doing linear scans over an associative array is like trying to club someone
-to death with a loaded Uzi.}@*
-Larry Wall
+to death with a loaded Uzi.}
+@author Larry Wall
@end quotation
The @command{awk} language provides one-dimensional arrays
@@ -14003,29 +14022,29 @@ Array elements are processed in arbitrary order, which is the default
@command{awk} behavior.
@item "@@ind_str_asc"
-Order by indices compared as strings; this is the most basic sort.
+Order by indices in ascending order compared as strings; this is the most basic sort.
(Internally, array indices are always strings, so with @samp{a[2*5] = 1}
the index is @code{"10"} rather than numeric 10.)
@item "@@ind_num_asc"
-Order by indices but force them to be treated as numbers in the process.
+Order by indices in ascending order but force them to be treated as numbers in the process.
Any index with a non-numeric value will end up positioned as if it were zero.
@item "@@val_type_asc"
-Order by element values rather than indices.
+Order by element values in ascending order (rather than by indices).
Ordering is by the type assigned to the element
(@pxref{Typing and Comparison}).
All numeric values come before all string values,
which in turn come before all subarrays.
(Subarrays have not been described yet;
-@pxref{Arrays of Arrays}).
+@pxref{Arrays of Arrays}.)
@item "@@val_str_asc"
-Order by element values rather than by indices. Scalar values are
+Order by element values in ascending order (rather than by indices). Scalar values are
compared as strings. Subarrays, if present, come out last.
@item "@@val_num_asc"
-Order by element values rather than by indices. Scalar values are
+Order by element values in ascending order (rather than by indices). Scalar values are
compared as numbers. Subarrays, if present, come out last.
When numeric values are equal, the string values are used to provide
an ordering: this guarantees consistent results across different
@@ -14038,13 +14057,14 @@ across different environments.} which @command{gawk} uses internally
to perform the sorting.
@item "@@ind_str_desc"
-Reverse order from the most basic sort.
+String indices ordered from high to low.
@item "@@ind_num_desc"
Numeric indices ordered from high to low.
@item "@@val_type_desc"
-Element values, based on type, in descending order.
+Element values, based on type, ordered from high to low.
+Subarrays, if present, come out first.
@item "@@val_str_desc"
Element values, treated as strings, ordered from high to low.
@@ -14354,11 +14374,11 @@ Even though it is somewhat unusual, the null string
if @option{--lint} is provided
on the command line (@pxref{Options}).
-@node Multi-dimensional
+@node Multidimensional
@section Multidimensional Arrays
@menu
-* Multi-scanning:: Scanning multidimensional arrays.
+* Multiscanning:: Scanning multidimensional arrays.
@end menu
@cindex subscripts in arrays, multidimensional
@@ -14456,7 +14476,7 @@ the program produces the following output:
3 2 1 6
@end example
-@node Multi-scanning
+@node Multiscanning
@subsection Scanning Multidimensional Arrays
There is no special @code{for} statement for scanning a
@@ -14901,15 +14921,16 @@ sequences of random numbers.
@node String Functions
@subsection String-Manipulation Functions
-The functions in this @value{SECTION} look at or change the text of one or more
-strings.
-@code{gawk} understands locales (@pxref{Locales}), and does all string processing in terms of
-@emph{characters}, not @emph{bytes}. This distinction is particularly important
-to understand for locales where one character
-may be represented by multiple bytes. Thus, for example, @code{length()}
-returns the number of characters in a string, and not the number of bytes
-used to represent those characters, Similarly, @code{index()} works with
-character indices, and not byte indices.
+The functions in this @value{SECTION} look at or change the text of one
+or more strings.
+
+@command{gawk} understands locales (@pxref{Locales}), and does all
+string processing in terms of @emph{characters}, not @emph{bytes}.
+This distinction is particularly important to understand for locales
+where one character may be represented by multiple bytes. Thus, for
+example, @code{length()} returns the number of characters in a string,
+and not the number of bytes used to represent those characters. Similarly,
+@code{index()} works with character indices, and not byte indices.
In the following list, optional parameters are enclosed in square brackets@w{ ([ ]).}
Several functions perform string substitution; the full discussion is
@@ -14926,30 +14947,32 @@ pound sign@w{ (@samp{#}):}
@table @code
@item asort(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) #
+@itemx asorti(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) #
+@cindex @code{asorti()} function (@command{gawk})
@cindex arrays, elements, retrieving number of
@cindex @code{asort()} function (@command{gawk})
@cindex @command{gawk}, @code{IGNORECASE} variable in
@cindex @code{IGNORECASE} variable
-Return the number of elements in the array @var{source}.
-@command{gawk} sorts the contents of @var{source}
-and replaces the indices
-of the sorted values of @var{source} with sequential
-integers starting with one. If the optional array @var{dest} is specified,
-then @var{source} is duplicated into @var{dest}. @var{dest} is then
-sorted, leaving the indices of @var{source} unchanged. The optional third
-argument @var{how} is a string which controls the rule for comparing values,
-and the sort direction. A single space is required between the
-comparison mode, @samp{string} or @samp{number}, and the direction specification,
-@samp{ascending} or @samp{descending}. You can omit direction and/or mode
-in which case it will default to @samp{ascending} and @samp{string}, respectively.
-An empty string "" is the same as the default @code{"ascending string"}
-for the value of @var{how}. If the @samp{source} array contains subarrays as values,
-they will come out last(first) in the @samp{dest} array for @samp{ascending}(@samp{descending})
-order specification. The value of @code{IGNORECASE} affects the sorting.
-The third argument can also be a user-defined function name in which case
-the value returned by the function is used to order the array elements
-before constructing the result array.
-@xref{Array Sorting Functions}, for more information.
+These two functions are similar in behavior, so they are described
+together.
+
+@quotation NOTE
+The following description ignores the third argument, @var{how}, since it
+requires understanding features that we have not discussed yet. Thus,
+the discussion here is a deliberate simplification. (We do provide all
+the details later on: see @ref{Array Sorting Functions}, for the full story.)
+@end quotation
+
+Both functions return the number of elements in the array @var{source}.
+For @code{asort()}, @command{gawk} sorts the values of @var{source}
+and replaces the indices of the sorted values of @var{source} with
+sequential integers starting with one. If the optional array @var{dest}
+is specified, then @var{source} is duplicated into @var{dest}. @var{dest}
+is then sorted, leaving the indices of @var{source} unchanged.
+
+When comparing strings, @code{IGNORECASE} affects the sorting. If the
+@var{source} array contains subarrays as values (@pxref{Arrays of
+Arrays}), they will come last, after all scalar values.
For example, if the contents of @code{a} are as follows:
@@ -14975,29 +14998,19 @@ a[2] = "de"
a[3] = "sac"
@end example
-In order to reverse the direction of the sorted results in the above example,
-@code{asort()} can be called with three arguments as follows:
+The @code{asorti()} function works similarly to @code{asort()}; however,
+the @emph{indices} are sorted, instead of the values. Thus, in the
+previous example, starting with the same initial set of indices and
+values in @code{a}, calling @samp{asorti(a)} would yield:
@example
-asort(a, a, "descending")
+a[1] = "first"
+a[2] = "last"
+a[3] = "middle"
@end example
-The @code{asort()} function is described in more detail in
-@ref{Array Sorting Functions}.
-@code{asort()} is a @command{gawk} extension; it is not available
-in compatibility mode (@pxref{Options}).
-
-@item asorti(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) #
-@cindex @code{asorti()} function (@command{gawk})
-Return the number of elements in the array @var{source}.
-It works similarly to @code{asort()}, however, the @emph{indices}
-are sorted, instead of the values. (Here too,
-@code{IGNORECASE} affects the sorting.)
-
-The @code{asorti()} function is described in more detail in
-@ref{Array Sorting Functions}.
-@code{asorti()} is a @command{gawk} extension; it is not available
-in compatibility mode (@pxref{Options}).
+@code{asort()} and @code{asorti()} are @command{gawk} extensions; they
+are not available in compatibility mode (@pxref{Options}).
@item gensub(@var{regexp}, @var{replacement}, @var{how} @r{[}, @var{target}@r{]}) #
@cindex @code{gensub()} function (@command{gawk})
@@ -15896,17 +15909,17 @@ _bigskip}
The only case where the difference is noticeable is the last one: @samp{\\\\}
is seen as @samp{\\} and produces @samp{\} instead of @samp{\\}.
-Starting with @value{PVERSION} 3.1.4, @command{gawk} followed the POSIX rules
+Starting with version 3.1.4, @command{gawk} followed the POSIX rules
when @option{--posix} is specified (@pxref{Options}). Otherwise,
it continued to follow the 1996 proposed rules, since
that had been its behavior for many years.
-When @value{PVERSION} 4.0.0 was released, the @command{gawk} maintainer
+When version 4.0.0 was released, the @command{gawk} maintainer
made the POSIX rules the default, breaking well over a decade's worth
of backwards compatibility.@footnote{This was rather naive of him, despite
there being a note in this section indicating that the next major version
would move to the POSIX rules.} Needless to say, this was a bad idea,
-and as of @value{PVERSION} 4.0.1, @command{gawk} resumed its historical
+and as of version 4.0.1, @command{gawk} resumed its historical
behavior, and only follows the POSIX rules when @option{--posix} is given.
The rules for @code{gensub()} are considerably simpler. At the runtime
@@ -16140,7 +16153,7 @@ $ @kbd{awk '@{ print $1 + $2 @}'}
@print{} 2
@kbd{2 3}
@print{} 5
-@kbd{@value{CTL}-d}
+@kbd{Ctrl-d}
@end example
@noindent
@@ -16151,13 +16164,13 @@ with this example:
$ @kbd{awk '@{ print $1 + $2 @}' | cat}
@kbd{1 1}
@kbd{2 3}
-@kbd{@value{CTL}-d}
+@kbd{Ctrl-d}
@print{} 2
@print{} 5
@end example
@noindent
-Here, no output is printed until after the @kbd{@value{CTL}-d} is typed, because
+Here, no output is printed until after the @kbd{Ctrl-d} is typed, because
it is all buffered and sent down the pipe to @command{cat} in one shot.
@end sidebar
@@ -16614,8 +16627,8 @@ gawk 'BEGIN @{
@c STARTOFRANGE opbit
@cindex operations, bitwise
@quotation
-@i{I can explain it for you, but I can't understand it for you.}@*
-Anonymous
+@i{I can explain it for you, but I can't understand it for you.}
+@author Anonymous
@end quotation
Many languages provide the ability to perform @dfn{bitwise} operations
@@ -16917,6 +16930,19 @@ that traverses every element of a true multidimensional array
Return a true value if @var{x} is an array. Otherwise return false.
@end table
+@code{isarray()} is meant for use in two circumstances. The first is when
+traversing a multidimensional array: you can test if an element is itself
+an array or not. The second is inside the body of a user-defined function
+(not discussed yet; @pxref{User-defined}), to test if a parameter is an
+array or not.
+
+Note, however, that using @code{isarray()} at the global level to test
+variables makes no sense. Since you are the one writing the program, you
+are supposed to know if your variables are arrays or not. And in fact,
+due to the way @command{gawk} works, if you pass the name of a variable
+that has not been previously used to @code{isarray()}, @command{gawk}
+will end up turning it into a scalar.
+
@node I18N Functions
@subsection String-Translation Functions
@cindex @command{gawk}, string-translation functions
@@ -18041,9 +18067,9 @@ it allows you to encapsulate algorithms and program tasks in a single
place. It simplifies programming, making program development more
manageable, and making programs more readable.
-In their seminal 1976 book, @cite{Software Tools}@footnote{Sadly, over 35
+In their seminal 1976 book, @cite{Software Tools},@footnote{Sadly, over 35
years later, many of the lessons taught by this book have yet to be
-learned by a vast number of practicing programmers.}, Brian Kernighan
+learned by a vast number of practicing programmers.} Brian Kernighan
and P.J.@: Plauger wrote:
@quotation
@@ -18242,6 +18268,7 @@ programming use.
vice versa.
* Join Function:: A function to join an array into a string.
* Getlocaltime Function:: A function to get formatted times.
+* Readfile Function:: A function to read an entire file at once.
@end menu
@node Strtonum Function
@@ -18457,7 +18484,7 @@ An @code{END} rule is automatically added
to the program calling @code{assert()}. Normally, if a program consists
of just a @code{BEGIN} rule, the input files and/or standard input are
not read. However, now that the program has an @code{END} rule, @command{awk}
-attempts to read the input @value{DF}s or standard input
+attempts to read the input data files or standard input
(@pxref{Using BEGIN/END}),
most likely causing the program to hang as it waits for input.
@@ -18866,17 +18893,92 @@ A more general design for the @code{getlocaltime()} function would have
allowed the user to supply an optional timestamp value to use instead
of the current time.
+@node Readfile Function
+@subsection Reading A Whole File At Once
+
+Often, it is convenient to have the entire contents of a file available
+in memory as a single string. A straightforward but naive way to
+do that might be as follows:
+
+@example
+function readfile(file, tmp, contents)
+@{
+    if ((getline tmp < file) < 0)
+        return
+
+    contents = tmp
+    while ((getline tmp < file) > 0)
+        contents = contents RT tmp
+
+    close(file)
+    return contents
+@}
+@end example
+
+This function reads from @code{file} one record at a time, building
+up the full contents of the file in the local variable @code{contents}.
+It works, but is not necessarily efficient.
+
+The following function, based on a suggestion by Denis Shirokov,
+reads the entire contents of the named file in one shot:
+
+@cindex @code{readfile()} user-defined function
+@example
+@c file eg/lib/readfile.awk
+# readfile.awk --- read an entire file at once
+@c endfile
+@ignore
+@c file eg/lib/readfile.awk
+#
+# Original idea by Denis Shirokov, cosmogen@@gmail.com, April 2013
+#
+@c endfile
+@end ignore
+@c file eg/lib/readfile.awk
+
+function readfile(file, tmp, save_rs)
+@{
+    save_rs = RS
+    RS = "^$"
+    getline tmp < file
+    close(file)
+    RS = save_rs
+
+    return tmp
+@}
+@c endfile
+@end example
+
+It works by setting @code{RS} to @samp{^$}, a regular expression that
+will never match if the file has contents. @command{gawk} reads data from
+the file into @code{tmp} attempting to match @code{RS}. The match fails
+after each read, but fails quickly, such that @command{gawk} fills
+@code{tmp} with the entire contents of the file.
+(@xref{Records}, for information on @code{RT} and @code{RS}.)
+
+In the case that @code{file} is empty, the return value is the null
+string. Thus, calling code may use something like:
+
+@example
+contents = readfile("/some/path")
+if (length(contents) == 0)
+ # file was empty @dots{}
+@end example
+
+This tests the result to see if it is empty or not. An equivalent
+test would be @samp{contents == ""}.
+
@node Data File Management
-@section @value{DDF} Management
+@section Data File Management
@c STARTOFRANGE dataf
@cindex files, managing
@c STARTOFRANGE libfdataf
-@cindex libraries of @command{awk} functions, managing, @value{DF}s
+@cindex libraries of @command{awk} functions, managing, data files
@c STARTOFRANGE flibdataf
-@cindex functions, library, managing @value{DF}s
+@cindex functions, library, managing data files
This @value{SECTION} presents functions that are useful for managing
-command-line @value{DF}s.
+command-line data files.
@menu
* Filetrans Function:: A function for handling data file transitions.
@@ -18887,16 +18989,16 @@ command-line @value{DF}s.
@end menu
@node Filetrans Function
-@subsection Noting @value{DDF} Boundaries
+@subsection Noting Data File Boundaries
-@cindex files, managing, @value{DF} boundaries
+@cindex files, managing, data file boundaries
@cindex files, initialization and cleanup
The @code{BEGIN} and @code{END} rules are each executed exactly once at
the beginning and end of your @command{awk} program, respectively
(@pxref{BEGIN/END}).
We (the @command{gawk} authors) once had a user who mistakenly thought that the
-@code{BEGIN} rule is executed at the beginning of each @value{DF} and the
-@code{END} rule is executed at the end of each @value{DF}.
+@code{BEGIN} rule is executed at the beginning of each data file and the
+@code{END} rule is executed at the end of each data file.
When informed
that this was not the case, the user requested that we add new special
@@ -18907,7 +19009,7 @@ Adding these special patterns to @command{gawk} wasn't necessary;
the job can be done cleanly in @command{awk} itself, as illustrated
by the following library program.
It arranges to call two user-supplied functions, @code{beginfile()} and
-@code{endfile()}, at the beginning and end of each @value{DF}.
+@code{endfile()}, at the beginning and end of each data file.
Besides solving the problem in only nine(!) lines of code, it does so
@emph{portably}; this works with any implementation of @command{awk}:
@@ -18938,17 +19040,17 @@ This file must be loaded before the user's ``main'' program, so that the
rule it supplies is executed first.
This rule relies on @command{awk}'s @code{FILENAME} variable that
-automatically changes for each new @value{DF}. The current @value{FN} is
+automatically changes for each new data file. The current file name is
saved in a private variable, @code{_oldfilename}. If @code{FILENAME} does
-not equal @code{_oldfilename}, then a new @value{DF} is being processed and
+not equal @code{_oldfilename}, then a new data file is being processed and
it is necessary to call @code{endfile()} for the old file. Because
@code{endfile()} should only be called if a file has been processed, the
program first checks to make sure that @code{_oldfilename} is not the null
-string. The program then assigns the current @value{FN} to
+string. The program then assigns the current file name to
@code{_oldfilename} and calls @code{beginfile()} for the file.
Because, like all @command{awk} variables, @code{_oldfilename} is
initialized to the null string, this rule executes correctly even for the
-first @value{DF}.
+first data file.
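
The logic just described can be sketched as follows. This is a reconstruction
from the prose, not a copy of the library file; the bodies of
@code{beginfile()} and @code{endfile()}, the file names, and the shell
variable @code{filetrans_demo} are hypothetical and exist only for
illustration.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n'    > "$tmpdir/two.txt"

filetrans_demo=$(cd "$tmpdir" && awk '
# Library rule, sketched from the description: detect a change of FILENAME.
FILENAME != _oldfilename {
    if (_oldfilename != "")      # skip endfile() before the first file
        endfile(_oldfilename)
    _oldfilename = FILENAME
    beginfile(FILENAME)
}
END { endfile(FILENAME) }        # final processing for the last file

# User-supplied hooks (hypothetical bodies).
function beginfile(name) { print "begin " name }
function endfile(name)   { print "end " name }
' one.txt two.txt)

printf '%s\n' "$filetrans_demo"
rm -rf "$tmpdir"
```

Each file produces one @samp{begin} line and one @samp{end} line, in
command-line order.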
The program also supplies an @code{END} rule to do the final processing for
the last file. Because this @code{END} rule comes before any @code{END} rules
@@ -18957,7 +19059,7 @@ again the value of multiple @code{BEGIN} and @code{END} rules should be clear.
@cindex @code{beginfile()} user-defined function
@cindex @code{endfile()} user-defined function
-If the same @value{DF} occurs twice in a row on the command line, then
+If the same data file occurs twice in a row on the command line, then
@code{endfile()} and @code{beginfile()} are not executed at the end of the
first pass and at the beginning of the second pass.
The following version solves the problem:
@@ -19072,12 +19174,12 @@ The @code{rewind()} function also relies on the @code{nextfile} keyword
(@pxref{Nextfile Statement}).
@node File Checking
-@subsection Checking for Readable @value{DDF}s
+@subsection Checking for Readable Data Files
-@cindex troubleshooting, readable @value{DF}s
-@cindex readable @value{DF}s@comma{} checking
+@cindex troubleshooting, readable data files
+@cindex readable data files@comma{} checking
@cindex files, skipping
-Normally, if you give @command{awk} a @value{DF} that isn't readable,
+Normally, if you give @command{awk} a data file that isn't readable,
it stops with a fatal error. There are times when you
might want to just ignore such files and keep going. You can
do this by prepending the following program to your @command{awk}
@@ -19126,15 +19228,15 @@ This is a by-product of @command{awk}'s implicit
read-a-record-and-match-against-the-rules loop: when @command{awk}
tries to read a record from an empty file, it immediately receives an
end of file indication, closes the file, and proceeds on to the next
-command-line @value{DF}, @emph{without} executing any user-level
+command-line data file, @emph{without} executing any user-level
@command{awk} program code.
Using @command{gawk}'s @code{ARGIND} variable
(@pxref{Built-in Variables}), it is possible to detect when an empty
-@value{DF} has been skipped. Similar to the library file presented
+data file has been skipped. Similar to the library file presented
in @ref{Filetrans Function}, the following library file calls a function named
@code{zerofile()} that the user must provide. The arguments passed are
-the @value{FN} and the position in @code{ARGV} where it was found:
+the file name and the position in @code{ARGV} where it was found:
@cindex @code{zerofile.awk} program
@example
@@ -19222,15 +19324,15 @@ END @{
@end ignore
@node Ignoring Assigns
-@subsection Treating Assignments as @value{FFN}s
+@subsection Treating Assignments as File Names
@cindex assignments as filenames
@cindex filenames, assignments as
Occasionally, you might not want @command{awk} to process command-line
variable assignments
(@pxref{Assignment Options}).
-In particular, if you have a @value{FN} that contain an @samp{=} character,
-@command{awk} treats the @value{FN} as an assignment, and does not process it.
+In particular, if you have a file name that contains an @samp{=} character,
+@command{awk} treats the file name as an assignment, and does not process it.
Some users have suggested an additional command-line option for @command{gawk}
to disable command-line assignments. However, some simple programming with
@@ -19274,7 +19376,7 @@ awk -v No_command_assign=1 -f noassign.awk -f yourprog.awk *
The function works by looping through the arguments.
It prepends @samp{./} to
any argument that matches the form
-of a variable assignment, turning that argument into a @value{FN}.
+of a variable assignment, turning that argument into a file name.
The use of @code{No_command_assign} allows you to disable command-line
assignments at invocation time, by giving the variable a true value.
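
A minimal sketch of that approach follows. The function name
@code{disable_assigns()} and the regexp used to decide what ``looks like an
assignment'' are assumptions made for illustration and may differ from the
library file; the shell variable @code{noassign_demo} is likewise
hypothetical.

```shell
noassign_demo=$(awk -v No_command_assign=1 '
# Sketch: prefix "./" to any argument shaped like var=value so that awk
# treats it as a file name. The regexp is an assumption.
function disable_assigns(argc, argv,    i)
{
    for (i = 1; i < argc; i++)
        if (argv[i] ~ /^[a-zA-Z_][a-zA-Z0-9_]*=/)
            argv[i] = "./" argv[i]
}
BEGIN {
    if (No_command_assign)
        disable_assigns(ARGC, ARGV)
    print ARGV[1]
    print ARGV[2]
}' 'name=value' plain.txt)

printf '%s\n' "$noassign_demo"
```

Only the first argument is rewritten; @file{plain.txt} contains no @samp{=}
and is left alone.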
@@ -19441,7 +19543,7 @@ The discussion that follows walks through the code a bit at a time:
# <c> a character representing the current option
# Private Data:
-# _opti -- index in multi-flag option, e.g., -abc
+# _opti -- index in multiflag option, e.g., -abc
@c endfile
@end example
@@ -19633,7 +19735,7 @@ After @code{getopt()} is through, it is the responsibility of the user level
code to
clear out all the elements of @code{ARGV} from 1 to @code{Optind},
so that @command{awk} does not try to process the command-line options
-as @value{FN}s.
+as file names.
@end quotation
Several of the sample programs presented in
@@ -20507,7 +20609,7 @@ awk -f @var{program} -- @var{options} @var{files}
@noindent
Here, @var{program} is the name of the @command{awk} program (such as
@file{cut.awk}), @var{options} are any command-line options for the
-program that start with a @samp{-}, and @var{files} are the actual @value{DF}s.
+program that start with a @samp{-}, and @var{files} are the actual data files.
If your system supports the @samp{#!} executable interpreter mechanism
(@pxref{Executable Scripts}),
@@ -20712,7 +20814,7 @@ spaces. Also remember that after @code{getopt()} is through
we have to
clear out all the elements of @code{ARGV} from 1 to @code{Optind},
so that @command{awk} does not try to process the command-line options
-as @value{FN}s.
+as file names.
After dealing with the command-line options, the program verifies that the
options make sense. Only one or the other of @option{-c} and @option{-f}
@@ -20908,8 +21010,8 @@ egrep @r{[} @var{options} @r{]} '@var{pattern}' @var{files} @dots{}
The @var{pattern} is a regular expression. In typical usage, the regular
expression is quoted to prevent the shell from expanding any of the
-special characters as @value{FN} wildcards. Normally, @command{egrep}
-prints the lines that matched. If multiple @value{FN}s are provided on
+special characters as file name wildcards. Normally, @command{egrep}
+prints the lines that matched. If multiple file names are provided on
the command line, each output line is preceded by the name of the file
and a colon.
@@ -21000,7 +21102,7 @@ pattern is supplied with @option{-e}, the first nonoption on the
command line is used. The @command{awk} command-line arguments up to @code{ARGV[Optind]}
are cleared, so that @command{awk} won't try to process them as files. If no
files are specified, the standard input is used, and if multiple files are
-specified, we make sure to note this so that the @value{FN}s can precede the
+specified, we make sure to note this so that the file names can precede the
matched lines in the output:
@example
@@ -21098,9 +21200,9 @@ A number of additional tests are made, but they are only done if we
are not counting lines. First, if the user only wants exit status
(@code{no_print} is true), then it is enough to know that @emph{one}
line in this file matched, and we can skip on to the next file with
-@code{nextfile}. Similarly, if we are only printing @value{FN}s, we can
-print the @value{FN}, and then skip to the next file with @code{nextfile}.
-Finally, each line is printed, with a leading @value{FN} and colon
+@code{nextfile}. Similarly, if we are only printing file names, we can
+print the file name, and then skip to the next file with @code{nextfile}.
+Finally, each line is printed, with a leading file name and colon
if necessary:
@cindex @code{!} (exclamation point), @code{!} operator
@@ -21348,7 +21450,7 @@ number of lines in each file, supply a number on the command line
preceded with a minus; e.g., @samp{-500} for files with 500 lines in them
instead of 1000. To change the name of the output files to something like
@file{myfileaa}, @file{myfileab}, and so on, supply an additional
-argument that specifies the @value{FN} prefix.
+argument that specifies the file name prefix.
Here is a version of @command{split} in @command{awk}. It uses the
@code{ord()} and @code{chr()} functions presented in
@@ -21358,8 +21460,8 @@ The program first sets its defaults, and then tests to make sure there are
not too many arguments. It then looks at each argument in turn. The
first argument could be a minus sign followed by a number. If it is, this happens
to look like a negative number, so it is made positive, and that is the
-count of lines. The data @value{FN} is skipped over and the final argument
-is used as the prefix for the output @value{FN}s:
+count of lines. The data file name is skipped over and the final argument
+is used as the prefix for the output file names:
@cindex @code{split.awk} program
@example
@@ -21408,7 +21510,7 @@ BEGIN @{
The next rule does most of the work. @code{tcount} (temporary count) tracks
how many lines have been printed to the output file so far. If it is greater
than @code{count}, it is time to close the current file and start a new one.
-@code{s1} and @code{s2} track the current suffixes for the @value{FN}. If
+@code{s1} and @code{s2} track the current suffixes for the file name. If
they are both @samp{z}, the file is just too big. Otherwise, @code{s1}
moves to the next letter in the alphabet and @code{s2} starts over again at
@samp{a}:
@@ -21496,13 +21598,13 @@ The @code{BEGIN} rule first makes a copy of all the command-line arguments
into an array named @code{copy}.
@code{ARGV[0]} is not copied, since it is not needed.
@code{tee} cannot use @code{ARGV} directly, since @command{awk} attempts to
-process each @value{FN} in @code{ARGV} as input data.
+process each file name in @code{ARGV} as input data.
@cindex flag variables
If the first argument is @option{-a}, then the flag variable
@code{append} is set to true, and both @code{ARGV[1]} and
@code{copy[1]} are deleted. If @code{ARGC} is less than two, then no
-@value{FN}s were supplied and @code{tee} prints a usage message and exits.
+file names were supplied and @code{tee} prints a usage message and exits.
Finally, @command{awk} is forced to read the standard input by setting
@code{ARGV[1]} to @code{"-"} and @code{ARGC} to two:
@@ -21964,7 +22066,7 @@ BEGIN @{
@end example
The @code{beginfile()} function is simple; it just resets the counts of lines,
-words, and characters to zero, and saves the current @value{FN} in
+words, and characters to zero, and saves the current file name in
@code{fname}:
@example
@@ -21986,7 +22088,7 @@ you will see that
@code{FNR} has already been reset by the time
@code{endfile()} is called.} It then prints out those numbers
for the file that was just read. It relies on @code{beginfile()} to reset the
-numbers for the following @value{DF}:
+numbers for the following data file:
@c FIXME: ONE DAY: make the above footnote an exercise,
@c instead of giving away the answer.
@@ -22154,8 +22256,8 @@ word, comparing it to the previous one:
@cindex insomnia, cure for
@cindex Robbins, Arnold
@quotation
-@i{Nothing cures insomnia like a ringing alarm clock.}@*
-Arnold Robbins
+@i{Nothing cures insomnia like a ringing alarm clock.}
+@author Arnold Robbins
@end quotation
@c STARTOFRANGE tialarm
@@ -22331,12 +22433,10 @@ often used to map uppercase letters into lowercase for further processing:
@command{tr} requires two lists of characters.@footnote{On some older
systems,
-@ifset ORA
including Solaris,
-@end ifset
@command{tr} may require that the lists be written as
range expressions enclosed in square brackets (@samp{[a-z]}) and quoted,
-to prevent the shell from attempting a @value{FN} expansion. This is
+to prevent the shell from attempting a file name expansion. This is
not a feature.} When processing the input, the first character in the
first list is replaced with the first character in the second list,
the second character in the first list is replaced with the second
@@ -22734,7 +22834,7 @@ The @command{uniq} program
(@pxref{Uniq Program}),
removes duplicate lines from @emph{sorted} data.
-Suppose, however, you need to remove duplicate lines from a @value{DF} but
+Suppose, however, you need to remove duplicate lines from a data file but
that you want to preserve the order the lines are in. A good example of
this might be a shell history file. The history file keeps a copy of all
the commands you have entered, and it is not unusual to repeat a command
@@ -22870,7 +22970,7 @@ Lines containing @samp{@@group} and @samp{@@end group} are simply removed.
(@pxref{Join Function}).
The example programs in the online Texinfo source for @cite{@value{TITLE}}
-(@file{gawk.texi}) have all been bracketed inside @samp{file} and
+(@file{gawktexi.in}) have all been bracketed inside @samp{file} and
@samp{endfile} lines. The @command{gawk} distribution uses a copy of
@file{extract.awk} to extract the sample programs and install many
of them in a standard directory where @command{gawk} can find them.
@@ -22953,7 +23053,7 @@ screen.
@end ifnottex
The second rule handles moving data into files. It verifies that a
-@value{FN} is given in the directive. If the file named is not the
+file name is given in the directive. If the file named is not the
current file, then the current file is closed. Keeping the current file
open until a new file is encountered allows the use of the @samp{>}
redirection for printing the contents, keeping open file management
@@ -23035,7 +23135,7 @@ subsequent output is appended to the file
(@pxref{Redirection}).
This makes it easy to mix program text and explanatory prose for the same
sample source file (as has been done here!) without any hassle. The file is
-only closed when a new data @value{FN} is encountered or at the end of the
+only closed when a new data file name is encountered or at the end of the
input file.
Finally, the function @code{@w{unexpected_eof()}} prints an appropriate
@@ -23087,7 +23187,7 @@ Here, @samp{s/old/new/g} tells @command{sed} to look for the regexp
The following program, @file{awksed.awk}, accepts at least two command-line
arguments: the pattern to look for and the text to replace it with. Any
-additional arguments are treated as data @value{FN}s to process. If none
+additional arguments are treated as data file names to process. If none
are provided, the standard input is used:
@cindex Brennan, Michael
@@ -23160,7 +23260,7 @@ The @code{BEGIN} rule handles the setup, checking for the right number
of arguments and calling @code{usage()} if there is a problem. Then it sets
@code{RS} and @code{ORS} from the command-line arguments and sets
@code{ARGV[1]} and @code{ARGV[2]} to the null string, so that they are
-not treated as @value{FN}s
+not treated as file names
(@pxref{ARGC and ARGV}).
The @code{usage()} function prints an error message and exits.
@@ -23258,7 +23358,7 @@ Literal text, provided with @option{--source} or @option{--source=}. This
text is just appended directly.
@item
-Source @value{FN}s, provided with @option{-f}. We use a neat trick and append
+Source file names, provided with @option{-f}. We use a neat trick and append
@samp{@@include @var{filename}} to the shell variable's contents. Since the file-inclusion
program works the way @command{gawk} does, this gets the text
of the file included into the program at the correct point.
@@ -23271,7 +23371,7 @@ shell variable.
@item
Run the expanded program with @command{gawk} and any other original command-line
-arguments that the user supplied (such as the data @value{FN}s).
+arguments that the user supplied (such as the data file names).
@end enumerate
This program uses shell variables extensively: for storing command-line arguments,
@@ -23302,7 +23402,7 @@ programming trick. Don't worry about it if you are not familiar with
These are saved and passed on to @command{gawk}.
@item -f@r{,} --file@r{,} --file=@r{,} -Wfile=
-The @value{FN} is appended to the shell variable @code{program} with an
+The file name is appended to the shell variable @code{program} with an
@samp{@@include} statement.
The @command{expr} utility is used to remove the leading option part of the
argument (e.g., @samp{--file=}).
@@ -23426,10 +23526,10 @@ is stored in the shell variable @code{expand_prog}. Doing this keeps
the shell script readable. The @command{awk} program
reads through the user's program, one line at a time, using @code{getline}
(@pxref{Getline}). The input
-@value{FN}s and @samp{@@include} statements are managed using a stack.
-As each @samp{@@include} is encountered, the current @value{FN} is
+file names and @samp{@@include} statements are managed using a stack.
+As each @samp{@@include} is encountered, the current file name is
``pushed'' onto the stack and the file named in the @samp{@@include}
-directive becomes the current @value{FN}. As each file is finished,
+directive becomes the current file name. As each file is finished,
the stack is ``popped,'' and the previous input file becomes the current
input file again. The process is started by making the original file
the first one on the stack.
@@ -23438,16 +23538,16 @@ The @code{pathto()} function does the work of finding the full path to
a file. It simulates @command{gawk}'s behavior when searching the
@env{AWKPATH} environment variable
(@pxref{AWKPATH Variable}).
-If a @value{FN} has a @samp{/} in it, no path search is done.
-Similarly, if the @value{FN} is @code{"-"}, then that string is
+If a file name has a @samp{/} in it, no path search is done.
+Similarly, if the file name is @code{"-"}, then that string is
used as-is. Otherwise,
-the @value{FN} is concatenated with the name of each directory in
-the path, and an attempt is made to open the generated @value{FN}.
+the file name is concatenated with the name of each directory in
+the path, and an attempt is made to open the generated file name.
The only way to test if a file can be read in @command{awk} is to go
ahead and try to read it with @code{getline}; this is what @code{pathto()}
does.@footnote{On some very old versions of @command{awk}, the test
@samp{getline junk < t} can loop forever if the file exists but is empty.
-Caveat emptor.} If the file can be read, it is closed and the @value{FN}
+Caveat emptor.} If the file can be read, it is closed and the file name
is returned:
@ignore
@@ -23505,14 +23605,14 @@ BEGIN @{
The stack is initialized with @code{ARGV[1]}, which will be @file{/dev/stdin}.
The main loop comes next. Input lines are read in succession. Lines that
do not start with @samp{@@include} are printed verbatim.
-If the line does start with @samp{@@include}, the @value{FN} is in @code{$2}.
+If the line does start with @samp{@@include}, the file name is in @code{$2}.
@code{pathto()} is called to generate the full path. If it cannot, then the program
prints an error message and continues.
The next thing to check is if the file is included already. The
-@code{processed} array is indexed by the full @value{FN} of each included
+@code{processed} array is indexed by the full file name of each included
file and it tracks this information for us. If the file is
-seen again, a warning message is printed. Otherwise, the new @value{FN} is
+seen again, a warning message is printed. Otherwise, the new file name is
pushed onto the stack and processing continues.
Finally, when @code{getline} encounters the end of the input file, the file
@@ -23590,10 +23690,10 @@ options and command-line arguments that the user supplied.
@c this causes more problems than it solves, so leave it out.
@ignore
-The special file @file{/dev/null} is passed as a @value{DF} to @command{gawk}
+The special file @file{/dev/null} is passed as a data file to @command{gawk}
to handle an interesting case. Suppose that the user's program only has
-a @code{BEGIN} rule and there are no @value{DF}s to read.
-The program should exit without reading any @value{DF}s.
+a @code{BEGIN} rule and there are no data files to read.
+The program should exit without reading any data files.
However, suppose that an included library file defines an @code{END}
rule of its own. In this case, @command{gawk} will hang, reading standard
input. In order to avoid this, @file{/dev/null} is explicitly added to the
@@ -23974,8 +24074,8 @@ who knows where you live."
@end ignore
@quotation
@i{Write documentation as if whoever reads it is
-a violent psychopath who knows where you live.}@*
-Steve English, as quoted by Peter Langston
+a violent psychopath who knows where you live.}
+@author Steve English, as quoted by Peter Langston
@end quotation
This @value{CHAPTER} discusses advanced features in @command{gawk}.
@@ -24294,7 +24394,7 @@ ordered data:
@example
function cmp_randomize(i1, v1, i2, v2)
@{
- # random order
+ # random order (caution: this may never terminate!)
return (2 - 4 * rand())
@}
@end example
@@ -24309,7 +24409,7 @@ with otherwise equal values is to include the indices in the comparison
rules. Note that doing this may make the loop traversal less efficient,
so consider it only if necessary. The following comparison functions
force a deterministic order, and are based on the fact that the
-indices of two elements are never equal:
+(string) indices of two elements are never equal:
@example
function cmp_numeric(i1, v1, i2, v2)
@@ -24368,15 +24468,14 @@ sorted array traversal is not the default.
@cindex arrays, sorting
@cindex @code{asort()} function (@command{gawk})
@cindex @code{asort()} function (@command{gawk}), arrays@comma{} sorting
+@cindex @code{asorti()} function (@command{gawk})
+@cindex @code{asorti()} function (@command{gawk}), arrays@comma{} sorting
@cindex sort function, arrays, sorting
-In most @command{awk} implementations, sorting an array requires
-writing a @code{sort()} function.
-While this can be educational for exploring different sorting algorithms,
-usually that's not the point of the program.
-@command{gawk} provides the built-in @code{asort()}
-and @code{asorti()} functions
-(@pxref{String Functions})
-for sorting arrays. For example:
+In most @command{awk} implementations, sorting an array requires writing
+a @code{sort()} function. While this can be educational for exploring
+different sorting algorithms, usually that's not the point of the program.
+@command{gawk} provides the built-in @code{asort()} and @code{asorti()}
+functions (@pxref{String Functions}) for sorting arrays. For example:
@example
@var{populate the array} data
@@ -24389,7 +24488,7 @@ After the call to @code{asort()}, the array @code{data} is indexed from 1
to some number @var{n}, the total number of elements in @code{data}.
(This count is @code{asort()}'s return value.)
@code{data[1]} @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
-The comparison is based on the type of the elements
+The default comparison is based on the type of the elements
(@pxref{Typing and Comparison}).
All numeric values come before all string values,
which in turn come before all subarrays.
@@ -24411,24 +24510,11 @@ In this case, @command{gawk} copies the @code{source} array into the
@code{dest} array and then sorts @code{dest}, destroying its indices.
However, the @code{source} array is not affected.
-@code{asort()} accepts a third string argument to control comparison of
-array elements. As with @code{PROCINFO["sorted_in"]}, this argument
-may be one of the predefined names that @command{gawk} provides
-(@pxref{Controlling Scanning}), or the name of a user-defined function
-(@pxref{Controlling Array Traversal}).
-
-@quotation NOTE
-In all cases, the sorted element values consist of the original
-array's element values. The ability to control comparison merely
-affects the way in which they are sorted.
-@end quotation
-
Often, what's needed is to sort on the values of the @emph{indices}
-instead of the values of the elements.
-To do that, use the
-@code{asorti()} function. The interface is identical to that of
-@code{asort()}, except that the index values are used for sorting, and
-become the values of the result array:
+instead of the values of the elements. To do that, use the
+@code{asorti()} function. The interface and behavior are identical to
+those of @code{asort()}, except that the index values are used for sorting,
+and become the values of the result array:
@example
@{ source[$0] = some_func($0) @}
@@ -24445,23 +24531,35 @@ END @{
@}
@end example
-Similar to @code{asort()},
-in all cases, the sorted element values consist of the original
-array's indices. The ability to control comparison merely
-affects the way in which they are sorted.
+So far, so good. Now it starts to get interesting. Both @code{asort()}
+and @code{asorti()} accept a third string argument to control comparison
+of array elements. In @ref{String Functions}, we ignored this third
+argument; however, the time has now come to describe how this argument
+affects these two functions.
+
+Basically, the third argument specifies how the array is to be sorted.
+There are two possibilities. As with @code{PROCINFO["sorted_in"]},
+this argument may be one of the predefined names that @command{gawk}
+provides (@pxref{Controlling Scanning}), or it may be the name of a
+user-defined function (@pxref{Controlling Array Traversal}).
+
+In the latter case, @emph{the function can compare elements in any way
+it chooses}, taking into account just the indices, just the values,
+or both. This is extremely powerful.
-Sorting the array by replacing the indices provides maximal flexibility.
-To traverse the elements in decreasing order, use a loop that goes from
-@var{n} down to 1, either over the elements or over the indices.@footnote{You
-may also use one of the predefined sorting names that sorts in
-decreasing order.}
+Once the array is sorted, @code{asort()} takes the @emph{values} in
+their final order, and uses them to fill in the result array, whereas
+@code{asorti()} takes the @emph{indices} in their final order, and uses
+them to fill in the result array.
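
To make the difference concrete, here is a small sketch that calls both
functions with predefined sorting names. It assumes a @command{gawk} recent
enough to accept the third argument and is skipped if @command{gawk} is not
installed; the array contents and the shell variable @code{asort_demo} are
invented for illustration.

```shell
if command -v gawk >/dev/null 2>&1; then
    asort_demo=$(gawk 'BEGIN {
        data["x"] = 30; data["y"] = 10; data["z"] = 20

        # asort(): result holds the VALUES, here in descending numeric order.
        n = asort(data, by_value, "@val_num_desc")
        for (i = 1; i <= n; i++)
            printf "%s%s", by_value[i], (i < n ? " " : "\n")

        # asorti(): result holds the INDICES, here in descending string order.
        n = asorti(data, by_index, "@ind_str_desc")
        for (i = 1; i <= n; i++)
            printf "%s%s", by_index[i], (i < n ? " " : "\n")
    }')
    printf '%s\n' "$asort_demo"
else
    asort_demo="gawk not available"
fi
```

In both calls the source array @code{data} is left untouched because a
separate destination array is supplied.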
@cindex reference counting, sorting arrays
+@quotation NOTE
Copying array indices and elements isn't expensive in terms of memory.
Internally, @command{gawk} maintains @dfn{reference counts} to data.
For example, when @code{asort()} copies the first array to the second one,
there is only one copy of the original array elements' data, even though
both arrays use the values.
+@end quotation
@c Document It And Call It A Feature. Sigh.
@cindex @command{gawk}, @code{IGNORECASE} variable in
@@ -24687,10 +24785,10 @@ another process on another system across an IP network connection.
You can think of this as just a @emph{very long} two-way pipeline to
a coprocess.
The way @command{gawk} decides that you want to use TCP/IP networking is
-by recognizing special @value{FN}s that begin with one of @samp{/inet/},
+by recognizing special file names that begin with one of @samp{/inet/},
@samp{/inet4/}, or @samp{/inet6/}.
-The full syntax of the special @value{FN} is
+The full syntax of the special file name is
@file{/@var{net-type}/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}.
The components are:
@@ -25059,8 +25157,8 @@ the case of the @code{INT} signal, @command{gawk} exits. This is
because these systems don't support the @command{kill} command, so the
only signals you can deliver to a program are those generated by the
keyboard. The @code{INT} signal is generated by the
-@kbd{@value{CTL}-@key{C}} or @kbd{@value{CTL}-@key{BREAK}} key, while the
-@code{QUIT} signal is generated by the @kbd{@value{CTL}-@key{\}} key.
+@kbd{Ctrl-@key{C}} or @kbd{Ctrl-@key{BREAK}} key, while the
+@code{QUIT} signal is generated by the @kbd{Ctrl-@key{\}} key.
Finally, @command{gawk} also accepts another option, @option{--pretty-print}.
When called this way, @command{gawk} ``pretty prints'' the program into
@@ -25852,7 +25950,7 @@ complete detail in
@cite{GNU gettext tools}.)
@end ifnotinfo
As of this writing, the latest version of GNU @code{gettext} is
-@uref{ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.2.1.tar.gz, @value{PVERSION} 0.18.2.1}.
+@uref{ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.2.1.tar.gz, version 0.18.2.1}.
If a translation of @command{gawk}'s messages exists,
then @command{gawk} produces usage messages, warnings,
@@ -26732,7 +26830,7 @@ functions which called the one you are in. The commands for doing this are:
Print a backtrace of all function calls (stack frames), or innermost @var{count}
frames if @var{count} > 0. Print the outermost @var{count} frames if
@var{count} < 0. The backtrace displays the name and arguments to each
-function, the source @value{FN}, and the line number.
+function, the source file name, and the line number.
@cindex debugger commands, @code{down}
@cindex @code{down} debugger command
@@ -26865,7 +26963,7 @@ Turn instruction tracing on or off. The default is @code{off}.
@end table
@item @code{save} @var{filename}
-Save the commands from the current session to the given @value{FN},
+Save the commands from the current session to the given file name,
so that they can be replayed using the @command{source} command.
@item @code{source} @var{filename}
@@ -27033,8 +27131,8 @@ features. The following types of completion are available:
@item Command completion
Command names.
-@item Source @value{FN} completion
-Source @value{FN}s. Relevant commands are
+@item Source file name completion
+Source file names. Relevant commands are
@code{break},
@code{clear},
@code{list},
@@ -27122,11 +27220,11 @@ to believe. Novice computer users solve this problem by implicitly trusting
in the computer as an infallible authority; they tend to believe that all
digits of a printed answer are significant. Disillusioned computer users have
just the opposite approach; they are constantly afraid that their answers
-are almost meaningless.}@*
-Donald Knuth@footnote{Donald E.@: Knuth.
+are almost meaningless.}@footnote{Donald E.@: Knuth.
@cite{The Art of Computer Programming}. Volume 2,
@cite{Seminumerical Algorithms}, third edition,
1998, ISBN 0-201-89683-4, p.@: 229.}
+@author Donald Knuth
@end quotation
This @value{CHAPTER} discusses issues that you may encounter
@@ -27264,7 +27362,7 @@ This makes it clear that the full numeric value is different from
what the default string representations show.
@code{CONVFMT}'s default value is @code{"%.6g"}, which yields a value with
-at least six significant digits. For some applications, you might want to
+at most six significant digits. For some applications, you might want to
change it to specify more precision.
On most modern machines, most of the time,
17 digits is enough to capture a floating-point number's
@@ -27293,7 +27391,7 @@ $ @kbd{awk '@{ printf("%010d\n", $1 * 100) @}'}
@print{} 0000051580
515.82
@print{} 0000051582
-@kbd{@value{CTL}-d}
+@kbd{Ctrl-d}
@end example
@noindent
@@ -28133,11 +28231,10 @@ floating-point format to a precision lower than working precision.
Do we promote them to full membership of the high-precision club,
or do we treat them and all their associates as second-class citizens?
Sometimes the first course is proper, sometimes the second, and it takes
-careful analysis to tell which.}
-
-Dirk Laurie@footnote{Dirk Laurie.
+careful analysis to tell which.}@footnote{Dirk Laurie.
@cite{Variable-precision Arithmetic Considered Perilous --- A Detective Story}.
Electronic Transactions on Numerical Analysis. Volume 28, pp. 168-173, 2008.}
+@author Dirk Laurie
@end quotation
@command{gawk} does not implicitly modify the precision of any previously
@@ -28675,12 +28772,12 @@ the macros as if they were functions.
@subsection General Purpose Data Types
@quotation
-@i{I have a true love/hate relationship with unions.}@*
-Arnold Robbins
+@i{I have a true love/hate relationship with unions.}
+@author Arnold Robbins
@i{That's the thing about unions: the compiler will arrange things so they
-can accommodate both love and hate.}@*
-Chet Ramey
+can accommodate both love and hate.}
+@author Chet Ramey
@end quotation
The extension API defines a number of simple types and structures for general
@@ -30613,8 +30710,8 @@ path with a list of directories to search for compiled extensions.
@section Example: Some File Functions
@quotation
-@i{No matter where you go, there you are.} @*
-Buckaroo Bonzai
+@i{No matter where you go, there you are.}
+@author Buckaroo Bonzai
@end quotation
@c It's enough to show chdir and stat, no need for fts
@@ -31397,7 +31494,7 @@ Return zero if there were no errors, otherwise return @minus{}1.
The @code{fts()} function provides a hook to the C library @code{fts()}
routines for traversing file hierarchies. Instead of returning data
-about one file at a time in a stream, it fills in a multi-dimensional
+about one file at a time in a stream, it fills in a multidimensional
array with data about each file and directory encountered in the requested
hierarchies.
@@ -31498,7 +31595,7 @@ be more comfortable to use from an @command{awk} program. This includes the
lack of a comparison function, since @command{gawk} already provides
powerful array sorting facilities. While an @code{fts_read()}-like
interface could have been provided, this felt less natural than simply
-creating a multi-dimensional array to represent the file hierarchy and
+creating a multidimensional array to represent the file hierarchy and
its information.
@end quotation
@@ -32156,7 +32253,7 @@ Multiple @code{BEGIN} and @code{END} rules
@item
Multidimensional arrays
-(@pxref{Multi-dimensional}).
+(@pxref{Multidimensional}).
@end itemize
@c ENDOFRANGE gawkv1
@@ -32363,7 +32460,7 @@ Special files in I/O redirections:
@itemize @minus{}
@item
The @file{/dev/stdin}, @file{/dev/stdout}, @file{/dev/stderr} and
-@file{/dev/fd/@var{N}} special @value{FN}s
+@file{/dev/fd/@var{N}} special file names
(@pxref{Special Files}).
@item
@@ -32587,7 +32684,7 @@ long options
@item
Support for the following obsolete systems was removed from the code
-and the documentation for @command{gawk} @value{PVERSION} 4.0:
+and the documentation for @command{gawk} version 4.0:
@c nested table
@itemize @minus
@@ -32770,8 +32867,8 @@ cases: the default regexp matching; with @option{--traditional}, and with
@appendixsec Major Contributors to @command{gawk}
@cindex @command{gawk}, list of contributors to
@quotation
-@i{Always give credit where credit is due.}@*
-Anonymous
+@i{Always give credit where credit is due.}
+@author Anonymous
@end quotation
This @value{SECTION} names the major contributors to @command{gawk}
@@ -32968,6 +33065,10 @@ The modifications to convert @command{gawk}
into a byte-code interpreter, including the debugger.
@item
+The addition of true multidimensional arrays.
+@xref{Arrays of Arrays}.
+
+@item
The additional modifications for support of arbitrary precision arithmetic.
@item
@@ -32980,6 +33081,10 @@ into one, for the 4.1 release.
@item
Improved array internals for arrays indexed by integers.
+
+@item
+The improved array sorting features were driven by John together
+with Pat Rankin.
@end itemize
@item
@@ -33101,7 +33206,7 @@ Extracting the archive
creates a directory named @file{gawk-@value{VERSION}.@value{PATCHLEVEL}}
in the current directory.
-The distribution @value{FN} is of the form
+The distribution file name is of the form
@file{gawk-@var{V}.@var{R}.@var{P}.tar.gz}.
The @var{V} represents the major version of @command{gawk},
the @var{R} represents the current release of version @var{V}, and
@@ -33133,6 +33238,13 @@ The actual @command{gawk} source code.
@end table
@table @file
+@item ABOUT-NLS
+Information about GNU @command{gettext} and translations.
+
+@item AUTHORS
+A file with some information about the authorship of @command{gawk}.
+It exists only to satisfy the pedants at the Free Software Foundation.
+
@item README
@itemx README_d/README.*
Descriptive files: @file{README} for @command{gawk} under Unix and the
@@ -33156,16 +33268,6 @@ An older list of changes to @command{gawk}.
@item COPYING
The GNU General Public License.
-@item FUTURES
-A brief list of features and changes being contemplated for future
-releases, with some indication of the time frame for the feature, based
-on its difficulty.
-
-@item LIMITATIONS
-A list of those factors that limit @command{gawk}'s performance.
-Most of these depend on the hardware or operating system software and
-are not limits in @command{gawk} itself.
-
@item POSIX.STD
A description of behaviors in the POSIX standard for @command{awk} which
are left undefined, or where @command{gawk} may not comply fully, as well
@@ -33198,12 +33300,19 @@ The @command{troff} source for a manual page describing @command{gawk}.
This is distributed for the convenience of Unix users.
@cindex Texinfo
-@item doc/gawk.texi
+@item doc/gawktexi.in
+@itemx doc/sidebar.awk
The Texinfo source file for this @value{DOCUMENT}.
-It should be processed with @TeX{}
-(via @command{texi2dvi} or @command{texi2pdf})
+It should be processed by @file{doc/sidebar.awk}
+before processing with @command{texi2dvi} or @command{texi2pdf}
to produce a printed document, and
with @command{makeinfo} to produce an Info or HTML file.
+The @file{Makefile} takes care of this processing and produces
+printable output via @command{texi2dvi} or @command{texi2pdf}.
+
+@item doc/gawk.texi
+The file produced after processing @file{gawktexi.in}
+with @file{sidebar.awk}.
@item doc/gawk.info
The generated Info file for this @value{DOCUMENT}.
@@ -33242,15 +33351,21 @@ the @file{Makefile.in} files used by @command{autoconf} and
@item Makefile.in
@itemx aclocal.m4
+@itemx bisonfix.awk
+@itemx config.guess
@itemx configh.in
@itemx configure.ac
@itemx configure
@itemx custom.h
+@itemx depcomp
+@itemx install-sh
@itemx missing_d/*
+@itemx mkinstalldirs
@itemx m4/*
-These files and subdirectories are used when configuring @command{gawk}
-for various Unix systems. They are explained in
-@ref{Unix Installation}.
+These files and subdirectories are used when configuring and compiling
+@command{gawk} for various Unix systems. Most of them are explained
+in @ref{Unix Installation}. The rest are there to support the main
+infrastructure.
@item po/*
The @file{po} directory contains message translations.
@@ -33394,6 +33509,14 @@ command line when compiling @command{gawk} from scratch, including:
@table @code
+@cindex @code{--disable-extensions} configuration option
+@cindex configuration option, @code{--disable-extensions}
+@item --disable-extensions
+Disable configuring and building the sample extensions in the
+@file{extension} directory. This is useful for cross-compiling.
+The default action is to dynamically check if the extensions
+can be configured and compiled.
+
@cindex @code{--disable-lint} configuration option
@cindex configuration option, @code{--disable-lint}
@item --disable-lint
@@ -33953,7 +34076,7 @@ provides information about both the @command{gawk} implementation and the
The logical name @samp{AWK_LIBRARY} can designate a default location
for @command{awk} program files. For the @option{-f} option, if the specified
-@value{FN} has no device or directory path information in it, @command{gawk}
+file name has no device or directory path information in it, @command{gawk}
looks in the current directory first, then in the directory specified
by the translation of @samp{AWK_LIBRARY} if the file is not found.
If, after searching in both directories, the file still is not found,
@@ -33986,7 +34109,7 @@ One side effect of dual command-line parsing is that if there is only a
single parameter (as in the quoted string program above), the command
becomes ambiguous. To work around this, the normally optional @option{--}
flag is required to force Unix-style parsing rather than @code{DCL} parsing. If any
-other dash-type options (or multiple parameters such as @value{DF}s to
+other dash-type options (or multiple parameters such as data files to
process) are present, there is no ambiguity and @option{--} can be omitted.
@c @cindex directory search
@@ -34047,7 +34170,7 @@ define a symbol, as follows:
$ @kbd{gawk :== $sys$common:[syshlp.examples.tcpip.snmp]gawk.exe}
@end example
-This is apparently @value{PVERSION} 2.15.6, which is extremely old. We
+This is apparently version 2.15.6, which is extremely old. We
recommend compiling and using the current version.
@c ENDOFRANGE opgawx
@@ -34057,8 +34180,8 @@ recommend compiling and using the current version.
@appendixsec Reporting Problems and Bugs
@cindex archeologists
@quotation
-@i{There is nothing more dangerous than a bored archeologist.}@*
-The Hitchhiker's Guide to the Galaxy
+@i{There is nothing more dangerous than a bored archeologist.}
+@author The Hitchhiker's Guide to the Galaxy
@end quotation
@c the radio show, not the book. :-)
@@ -34076,8 +34199,8 @@ what you're trying to do. If it's not clear whether you should be able
to do something or not, report that too; it's a bug in the documentation!
Before reporting a bug or trying to fix it yourself, try to isolate it
-to the smallest possible @command{awk} program and input @value{DF} that
-reproduces the problem. Then send us the program and @value{DF},
+to the smallest possible @command{awk} program and input data file that
+reproduces the problem. Then send us the program and data file,
some idea of what kind of Unix system you're using,
the compiler you used to compile @command{gawk}, and the exact results
@command{gawk} gave you. Also say what you expected to occur; this helps
@@ -34174,8 +34297,8 @@ Date: Wed, 4 Sep 1996 08:11:48 -0700 (PDT)
@cindex Brennan, Michael
@quotation
@i{It's kind of fun to put comments like this in your awk code.}@*
-@ @ @ @ @ @ @code{// Do C++ comments work? answer: yes! of course}@*
-Michael Brennan
+@ @ @ @ @ @ @code{// Do C++ comments work? answer: yes! of course}
+@author Michael Brennan
@end quotation
There are a number of other freely available @command{awk} implementations.
@@ -34217,10 +34340,8 @@ repository in a directory named @file{bwkawk}. If you leave that argument
off the @command{git} command line, the repository copy is created in a
directory named @file{awk}.
-This version requires an ISO C (1990 standard) compiler;
-the C compiler from
-GCC (the GNU Compiler Collection)
-works quite nicely.
+This version requires an ISO C (1990 standard) compiler; the C compiler
+from GCC (the GNU Compiler Collection) works quite nicely.
@xref{Common Extensions},
for a list of extensions in this @command{awk} that are not in POSIX @command{awk}.
@@ -34301,15 +34422,22 @@ information, see the @uref{http://busybox.net, project's home page}.
@cindex source code, Solaris @command{awk}
@item The OpenSolaris POSIX @command{awk}
The version of @command{awk} in @file{/usr/xpg4/bin} on Solaris is
-more-or-less
-POSIX-compliant. It is based on the @command{awk} from Mortice Kern
-Systems for PCs. The source code can be downloaded from
-the @uref{http://www.opensolaris.org, OpenSolaris web site}.
+more-or-less POSIX-compliant. It is based on the @command{awk} from
+Mortice Kern Systems for PCs.
This author was able to make it compile and work under GNU/Linux
with 1--2 hours of work. Making it more generally portable (using
GNU Autoconf and/or Automake) would take more work, and this
has not been done, at least to our knowledge.
+@cindex Illumos
+@cindex Illumos, POSIX-compliant @command{awk}
+@cindex source code, Illumos @command{awk}
+The source code used to be available from the OpenSolaris web site.
+However, that project was ended and the web site shut down. Fortunately, the
+@uref{http://wiki.illumos.org/display/illumos/illumos+Home, Illumos project}
+makes this implementation available. You can view the files one at a time from
+@uref{https://github.com/joyent/illumos-joyent/blob/master/usr/src/cmd/awk_xpg4}.
+
@cindex @command{jawk}
@cindex Java implementation of @command{awk}
@cindex source code, @command{jawk}
@@ -34350,6 +34478,10 @@ under the GPL. It has a large number of extensions over standard
See @uref{http://www.quiktrim.org/QTawk.html} for more information,
including the manual and a download link.
+@item Other Versions
+See also the @uref{http://en.wikipedia.org/wiki/Awk_language#Versions_and_implementations,
+Wikipedia article} for information on additional versions.
+
@end table
@c ENDOFRANGE gligawk
@c ENDOFRANGE ingawk
@@ -34960,11 +35092,11 @@ Larry
@cindex Wall, Larry
@cindex Robbins, Arnold
@quotation
-@i{AWK is a language similar to PERL, only considerably more elegant.}@*
-Arnold Robbins
+@i{AWK is a language similar to PERL, only considerably more elegant.}
+@author Arnold Robbins
-@i{Hey!}@*
-Larry Wall
+@i{Hey!}
+@author Larry Wall
@end quotation
The @file{TODO} file in the @command{gawk} Git repository lists possible
@@ -35096,7 +35228,7 @@ in order to loop over all the element in an easy fashion for C code.
@item
The ability to create arrays (including @command{gawk}'s true
-multi-dimensional arrays).
+multidimensional arrays).
@end itemize
@end itemize
@@ -35229,11 +35361,11 @@ to any of the above.
@ref{Dynamic Extensions}, describes the supported API and mechanisms
for writing extensions for @command{gawk}. This API was introduced
-in @value{PVERSION} 4.1. However, for many years @command{gawk}
+in version 4.1. However, for many years @command{gawk}
provided an extension mechanism that required knowledge of @command{gawk}
internals and that was not as well designed.
-In order to provide a transition period, @command{gawk} @value{PVERSION}
+In order to provide a transition period, @command{gawk} version
4.1 continues to support the original extension mechanism.
This will be true for the life of exactly one major release. This support
will be withdrawn, and removed from the source code, at the next major
@@ -36201,7 +36333,7 @@ numeric values. It is the C type @code{float}.
The character generated by hitting the space bar on the keyboard.
@item Special File
-A @value{FN} interpreted internally by @command{gawk}, instead of being handed
+A file name interpreted internally by @command{gawk}, instead of being handed
directly to the underlying operating system---for example, @file{/dev/stderr}.
(@xref{Special Files}.)
@@ -37582,6 +37714,7 @@ Consistency issues:
Use MS-Windows not MS Windows
Use MS-DOS not MS DOS
Use an empty set of parentheses after built-in and awk function names.
+ Use "multiFOO" without a hyphen.
Date: Wed, 13 Apr 94 15:20:52 -0400
From: rms@gnu.org (Richard Stallman)