Diffstat (limited to 'doc')
-rw-r--r--  doc/ChangeLog      4
-rw-r--r--  doc/gawk.info   1291
-rw-r--r--  doc/gawk.texi    232
-rw-r--r--  doc/gawktexi.in  215
4 files changed, 890 insertions, 852 deletions
diff --git a/doc/ChangeLog b/doc/ChangeLog
index d6e825f2..b1c3d0d3 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2015-01-23 Arnold D. Robbins <arnold@skeeve.com>
+
+ * gawktexi.in: O'Reilly fixes.
+
2015-01-21 Arnold D. Robbins <arnold@skeeve.com>
* gawktexi.in: O'Reilly fixes.
diff --git a/doc/gawk.info b/doc/gawk.info
index cf827935..468b5f72 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -3803,8 +3803,9 @@ Collating symbols
Equivalence classes
Locale-specific names for a list of characters that are equal. The
name is enclosed between `[=' and `=]'. For example, the name `e'
- might be used to represent all of "e," "e`," and "e'." In this
- case, `[[=e=]]' is a regexp that matches any of `e', `e'', or `e`'.
+ might be used to represent all of "e," "e^," "e`," and "e'." In
+ this case, `[[=e=]]' is a regexp that matches any of `e', `e^',
+ `e'', or `e`'.
These features are very valuable in non-English-speaking locales.
@@ -3826,7 +3827,7 @@ Consider the following:
This example uses the `sub()' function to make a change to the input
record. (`sub()' replaces the first instance of any text matched by
the first argument with the string provided as the second argument;
-*note String Functions::). Here, the regexp `/a+/' indicates "one or
+*note String Functions::.) Here, the regexp `/a+/' indicates "one or
more `a' characters," and the replacement text is `<A>'.
The input contains four `a' characters. `awk' (and POSIX) regular
@@ -3863,15 +3864,16 @@ regexp":
This sets `digits_regexp' to a regexp that describes one or more digits,
and tests whether the input record matches this regexp.
- NOTE: When using the `~' and `!~' operators, there is a difference
- between a regexp constant enclosed in slashes and a string
- constant enclosed in double quotes. If you are going to use a
- string constant, you have to understand that the string is, in
- essence, scanned _twice_: the first time when `awk' reads your
+ NOTE: When using the `~' and `!~' operators, be aware that there
+ is a difference between a regexp constant enclosed in slashes and
+ a string constant enclosed in double quotes. If you are going to
+ use a string constant, you have to understand that the string is,
+ in essence, scanned _twice_: the first time when `awk' reads your
program, and the second time when it goes to match the string on
the lefthand side of the operator with the pattern on the right.
This is true of any string-valued expression (such as
- `digits_regexp', shown previously), not just string constants.
+ `digits_regexp', shown in the previous example), not just string
+ constants.
What difference does it make if the string is scanned twice? The
answer has to do with escape sequences, and particularly with
@@ -3968,7 +3970,7 @@ letters, digits, or underscores (`_'):
`\B'
Matches the empty string that occurs between two word-constituent
- characters. For example, `/\Brat\B/' matches `crate' but it does
+ characters. For example, `/\Brat\B/' matches `crate', but it does
not match `dirty rat'. `\B' is essentially the opposite of `\y'.
There are two other operators that work on buffers. In Emacs, a
@@ -3977,10 +3979,10 @@ letters, digits, or underscores (`_'):
operators are:
`\`'
- Matches the empty string at the beginning of a buffer (string).
+ Matches the empty string at the beginning of a buffer (string)
`\''
- Matches the empty string at the end of a buffer (string).
+ Matches the empty string at the end of a buffer (string)
Because `^' and `$' always work in terms of the beginning and end of
strings, these operators don't add any new capabilities for `awk'.
@@ -4151,7 +4153,7 @@ one line. Each record is automatically split into chunks called
parts of a record.
On rare occasions, you may need to use the `getline' command. The
-`getline' command is valuable, both because it can do explicit input
+`getline' command is valuable both because it can do explicit input
from any number of files, and because the files used with it do not
have to be named on the `awk' command line (*note Getline::).
@@ -4200,8 +4202,8 @@ File: gawk.info, Node: awk split records, Next: gawk split records, Up: Recor
Records are separated by a character called the "record separator". By
default, the record separator is the newline character. This is why
-records are, by default, single lines. A different character can be
-used for the record separator by assigning the character to the
+records are, by default, single lines. To use a different character
+for the record separator, simply assign that character to the
predefined variable `RS'.
Like any other variable, the value of `RS' can be changed in the
@@ -4216,14 +4218,14 @@ BEGIN/END::). For example:
awk 'BEGIN { RS = "u" }
{ print $0 }' mail-list
-changes the value of `RS' to `u', before reading any input. This is a
-string whose first character is the letter "u"; as a result, records
-are separated by the letter "u." Then the input file is read, and the
-second rule in the `awk' program (the action with no pattern) prints
-each record. Because each `print' statement adds a newline at the end
-of its output, this `awk' program copies the input with each `u'
-changed to a newline. Here are the results of running the program on
-`mail-list':
+changes the value of `RS' to `u', before reading any input. The new
+value is a string whose first character is the letter "u"; as a result,
+records are separated by the letter "u". Then the input file is read,
+and the second rule in the `awk' program (the action with no pattern)
+prints each record. Because each `print' statement adds a newline at
+the end of its output, this `awk' program copies the input with each
+`u' changed to a newline. Here are the results of running the program
+on `mail-list':
$ awk 'BEGIN { RS = "u" }
> { print $0 }' mail-list
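The hunk above rewords the explanation of assigning a new record separator. A minimal runnable illustration of the same behavior (the input string here is invented for the example, not taken from the patch):

```shell
# Split records on the letter "u" instead of the default newline.
# Every "u" in the input becomes a record boundary, and print
# adds a newline after each record.
printf 'record1urecord2urecord3' |
awk 'BEGIN { RS = "u" } { print $0 }'
# prints record1, record2, and record3 on separate lines
```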
@@ -4271,11 +4273,11 @@ data file (*note Sample Data Files::), the line looks like this:
Bill 555-1675 bill.drowning@hotmail.com A
-It contains no `u' so there is no reason to split the record, unlike
-the others which have one or more occurrences of the `u'. In fact,
-this record is treated as part of the previous record; the newline
-separating them in the output is the original newline in the data file,
-not the one added by `awk' when it printed the record!
+It contains no `u', so there is no reason to split the record, unlike
+the others, which each have one or more occurrences of the `u'. In
+fact, this record is treated as part of the previous record; the
+newline separating them in the output is the original newline in the
+data file, not the one added by `awk' when it printed the record!
Another way to change the record separator is on the command line,
using the variable-assignment feature (*note Other Arguments::):
@@ -4341,8 +4343,8 @@ part of either record.
character. However, when `RS' is a regular expression, `RT' contains
the actual input text that matched the regular expression.
- If the input file ended without any text that matches `RS', `gawk'
-sets `RT' to the null string.
+ If the input file ends without any text matching `RS', `gawk' sets
+`RT' to the null string.
The following example illustrates both of these features. It sets
`RS' equal to a regular expression that matches either a newline or a
@@ -4440,12 +4442,12 @@ to these pieces of the record. You don't have to use them--you can
operate on the whole record if you want--but fields are what make
simple `awk' programs so powerful.
- You use a dollar-sign (`$') to refer to a field in an `awk' program,
+ You use a dollar sign (`$') to refer to a field in an `awk' program,
followed by the number of the field you want. Thus, `$1' refers to the
-first field, `$2' to the second, and so on. (Unlike the Unix shells,
-the field numbers are not limited to single digits. `$127' is the
-127th field in the record.) For example, suppose the following is a
-line of input:
+first field, `$2' to the second, and so on. (Unlike in the Unix
+shells, the field numbers are not limited to single digits. `$127' is
+the 127th field in the record.) For example, suppose the following is
+a line of input:
This seems like a pretty nice example.
@@ -4462,10 +4464,9 @@ as `$7', which is `example.'. If you try to reference a field beyond
the last one (such as `$8' when the record has only seven fields), you
get the empty string. (If used in a numeric operation, you get zero.)
- The use of `$0', which looks like a reference to the "zero-th"
-field, is a special case: it represents the whole input record. Use it
-when you are not interested in specific fields. Here are some more
-examples:
+ The use of `$0', which looks like a reference to the "zeroth" field,
+is a special case: it represents the whole input record. Use it when
+you are not interested in specific fields. Here are some more examples:
$ awk '$1 ~ /li/ { print $0 }' mail-list
-| Amelia 555-5553 amelia.zodiacusque@gmail.com F
@@ -4513,8 +4514,8 @@ is another example of using expressions as field numbers:
awk '{ print $(2*2) }' mail-list
`awk' evaluates the expression `(2*2)' and uses its value as the
-number of the field to print. The `*' sign represents multiplication,
-so the expression `2*2' evaluates to four. The parentheses are used so
+number of the field to print. The `*' represents multiplication, so
+the expression `2*2' evaluates to four. The parentheses are used so
that the multiplication is done before the `$' operation; they are
necessary whenever there is a binary operator(1) in the field-number
expression. This example, then, prints the type of relationship (the
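The expression-as-field-number behavior described in this hunk can be checked directly (sample input made up for the sketch):

```shell
# awk evaluates (2*2) first, then fetches that field, so this
# prints the fourth field of the input line.
echo 'a b c d e' | awk '{ print $(2*2) }'
# prints: d
```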
@@ -4538,7 +4539,7 @@ field number.
---------- Footnotes ----------
(1) A "binary operator", such as `*' for multiplication, is one that
-takes two operands. The distinction is required, because `awk' also has
+takes two operands. The distinction is required because `awk' also has
unary (one-operand) and ternary (three-operand) operators.

@@ -4660,7 +4661,7 @@ value of `NF' and recomputes `$0'. (d.c.) Here is an example:
decremented.
Finally, there are times when it is convenient to force `awk' to
-rebuild the entire record, using the current value of the fields and
+rebuild the entire record, using the current values of the fields and
`OFS'. To do this, use the seemingly innocuous assignment:
$1 = $1 # force record to be reconstituted
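The `$1 = $1` idiom mentioned above can be seen in action; this sketch (input invented) contrasts a plain `print` with one after the record is rebuilt using `OFS`:

```shell
# Without touching any field, $0 keeps its original separators.
echo 'a:b:c' | awk 'BEGIN { FS = ":"; OFS = "-" } { print }'
# prints: a:b:c

# Assigning a field (even to itself) forces awk to rebuild $0
# from the fields, joined with OFS.
echo 'a:b:c' | awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }'
# prints: a-b-c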
@@ -4680,7 +4681,7 @@ built-in function that updates `$0', such as `sub()' and `gsub()'
It is important to remember that `$0' is the _full_ record, exactly
as it was read from the input. This includes any leading or trailing
whitespace, and the exact whitespace (or other characters) that
-separate the fields.
+separates the fields.
It is a common error to try to change the field separators in a
record simply by setting `FS' and `OFS', and then expecting a plain
@@ -4748,7 +4749,7 @@ attached, such as:
John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
-The same program would extract `*LXIX', instead of `*29*Oak*St.'. If
+The same program would extract `*LXIX' instead of `*29*Oak*St.'. If
you were expecting the program to print the address, you would be
surprised. The moral is to choose your data layout and separator
characters carefully to prevent such problems. (If the data is not in
@@ -4947,11 +4948,11 @@ your field and record separators.
Perhaps the most common use of a single character as the field
separator occurs when processing the Unix system password file. On
many Unix systems, each user has a separate entry in the system
-password file, one line per user. The information in these lines is
-separated by colons. The first field is the user's login name and the
-second is the user's encrypted or shadow password. (A shadow password
-is indicated by the presence of a single `x' in the second field.) A
-password file entry might look like this:
+password file, with one line per user. The information in these lines
+is separated by colons. The first field is the user's login name and
+the second is the user's encrypted or shadow password. (A shadow
+password is indicated by the presence of a single `x' in the second
+field.) A password file entry might look like this:
arnold:x:2076:10:Arnold Robbins:/home/arnold:/bin/bash
@@ -4979,15 +4980,14 @@ When you do this, `$1' is the same as `$0'.
According to the POSIX standard, `awk' is supposed to behave as if
each record is split into fields at the time it is read. In
particular, this means that if you change the value of `FS' after a
-record is read, the value of the fields (i.e., how they were split)
+record is read, the values of the fields (i.e., how they were split)
should reflect the old value of `FS', not the new one.
However, many older implementations of `awk' do not work this way.
Instead, they defer splitting the fields until a field is actually
referenced. The fields are split using the _current_ value of `FS'!
(d.c.) This behavior can be difficult to diagnose. The following
-example illustrates the difference between the two methods. (The
-`sed'(2) command prints just the first line of `/etc/passwd'.)
+example illustrates the difference between the two methods:
sed 1q /etc/passwd | awk '{ FS = ":" ; print $1 }'
@@ -5000,6 +5000,8 @@ first line of the file, something like:
root:x:0:0:Root:/:
+ (The `sed'(2) command prints just the first line of `/etc/passwd'.)
+
---------- Footnotes ----------
(1) Thanks to Andrew Schorr for this tip.
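Since, as the hunk notes, changing `FS` after a record has been read behaves differently across implementations, the portable fix is to set `FS` before any input is read. A sketch using a made-up password-file line rather than the real `/etc/passwd`:

```shell
# Setting FS in a BEGIN rule guarantees the first record is
# already split on colons, in every awk implementation.
printf 'root:x:0:0:Root:/:\n' | awk 'BEGIN { FS = ":" } { print $1 }'
# prints: root
```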
@@ -5153,7 +5155,7 @@ run on a system with card readers is another story!)
splitting again. Use `FS = FS' to make this happen, without having to
know the current value of `FS'. In order to tell which kind of field
splitting is in effect, use `PROCINFO["FS"]' (*note Auto-set::). The
-value is `"FS"' if regular field splitting is being used, or it is
+value is `"FS"' if regular field splitting is being used, or
`"FIELDWIDTHS"' if fixed-width field splitting is being used:
if (PROCINFO["FS"] == "FS")
@@ -5186,10 +5188,10 @@ what they are, and not by what they are not.
The most notorious such case is so-called "comma-separated values"
(CSV) data. Many spreadsheet programs, for example, can export their
data into text files, where each record is terminated with a newline,
-and fields are separated by commas. If only commas separated the data,
+and fields are separated by commas. If commas only separated the data,
there wouldn't be an issue. The problem comes when one of the fields
contains an _embedded_ comma. In such cases, most programs embed the
-field in double quotes.(1) So we might have data like this:
+field in double quotes.(1) So, we might have data like this:
Robbins,Arnold,"1234 A Pretty Street, NE",MyTown,MyState,12345-6789,USA
@@ -5256,9 +5258,9 @@ being used.
provides an elegant solution for the majority of cases, and the
`gawk' developers are satisfied with that.
- As written, the regexp used for `FPAT' requires that each field have
-a least one character. A straightforward modification (changing
-changed the first `+' to `*') allows fields to be empty:
+ As written, the regexp used for `FPAT' requires that each field
+contain at least one character. A straightforward modification
+(changing the first `+' to `*') allows fields to be empty:
FPAT = "([^,]*)|(\"[^\"]+\")"
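`FPAT` itself is a `gawk` extension, but the problem it solves, quoted CSV fields with embedded commas, can be demonstrated in any awk. This sketch (sample record adapted from the text above) shows why naive comma splitting fails:

```shell
# Splitting purely on commas breaks the quoted street address:
# the embedded comma inside "..." is treated as a separator.
echo 'Robbins,Arnold,"1234 A Pretty Street, NE",MyTown' |
awk -F, '{ print $3 }'
# prints: "1234 A Pretty Street
```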
@@ -5266,9 +5268,8 @@ changed the first `+' to `*') allows fields to be empty:
available for splitting regular strings (*note String Functions::).
To recap, `gawk' provides three independent methods to split input
-records into fields. `gawk' uses whichever mechanism was last chosen
-based on which of the three variables--`FS', `FIELDWIDTHS', and
-`FPAT'--was last assigned to.
+records into fields. The mechanism used is based on which of the three
+variables--`FS', `FIELDWIDTHS', or `FPAT'--was last assigned to.
---------- Footnotes ----------
@@ -5306,7 +5307,7 @@ empty; lines that contain only whitespace do not count.)
`"\n\n+"' to `RS'. This regexp matches the newline at the end of the
record and one or more blank lines after the record. In addition, a
regular expression always matches the longest possible sequence when
-there is a choice (*note Leftmost Longest::). So the next record
+there is a choice (*note Leftmost Longest::). So, the next record
doesn't start until the first nonblank line that follows--no matter how
many blank lines appear in a row, they are considered one record
separator.
@@ -5318,12 +5319,12 @@ last record, the final newline is removed from the record. In the
second case, this special processing is not done. (d.c.)
Now that the input is separated into records, the second step is to
-separate the fields in the record. One way to do this is to divide each
-of the lines into fields in the normal manner. This happens by default
-as the result of a special feature. When `RS' is set to the empty
-string, _and_ `FS' is set to a single character, the newline character
-_always_ acts as a field separator. This is in addition to whatever
-field separations result from `FS'.(1)
+separate the fields in the records. One way to do this is to divide
+each of the lines into fields in the normal manner. This happens by
+default as the result of a special feature. When `RS' is set to the
+empty string _and_ `FS' is set to a single character, the newline
+character _always_ acts as a field separator. This is in addition to
+whatever field separations result from `FS'.(1)
The original motivation for this special exception was probably to
provide useful behavior in the default case (i.e., `FS' is equal to
@@ -5331,17 +5332,17 @@ provide useful behavior in the default case (i.e., `FS' is equal to
newline character to separate fields, because there is no way to
prevent it. However, you can work around this by using the `split()'
function to break up the record manually (*note String Functions::).
-If you have a single character field separator, you can work around the
+If you have a single-character field separator, you can work around the
special feature in a different way, by making `FS' into a regexp for
that single character. For example, if the field separator is a
percent character, instead of `FS = "%"', use `FS = "[%]"'.
Another way to separate fields is to put each field on a separate
line: to do this, just set the variable `FS' to the string `"\n"'.
-(This single character separator matches a single newline.) A
+(This single-character separator matches a single newline.) A
practical example of a data file organized this way might be a mailing
-list, where each entry is separated by blank lines. Consider a mailing
-list in a file named `addresses', which looks like this:
+list, where blank lines separate the entries. Consider a mailing list
+in a file named `addresses', which looks like this:
Jane Doe
123 Main Street
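The multiline-record setup described in this hunk (`RS = ""` with `FS = "\n"`) can be sketched as follows; the second address entry is invented to give the example two records:

```shell
# Blank lines separate records; each line within a record is a
# field, so $1 is the name line of each address entry.
printf 'Jane Doe\n123 Main Street\n\nJohn Q. Smith\n456 Oak Ave\n' |
awk 'BEGIN { RS = ""; FS = "\n" } { print $1 }'
# prints the two names, one per line
```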
@@ -5424,7 +5425,7 @@ File: gawk.info, Node: Getline, Next: Read Timeout, Prev: Multiple Line, Up:
So far we have been getting our input data from `awk''s main input
stream--either the standard input (usually your keyboard, sometimes the
-output from another program) or from the files specified on the command
+output from another program) or the files specified on the command
line. The `awk' language has a special built-in command called
`getline' that can be used to read input under your explicit control.
@@ -5562,7 +5563,7 @@ and produces these results:
free
The `getline' command used in this way sets only the variables `NR',
-`FNR', and `RT' (and of course, VAR). The record is not split into
+`FNR', and `RT' (and, of course, VAR). The record is not split into
fields, so the values of the fields (including `$0') and the value of
`NF' do not change.
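That `getline VAR` leaves `$0` and `NF` untouched is easy to verify (two-line input invented for the sketch):

```shell
# On the first record ("a b"), read the next line into a
# variable.  NF still reflects the original record's two fields,
# while the variable holds the newly read line.
printf 'a b\nc d\n' | awk 'NR == 1 { getline line; print NF, line }'
# prints: 2 c d
```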
@@ -5572,8 +5573,8 @@ File: gawk.info, Node: Getline/File, Next: Getline/Variable/File, Prev: Getli
4.9.3 Using `getline' from a File
---------------------------------
-Use `getline < FILE' to read the next record from FILE. Here FILE is a
-string-valued expression that specifies the file name. `< FILE' is
+Use `getline < FILE' to read the next record from FILE. Here, FILE is
+a string-valued expression that specifies the file name. `< FILE' is
called a "redirection" because it directs input to come from a
different place. For example, the following program reads its input
record from the file `secondary.input' when it encounters a first field
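A self-contained sketch of `getline VAR < FILE`; the temporary file name and its contents are invented stand-ins for the `secondary.input` scenario described above:

```shell
# Create a stand-in secondary file, then read a line from it
# when the main input record matches.
tmp="${TMPDIR:-/tmp}/secondary.$$"
printf 'extra data\n' > "$tmp"
echo 'trigger' |
awk -v f="$tmp" '$1 == "trigger" { if ((getline line < f) > 0) print line }'
rm -f "$tmp"
# prints: extra data
```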
@@ -5709,8 +5710,8 @@ all `awk' implementations.
treatment of a construct like `"echo " "date" | getline'. Most
versions, including the current version, treat it at as `("echo "
"date") | getline'. (This is also how BWK `awk' behaves.) Some
- versions changed and treated it as `"echo " ("date" | getline)'.
- (This is how `mawk' behaves.) In short, _always_ use explicit
+ versions instead treat it as `"echo " ("date" | getline)'. (This
+ is how `mawk' behaves.) In short, _always_ use explicit
parentheses, and then you won't have to worry.
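Following the hunk's advice to always parenthesize, a sketch of the concatenation-plus-pipe construct (using `echo hello` instead of `date` so the output is predictable):

```shell
# Build the command by string concatenation, then pipe its
# output into getline.  The explicit parentheses make the
# grouping unambiguous across awk implementations.
awk 'BEGIN {
    cmd = "echo " "hello"        # concatenates to "echo hello"
    if ((cmd | getline line) > 0)
        print line
    close(cmd)
}'
# prints: hello
```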

@@ -5746,15 +5747,16 @@ File: gawk.info, Node: Getline/Coprocess, Next: Getline/Variable/Coprocess, P
4.9.7 Using `getline' from a Coprocess
--------------------------------------
-Input into `getline' from a pipe is a one-way operation. The command
-that is started with `COMMAND | getline' only sends data _to_ your
-`awk' program.
+Reading input into `getline' from a pipe is a one-way operation. The
+command that is started with `COMMAND | getline' only sends data _to_
+your `awk' program.
On occasion, you might want to send data to another program for
processing and then read the results back. `gawk' allows you to start
a "coprocess", with which two-way communications are possible. This is
done with the `|&' operator. Typically, you write data to the
-coprocess first and then read results back, as shown in the following:
+coprocess first and then read the results back, as shown in the
+following:
print "SOME QUERY" |& "db_server"
"db_server" |& getline
@@ -5816,7 +5818,7 @@ in mind:
files. (d.c.) (See *note BEGIN/END::; also *note Auto-set::.)
* Using `FILENAME' with `getline' (`getline < FILENAME') is likely
- to be a source for confusion. `awk' opens a separate input stream
+ to be a source of confusion. `awk' opens a separate input stream
from the current input file. However, by not using a variable,
`$0' and `NF' are still updated. If you're doing this, it's
probably by accident, and you should reconsider what it is you're
@@ -5824,15 +5826,15 @@ in mind:
* *note Getline Summary::, presents a table summarizing the
`getline' variants and which variables they can affect. It is
- worth noting that those variants which do not use redirection can
+ worth noting that those variants that do not use redirection can
cause `FILENAME' to be updated if they cause `awk' to start
reading a new input file.
* If the variable being assigned is an expression with side effects,
different versions of `awk' behave differently upon encountering
end-of-file. Some versions don't evaluate the expression; many
- versions (including `gawk') do. Here is an example, due to Duncan
- Moore:
+ versions (including `gawk') do. Here is an example, courtesy of
+ Duncan Moore:
BEGIN {
system("echo 1 > f")
@@ -5840,8 +5842,8 @@ in mind:
print c
}
- Here, the side effect is the `++c'. Is `c' incremented if end of
- file is encountered, before the element in `a' is assigned?
+ Here, the side effect is the `++c'. Is `c' incremented if
+ end-of-file is encountered before the element in `a' is assigned?
`gawk' treats `getline' like a function call, and evaluates the
expression `a[++c]' before attempting to read from `f'. However,
@@ -5885,8 +5887,8 @@ This minor node describes a feature that is specific to `gawk'.
You may specify a timeout in milliseconds for reading input from the
keyboard, a pipe, or two-way communication, including TCP/IP sockets.
-This can be done on a per input, command, or connection basis, by
-setting a special element in the `PROCINFO' array (*note Auto-set::):
+This can be done on a per-input, per-command, or per-connection basis,
+by setting a special element in the `PROCINFO' array (*note Auto-set::):
PROCINFO["input_name", "READ_TIMEOUT"] = TIMEOUT IN MILLISECONDS
@@ -5910,7 +5912,7 @@ for more than five seconds:
print $0
`gawk' terminates the read operation if input does not arrive after
-waiting for the timeout period, returns failure and sets `ERRNO' to an
+waiting for the timeout period, returns failure, and sets `ERRNO' to an
appropriate string value. A negative or zero value for the timeout is
the same as specifying no timeout at all.
@@ -5950,7 +5952,7 @@ input to arrive:
environment variable exists, `gawk' uses its value to initialize the
timeout value. The exclusive use of the environment variable to
specify timeout has the disadvantage of not being able to control it on
-a per command or connection basis.
+a per-command or per-connection basis.
`gawk' considers a timeout event to be an error even though the
attempt to read from the underlying device may succeed in a later
@@ -6018,7 +6020,7 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li
* `gawk' sets `RT' to the text matched by `RS'.
* After splitting the input into records, `awk' further splits the
- record into individual fields, named `$1', `$2', and so on. `$0'
+ records into individual fields, named `$1', `$2', and so on. `$0'
is the whole record, and `NF' indicates how many fields there are.
The default way to split fields is between whitespace characters.
@@ -6032,19 +6034,21 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li
* Field splitting is more complicated than record splitting:
- Field separator value Fields are split ... `awk' /
- `gawk'
+ Field separator value Fields are split ... `awk' /
+ `gawk'
----------------------------------------------------------------------
- `FS == " "' On runs of whitespace `awk'
- `FS == ANY SINGLE On that character `awk'
- CHARACTER'
- `FS == REGEXP' On text matching the regexp `awk'
- `FS == ""' Each individual character is `gawk'
- a separate field
- `FIELDWIDTHS == LIST OF Based on character position `gawk'
- COLUMNS'
- `FPAT == REGEXP' On the text surrounding text `gawk'
- matching the regexp
+ `FS == " "' On runs of whitespace `awk'
+ `FS == ANY SINGLE On that character `awk'
+ CHARACTER'
+ `FS == REGEXP' On text matching the `awk'
+ regexp
+ `FS == ""' Such that each individual `gawk'
+ character is a separate
+ field
+ `FIELDWIDTHS == LIST OF Based on character `gawk'
+ COLUMNS' position
+ `FPAT == REGEXP' On the text surrounding `gawk'
+ text matching the regexp
* Using `FS = "\n"' causes the entire record to be a single field
(assuming that newlines separate records).
@@ -6054,12 +6058,11 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li
* Use `PROCINFO["FS"]' to see how fields are being split.
- * Use `getline' in its various forms to read additional records,
- from the default input stream, from a file, or from a pipe or
- coprocess.
+ * Use `getline' in its various forms to read additional records from
+ the default input stream, from a file, or from a pipe or coprocess.
- * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to timeout for
- FILE.
+ * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out
+ for FILE.
* Directories on the command line are fatal for standard `awk';
`gawk' ignores them if not in POSIX mode.
@@ -6153,7 +6156,7 @@ you will probably get an error. Keep in mind that a space is printed
between any two items.
Note that the `print' statement is a statement and not an
-expression--you can't use it in the pattern part of a PATTERN-ACTION
+expression--you can't use it in the pattern part of a pattern-action
statement, for example.

@@ -6301,7 +6304,7 @@ File: gawk.info, Node: OFMT, Next: Printf, Prev: Output Separators, Up: Prin
===========================================
When printing numeric values with the `print' statement, `awk'
-internally converts the number to a string of characters and prints
+internally converts each number to a string of characters and prints
that string. `awk' uses the `sprintf()' function to do this conversion
(*note String Functions::). For now, it suffices to say that the
`sprintf()' function accepts a "format specification" that tells it how
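The number-to-string conversion this hunk describes is controlled by `OFMT` for `print`; a one-line sketch:

```shell
# print converts non-integral numbers using OFMT, so changing it
# changes how many digits appear.
awk 'BEGIN { OFMT = "%.2f"; print 3.14159 }'
# prints: 3.14
```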
@@ -6356,7 +6359,7 @@ A simple `printf' statement looks like this:
As for `print', the entire list of arguments may optionally be enclosed
in parentheses. Here too, the parentheses are necessary if any of the
-item expressions use the `>' relational operator; otherwise, it can be
+item expressions uses the `>' relational operator; otherwise, it can be
confused with an output redirection (*note Redirection::).
The difference between `printf' and `print' is the FORMAT argument.
@@ -6383,7 +6386,7 @@ statements. For example:
> }'
-| Don't Panic!
-Here, neither the `+' nor the `OUCH!' appear in the output message.
+Here, neither the `+' nor the `OUCH!' appears in the output message.

File: gawk.info, Node: Control Letters, Next: Format Modifiers, Prev: Basic Printf, Up: Printf
@@ -6422,7 +6425,7 @@ width. Here is a list of the format-control letters:
(The `%i' specification is for compatibility with ISO C.)
`%e', `%E'
- Print a number in scientific (exponential) notation; for example:
+ Print a number in scientific (exponential) notation. For example:
printf "%4.3e\n", 1950
@@ -6447,7 +6450,7 @@ width. Here is a list of the format-control letters:
Math Definitions::).
`%F'
- Like `%f' but the infinity and "not a number" values are spelled
+ Like `%f', but the infinity and "not a number" values are spelled
using uppercase letters.
The `%F' format is a POSIX extension to ISO C; not all systems
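The `%4.3e` example mentioned in this hunk, run standalone:

```shell
# Scientific notation: precision 3 gives three digits after the
# decimal point; the minimum width of 4 is already exceeded.
awk 'BEGIN { printf "%4.3e\n", 1950 }'
# prints: 1.950e+03
```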
@@ -6641,7 +6644,7 @@ string, like so:
s = "abcdefg"
printf "%" w "." p "s\n", s
-This is not particularly easy to read but it does work.
+This is not particularly easy to read, but it does work.
C programmers may be used to supplying additional modifiers (`h',
`j', `l', `L', `t', and `z') in `printf' format strings. These are not
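The dynamic width/precision construction from this hunk, assembled and run as a complete program:

```shell
# Concatenating the pieces yields the format string "%5.3s\n":
# the string is truncated to 3 characters ("abc") and then
# right-justified in a field 5 wide.
awk 'BEGIN {
    w = 5; p = 3; s = "abcdefg"
    printf "%" w "." p "s\n", s
}'
# prints: "  abc" (two leading spaces)
```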
@@ -6680,7 +6683,7 @@ an aligned two-column table of names and phone numbers, as shown here:
-| Jean-Paul 555-2127
In this case, the phone numbers had to be printed as strings because
-the numbers are separated by a dash. Printing the phone numbers as
+the numbers are separated by dashes. Printing the phone numbers as
numbers would have produced just the first three digits: `555'. This
would have been pretty confusing.
@@ -6728,7 +6731,7 @@ output, usually the screen. Both `print' and `printf' can also send
their output to other places. This is called "redirection".
NOTE: When `--sandbox' is specified (*note Options::), redirecting
- output to files, pipes and coprocesses is disabled.
+ output to files, pipes, and coprocesses is disabled.
A redirection appears after the `print' or `printf' statement.
Redirections in `awk' are written just like redirections in shell
@@ -6768,7 +6771,7 @@ work identically for `printf':
Each output file contains one name or number per line.
`print ITEMS >> OUTPUT-FILE'
- This redirection prints the items into the pre-existing output file
+ This redirection prints the items into the preexisting output file
named OUTPUT-FILE. The difference between this and the single-`>'
redirection is that the old contents (if any) of OUTPUT-FILE are
not erased. Instead, the `awk' output is appended to the file.
@@ -6816,8 +6819,8 @@ work identically for `printf':
`print ITEMS |& COMMAND'
This redirection prints the items to the input of COMMAND. The
difference between this and the single-`|' redirection is that the
- output from COMMAND can be read with `getline'. Thus COMMAND is a
- "coprocess", which works together with, but subsidiary to, the
+ output from COMMAND can be read with `getline'. Thus, COMMAND is
+ a "coprocess", which works together with but is subsidiary to the
`awk' program.
This feature is a `gawk' extension, and is not available in POSIX
@@ -6841,7 +6844,7 @@ a file, and then to use `>>' for subsequent output:
This is indeed how redirections must be used from the shell. But in
`awk', it isn't necessary. In this kind of case, a program should use
`>' for all the `print' statements, because the output file is only
-opened once. (It happens that if you mix `>' and `>>' that output is
+opened once. (It happens that if you mix `>' and `>>' output is
produced in the expected order. However, mixing the operators for the
same file is definitely poor style, and is confusing to readers of your
program.)
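A minimal demonstration of the single-`>` style recommended above (the output file name is illustrative): because the file is opened only once per `awk` run, the second `print` does not truncate what the first one wrote.

```shell
# Both redirections use '>'; within one awk run the file is opened once,
# so the second print simply continues writing to the open file.
awk 'BEGIN {
    print "line 1" > "/tmp/gr_single_gt.txt"
    print "line 2" > "/tmp/gr_single_gt.txt"
}'
cat /tmp/gr_single_gt.txt
```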
@@ -6874,14 +6877,14 @@ command lines to be fed to the shell.

File: gawk.info, Node: Special FD, Next: Special Files, Prev: Redirection, Up: Printing
-5.7 Special Files for Standard Pre-Opened Data Streams
-======================================================
+5.7 Special Files for Standard Preopened Data Streams
+=====================================================
Running programs conventionally have three input and output streams
already available to them for reading and writing. These are known as
the "standard input", "standard output", and "standard error output".
-These open streams (and any other open file or pipe) are often referred
-to by the technical term "file descriptors".
+These open streams (and any other open files or pipes) are often
+referred to by the technical term "file descriptors".
These streams are, by default, connected to your keyboard and
screen, but they are often redirected with the shell, via the `<', `<<',
@@ -6906,7 +6909,7 @@ error messages to the screen, like this:
(`/dev/tty' is a special file supplied by the operating system that is
connected to your keyboard and screen. It represents the "terminal,"(1)
which on modern systems is a keyboard and screen, not a serial console.)
-This generally has the same effect but not always: although the
+This generally has the same effect, but not always: although the
standard error stream is usually the screen, it can be redirected; when
that happens, writing to the screen is not correct. In fact, if `awk'
is run from a background job, it may not have a terminal at all. Then
@@ -6933,7 +6936,7 @@ becomes:
print "Serious error detected!" > "/dev/stderr"
- Note the use of quotes around the file name. Like any other
+ Note the use of quotes around the file name. Like with any other
redirection, the value must be a string. It is a common error to omit
the quotes, which leads to confusing results.
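The quoting rule can be checked from the shell (a sketch; the capture file name is an assumption): with the quotes in place, the message really does go to the standard error stream, here redirected to a file so it can be inspected.

```shell
# "/dev/stderr" is a quoted string, so awk treats it as a file name;
# the shell-level 2> shows that the message went to standard error.
awk 'BEGIN { print "Serious error detected!" > "/dev/stderr" }' \
    2> /tmp/gr_stderr_demo.txt
cat /tmp/gr_stderr_demo.txt
```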
@@ -6966,7 +6969,7 @@ there are special file names reserved for TCP/IP networking.

File: gawk.info, Node: Other Inherited Files, Next: Special Network, Up: Special Files
-5.8.1 Accessing Other Open Files With `gawk'
+5.8.1 Accessing Other Open Files with `gawk'
--------------------------------------------
Besides the `/dev/stdin', `/dev/stdout', and `/dev/stderr' special file
@@ -7016,7 +7019,7 @@ File: gawk.info, Node: Special Caveats, Prev: Special Network, Up: Special Fi
Here are some things to bear in mind when using the special file names
that `gawk' provides:
- * Recognition of the file names for the three standard pre-opened
+ * Recognition of the file names for the three standard preopened
files is disabled only in POSIX mode.
* Recognition of the other special file names is disabled if `gawk'
@@ -7025,7 +7028,7 @@ that `gawk' provides:
* `gawk' _always_ interprets these special file names. For example,
using `/dev/fd/4' for output actually writes on file descriptor 4,
- and not on a new file descriptor that is `dup()''ed from file
+ and not on a new file descriptor that is `dup()'ed from file
descriptor 4. Most of the time this does not matter; however, it
is important to _not_ close any of the files related to file
descriptors 0, 1, and 2. Doing so results in unpredictable
@@ -7185,8 +7188,8 @@ closing input or output files, respectively. This value is zero if the
close succeeds, or -1 if it fails.
The POSIX standard is very vague; it says that `close()' returns
-zero on success and nonzero otherwise. In general, different
-implementations vary in what they report when closing pipes; thus the
+zero on success and a nonzero value otherwise. In general, different
+implementations vary in what they report when closing pipes; thus, the
return value cannot be used portably. (d.c.) In POSIX mode (*note
Options::), `gawk' just returns zero when closing a pipe.
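Checking `close()`'s return value looks like this in practice (a minimal sketch with an illustrative file name). For files the zero/nonzero convention is reliable; as noted above, it is the pipe case whose return value varies between implementations.

```shell
# close() returns 0 when the file is closed successfully.
awk 'BEGIN {
    print "data" > "/tmp/gr_close_demo.txt"
    if (close("/tmp/gr_close_demo.txt") != 0)
        print "close failed"
    else
        print "close ok"
}'
```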
@@ -7212,8 +7215,8 @@ File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Close Fi
numeric values for the `print' statement.
* The `printf' statement provides finer-grained control over output,
- with format control letters for different data types and various
- flags that modify the behavior of the format control letters.
+ with format-control letters for different data types and various
+ flags that modify the behavior of the format-control letters.
* Output from both `print' and `printf' may be redirected to files,
pipes, and coprocesses.
@@ -28464,7 +28467,7 @@ Unix `awk'
To get `awka', go to `http://sourceforge.net/projects/awka'.
The project seems to be frozen; no new code changes have been made
- since approximately 2003.
+ since approximately 2001.
`pawk'
Nelson H.F. Beebe at the University of Utah has modified BWK `awk'
@@ -28704,7 +28707,7 @@ possible to include them:
document describes how GNU software should be written. If you
haven't read it, please do so, preferably _before_ starting to
modify `gawk'. (The `GNU Coding Standards' are available from the
- GNU Project's website (http://www.gnu.org/prep/standards_toc.html).
+ GNU Project's website (http://www.gnu.org/prep/standards/).
Texinfo, Info, and DVI versions are also available.)
5. Use the `gawk' coding style. The C code for `gawk' follows the
@@ -31409,7 +31412,7 @@ Index
* ! (exclamation point), !~ operator <5>: Case-sensitivity. (line 26)
* ! (exclamation point), !~ operator <6>: Computed Regexps. (line 6)
* ! (exclamation point), !~ operator: Regexp Usage. (line 19)
-* " (double quote), in regexp constants: Computed Regexps. (line 29)
+* " (double quote), in regexp constants: Computed Regexps. (line 30)
* " (double quote), in shell commands: Quoting. (line 54)
* # (number sign), #! (executable scripts): Executable Scripts.
(line 6)
@@ -31644,7 +31647,7 @@ Index
* \ (backslash), in escape sequences: Escape Sequences. (line 6)
* \ (backslash), in escape sequences, POSIX and: Escape Sequences.
(line 108)
-* \ (backslash), in regexp constants: Computed Regexps. (line 29)
+* \ (backslash), in regexp constants: Computed Regexps. (line 30)
* \ (backslash), in shell commands: Quoting. (line 48)
* \ (backslash), regexp operator: Regexp Operators. (line 18)
* ^ (caret), ^ operator: Precedence. (line 49)
@@ -31913,7 +31916,7 @@ Index
* backslash (\), in escape sequences: Escape Sequences. (line 6)
* backslash (\), in escape sequences, POSIX and: Escape Sequences.
(line 108)
-* backslash (\), in regexp constants: Computed Regexps. (line 29)
+* backslash (\), in regexp constants: Computed Regexps. (line 30)
* backslash (\), in shell commands: Quoting. (line 48)
* backslash (\), regexp operator: Regexp Operators. (line 18)
* backtrace debugger command: Execution Stack. (line 13)
@@ -32511,7 +32514,7 @@ Index
* dollar sign ($), incrementing fields and arrays: Increment Ops.
(line 30)
* dollar sign ($), regexp operator: Regexp Operators. (line 35)
-* double quote ("), in regexp constants: Computed Regexps. (line 29)
+* double quote ("), in regexp constants: Computed Regexps. (line 30)
* double quote ("), in shell commands: Quoting. (line 54)
* down debugger command: Execution Stack. (line 23)
* Drepper, Ulrich: Acknowledgments. (line 52)
@@ -32897,7 +32900,7 @@ Index
* gawk, awk and: Preface. (line 21)
* gawk, bitwise operations in: Bitwise Functions. (line 40)
* gawk, break statement in: Break Statement. (line 51)
-* gawk, character classes and: Bracket Expressions. (line 100)
+* gawk, character classes and: Bracket Expressions. (line 101)
* gawk, coding style in: Adding Code. (line 38)
* gawk, command-line options, and regular expressions: GNU Regexp Operators.
(line 70)
@@ -33180,7 +33183,7 @@ Index
(line 13)
* internationalization, localization: User-modified. (line 151)
* internationalization, localization, character classes: Bracket Expressions.
- (line 100)
+ (line 101)
* internationalization, localization, gawk and: Internationalization.
(line 13)
* internationalization, localization, locale categories: Explaining gettext.
@@ -33398,8 +33401,8 @@ Index
* newlines, as field separators: Default Field Splitting.
(line 6)
* newlines, as record separators: awk split records. (line 12)
-* newlines, in dynamic regexps: Computed Regexps. (line 59)
-* newlines, in regexp constants: Computed Regexps. (line 69)
+* newlines, in dynamic regexps: Computed Regexps. (line 60)
+* newlines, in regexp constants: Computed Regexps. (line 70)
* newlines, printing: Print Examples. (line 12)
* newlines, separating statements in actions <1>: Statements. (line 10)
* newlines, separating statements in actions: Action Overview.
@@ -33825,8 +33828,8 @@ Index
* regexp constants, as patterns: Expression Patterns. (line 34)
* regexp constants, in gawk: Using Constant Regexps.
(line 28)
-* regexp constants, slashes vs. quotes: Computed Regexps. (line 29)
-* regexp constants, vs. string constants: Computed Regexps. (line 39)
+* regexp constants, slashes vs. quotes: Computed Regexps. (line 30)
+* regexp constants, vs. string constants: Computed Regexps. (line 40)
* register extension: Registration Functions.
(line 6)
* regular expressions: Regexp. (line 6)
@@ -33845,7 +33848,7 @@ Index
(line 57)
* regular expressions, dynamic: Computed Regexps. (line 6)
* regular expressions, dynamic, with embedded newlines: Computed Regexps.
- (line 59)
+ (line 60)
* regular expressions, gawk, command-line options: GNU Regexp Operators.
(line 70)
* regular expressions, interval expressions and: Options. (line 279)
@@ -34042,7 +34045,7 @@ Index
* sidebar, Understanding #!: Executable Scripts. (line 31)
* sidebar, Understanding $0: Changing Fields. (line 134)
* sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps.
- (line 57)
+ (line 58)
* sidebar, Using close()'s Return Value: Close Files And Pipes.
(line 131)
* SIGHUP signal, for dynamic profiling: Profiling. (line 210)
@@ -34136,7 +34139,7 @@ Index
* stream editors: Full Line Fields. (line 22)
* strftime: Time Functions. (line 48)
* string constants: Scalar Constants. (line 15)
-* string constants, vs. regexp constants: Computed Regexps. (line 39)
+* string constants, vs. regexp constants: Computed Regexps. (line 40)
* string extraction (internationalization): String Extraction.
(line 6)
* string length: String Functions. (line 171)
@@ -34271,7 +34274,7 @@ Index
* troubleshooting, quotes with file names: Special FD. (line 62)
* troubleshooting, readable data files: File Checking. (line 6)
* troubleshooting, regexp constants vs. string constants: Computed Regexps.
- (line 39)
+ (line 40)
* troubleshooting, string concatenation: Concatenation. (line 26)
* troubleshooting, substr() function: String Functions. (line 499)
* troubleshooting, system() function: I/O Functions. (line 128)
@@ -34517,496 +34520,496 @@ Ref: Regexp Operators-Footnote-1170640
Ref: Regexp Operators-Footnote-2170787
Node: Bracket Expressions170885
Ref: table-char-classes172900
-Node: Leftmost Longest175825
-Node: Computed Regexps177127
-Node: GNU Regexp Operators180524
-Node: Case-sensitivity184197
-Ref: Case-sensitivity-Footnote-1187082
-Ref: Case-sensitivity-Footnote-2187317
-Node: Regexp Summary187425
-Node: Reading Files188892
-Node: Records190986
-Node: awk split records191719
-Node: gawk split records196634
-Ref: gawk split records-Footnote-1201178
-Node: Fields201215
-Ref: Fields-Footnote-1203991
-Node: Nonconstant Fields204077
-Ref: Nonconstant Fields-Footnote-1206320
-Node: Changing Fields206524
-Node: Field Separators212453
-Node: Default Field Splitting215158
-Node: Regexp Field Splitting216275
-Node: Single Character Fields219625
-Node: Command Line Field Separator220684
-Node: Full Line Fields223896
-Ref: Full Line Fields-Footnote-1225413
-Ref: Full Line Fields-Footnote-2225459
-Node: Field Splitting Summary225560
-Node: Constant Size227634
-Node: Splitting By Content232223
-Ref: Splitting By Content-Footnote-1236217
-Node: Multiple Line236380
-Ref: Multiple Line-Footnote-1242266
-Node: Getline242445
-Node: Plain Getline244657
-Node: Getline/Variable247297
-Node: Getline/File248445
-Node: Getline/Variable/File249829
-Ref: Getline/Variable/File-Footnote-1251432
-Node: Getline/Pipe251519
-Node: Getline/Variable/Pipe254202
-Node: Getline/Coprocess255333
-Node: Getline/Variable/Coprocess256585
-Node: Getline Notes257324
-Node: Getline Summary260116
-Ref: table-getline-variants260528
-Node: Read Timeout261357
-Ref: Read Timeout-Footnote-1265181
-Node: Command-line directories265239
-Node: Input Summary266144
-Node: Input Exercises269445
-Node: Printing270173
-Node: Print271950
-Node: Print Examples273407
-Node: Output Separators276186
-Node: OFMT278204
-Node: Printf279558
-Node: Basic Printf280343
-Node: Control Letters281913
-Node: Format Modifiers285896
-Node: Printf Examples291905
-Node: Redirection294391
-Node: Special FD301232
-Ref: Special FD-Footnote-1304392
-Node: Special Files304466
-Node: Other Inherited Files305083
-Node: Special Network306083
-Node: Special Caveats306945
-Node: Close Files And Pipes307896
-Ref: Close Files And Pipes-Footnote-1315078
-Ref: Close Files And Pipes-Footnote-2315226
-Node: Output Summary315376
-Node: Output Exercises316374
-Node: Expressions317054
-Node: Values318239
-Node: Constants318917
-Node: Scalar Constants319608
-Ref: Scalar Constants-Footnote-1320467
-Node: Nondecimal-numbers320717
-Node: Regexp Constants323735
-Node: Using Constant Regexps324260
-Node: Variables327403
-Node: Using Variables328058
-Node: Assignment Options329969
-Node: Conversion331844
-Node: Strings And Numbers332368
-Ref: Strings And Numbers-Footnote-1335433
-Node: Locale influences conversions335542
-Ref: table-locale-affects338289
-Node: All Operators338877
-Node: Arithmetic Ops339507
-Node: Concatenation342012
-Ref: Concatenation-Footnote-1344831
-Node: Assignment Ops344937
-Ref: table-assign-ops349916
-Node: Increment Ops351188
-Node: Truth Values and Conditions354626
-Node: Truth Values355711
-Node: Typing and Comparison356760
-Node: Variable Typing357570
-Node: Comparison Operators361223
-Ref: table-relational-ops361633
-Node: POSIX String Comparison365128
-Ref: POSIX String Comparison-Footnote-1366200
-Node: Boolean Ops366338
-Ref: Boolean Ops-Footnote-1370817
-Node: Conditional Exp370908
-Node: Function Calls372635
-Node: Precedence376515
-Node: Locales380176
-Node: Expressions Summary381808
-Node: Patterns and Actions384368
-Node: Pattern Overview385488
-Node: Regexp Patterns387167
-Node: Expression Patterns387710
-Node: Ranges391420
-Node: BEGIN/END394526
-Node: Using BEGIN/END395287
-Ref: Using BEGIN/END-Footnote-1398021
-Node: I/O And BEGIN/END398127
-Node: BEGINFILE/ENDFILE400441
-Node: Empty403342
-Node: Using Shell Variables403659
-Node: Action Overview405932
-Node: Statements408258
-Node: If Statement410106
-Node: While Statement411601
-Node: Do Statement413630
-Node: For Statement414774
-Node: Switch Statement417931
-Node: Break Statement420313
-Node: Continue Statement422354
-Node: Next Statement424181
-Node: Nextfile Statement426562
-Node: Exit Statement429192
-Node: Built-in Variables431595
-Node: User-modified432728
-Ref: User-modified-Footnote-1440409
-Node: Auto-set440471
-Ref: Auto-set-Footnote-1454163
-Ref: Auto-set-Footnote-2454368
-Node: ARGC and ARGV454424
-Node: Pattern Action Summary458642
-Node: Arrays461069
-Node: Array Basics462398
-Node: Array Intro463242
-Ref: figure-array-elements465206
-Ref: Array Intro-Footnote-1467732
-Node: Reference to Elements467860
-Node: Assigning Elements470312
-Node: Array Example470803
-Node: Scanning an Array472561
-Node: Controlling Scanning475577
-Ref: Controlling Scanning-Footnote-1480773
-Node: Numeric Array Subscripts481089
-Node: Uninitialized Subscripts483274
-Node: Delete484891
-Ref: Delete-Footnote-1487634
-Node: Multidimensional487691
-Node: Multiscanning490788
-Node: Arrays of Arrays492377
-Node: Arrays Summary497136
-Node: Functions499228
-Node: Built-in500127
-Node: Calling Built-in501205
-Node: Numeric Functions503196
-Ref: Numeric Functions-Footnote-1508015
-Ref: Numeric Functions-Footnote-2508372
-Ref: Numeric Functions-Footnote-3508420
-Node: String Functions508692
-Ref: String Functions-Footnote-1532167
-Ref: String Functions-Footnote-2532296
-Ref: String Functions-Footnote-3532544
-Node: Gory Details532631
-Ref: table-sub-escapes534412
-Ref: table-sub-proposed535932
-Ref: table-posix-sub537296
-Ref: table-gensub-escapes538832
-Ref: Gory Details-Footnote-1539664
-Node: I/O Functions539815
-Ref: I/O Functions-Footnote-1547033
-Node: Time Functions547180
-Ref: Time Functions-Footnote-1557668
-Ref: Time Functions-Footnote-2557736
-Ref: Time Functions-Footnote-3557894
-Ref: Time Functions-Footnote-4558005
-Ref: Time Functions-Footnote-5558117
-Ref: Time Functions-Footnote-6558344
-Node: Bitwise Functions558610
-Ref: table-bitwise-ops559172
-Ref: Bitwise Functions-Footnote-1563481
-Node: Type Functions563650
-Node: I18N Functions564801
-Node: User-defined566446
-Node: Definition Syntax567251
-Ref: Definition Syntax-Footnote-1572658
-Node: Function Example572729
-Ref: Function Example-Footnote-1575648
-Node: Function Caveats575670
-Node: Calling A Function576188
-Node: Variable Scope577146
-Node: Pass By Value/Reference580134
-Node: Return Statement583629
-Node: Dynamic Typing586610
-Node: Indirect Calls587539
-Ref: Indirect Calls-Footnote-1598841
-Node: Functions Summary598969
-Node: Library Functions601671
-Ref: Library Functions-Footnote-1605280
-Ref: Library Functions-Footnote-2605423
-Node: Library Names605594
-Ref: Library Names-Footnote-1609048
-Ref: Library Names-Footnote-2609271
-Node: General Functions609357
-Node: Strtonum Function610460
-Node: Assert Function613482
-Node: Round Function616806
-Node: Cliff Random Function618347
-Node: Ordinal Functions619363
-Ref: Ordinal Functions-Footnote-1622426
-Ref: Ordinal Functions-Footnote-2622678
-Node: Join Function622889
-Ref: Join Function-Footnote-1624658
-Node: Getlocaltime Function624858
-Node: Readfile Function628602
-Node: Shell Quoting630572
-Node: Data File Management631973
-Node: Filetrans Function632605
-Node: Rewind Function636661
-Node: File Checking638048
-Ref: File Checking-Footnote-1639380
-Node: Empty Files639581
-Node: Ignoring Assigns641560
-Node: Getopt Function643111
-Ref: Getopt Function-Footnote-1654573
-Node: Passwd Functions654773
-Ref: Passwd Functions-Footnote-1663610
-Node: Group Functions663698
-Ref: Group Functions-Footnote-1671592
-Node: Walking Arrays671805
-Node: Library Functions Summary673408
-Node: Library Exercises674809
-Node: Sample Programs676089
-Node: Running Examples676859
-Node: Clones677587
-Node: Cut Program678811
-Node: Egrep Program688530
-Ref: Egrep Program-Footnote-1696028
-Node: Id Program696138
-Node: Split Program699783
-Ref: Split Program-Footnote-1703231
-Node: Tee Program703359
-Node: Uniq Program706148
-Node: Wc Program713567
-Ref: Wc Program-Footnote-1717817
-Node: Miscellaneous Programs717911
-Node: Dupword Program719124
-Node: Alarm Program721155
-Node: Translate Program725959
-Ref: Translate Program-Footnote-1730524
-Node: Labels Program730794
-Ref: Labels Program-Footnote-1734145
-Node: Word Sorting734229
-Node: History Sorting738300
-Node: Extract Program740136
-Node: Simple Sed747661
-Node: Igawk Program750729
-Ref: Igawk Program-Footnote-1765053
-Ref: Igawk Program-Footnote-2765254
-Ref: Igawk Program-Footnote-3765376
-Node: Anagram Program765491
-Node: Signature Program768548
-Node: Programs Summary769795
-Node: Programs Exercises770988
-Ref: Programs Exercises-Footnote-1775119
-Node: Advanced Features775210
-Node: Nondecimal Data777158
-Node: Array Sorting778748
-Node: Controlling Array Traversal779445
-Ref: Controlling Array Traversal-Footnote-1787778
-Node: Array Sorting Functions787896
-Ref: Array Sorting Functions-Footnote-1791785
-Node: Two-way I/O791981
-Ref: Two-way I/O-Footnote-1796926
-Ref: Two-way I/O-Footnote-2797112
-Node: TCP/IP Networking797194
-Node: Profiling800067
-Node: Advanced Features Summary808344
-Node: Internationalization810277
-Node: I18N and L10N811757
-Node: Explaining gettext812443
-Ref: Explaining gettext-Footnote-1817468
-Ref: Explaining gettext-Footnote-2817652
-Node: Programmer i18n817817
-Ref: Programmer i18n-Footnote-1822683
-Node: Translator i18n822732
-Node: String Extraction823526
-Ref: String Extraction-Footnote-1824657
-Node: Printf Ordering824743
-Ref: Printf Ordering-Footnote-1827529
-Node: I18N Portability827593
-Ref: I18N Portability-Footnote-1830048
-Node: I18N Example830111
-Ref: I18N Example-Footnote-1832914
-Node: Gawk I18N832986
-Node: I18N Summary833624
-Node: Debugger834963
-Node: Debugging835985
-Node: Debugging Concepts836426
-Node: Debugging Terms838279
-Node: Awk Debugging840851
-Node: Sample Debugging Session841745
-Node: Debugger Invocation842265
-Node: Finding The Bug843649
-Node: List of Debugger Commands850124
-Node: Breakpoint Control851457
-Node: Debugger Execution Control855153
-Node: Viewing And Changing Data858517
-Node: Execution Stack861895
-Node: Debugger Info863532
-Node: Miscellaneous Debugger Commands867549
-Node: Readline Support872578
-Node: Limitations873470
-Node: Debugging Summary875584
-Node: Arbitrary Precision Arithmetic876752
-Node: Computer Arithmetic878168
-Ref: table-numeric-ranges881766
-Ref: Computer Arithmetic-Footnote-1882625
-Node: Math Definitions882682
-Ref: table-ieee-formats885970
-Ref: Math Definitions-Footnote-1886574
-Node: MPFR features886679
-Node: FP Math Caution888350
-Ref: FP Math Caution-Footnote-1889400
-Node: Inexactness of computations889769
-Node: Inexact representation890728
-Node: Comparing FP Values892085
-Node: Errors accumulate893167
-Node: Getting Accuracy894600
-Node: Try To Round897262
-Node: Setting precision898161
-Ref: table-predefined-precision-strings898845
-Node: Setting the rounding mode900634
-Ref: table-gawk-rounding-modes900998
-Ref: Setting the rounding mode-Footnote-1904453
-Node: Arbitrary Precision Integers904632
-Ref: Arbitrary Precision Integers-Footnote-1909531
-Node: POSIX Floating Point Problems909680
-Ref: POSIX Floating Point Problems-Footnote-1913553
-Node: Floating point summary913591
-Node: Dynamic Extensions915785
-Node: Extension Intro917337
-Node: Plugin License918603
-Node: Extension Mechanism Outline919400
-Ref: figure-load-extension919828
-Ref: figure-register-new-function921308
-Ref: figure-call-new-function922312
-Node: Extension API Description924298
-Node: Extension API Functions Introduction925748
-Node: General Data Types930572
-Ref: General Data Types-Footnote-1936311
-Node: Memory Allocation Functions936610
-Ref: Memory Allocation Functions-Footnote-1939449
-Node: Constructor Functions939545
-Node: Registration Functions941279
-Node: Extension Functions941964
-Node: Exit Callback Functions944261
-Node: Extension Version String945509
-Node: Input Parsers946174
-Node: Output Wrappers956053
-Node: Two-way processors960568
-Node: Printing Messages962772
-Ref: Printing Messages-Footnote-1963848
-Node: Updating `ERRNO'964000
-Node: Requesting Values964740
-Ref: table-value-types-returned965468
-Node: Accessing Parameters966425
-Node: Symbol Table Access967656
-Node: Symbol table by name968170
-Node: Symbol table by cookie970151
-Ref: Symbol table by cookie-Footnote-1974295
-Node: Cached values974358
-Ref: Cached values-Footnote-1977857
-Node: Array Manipulation977948
-Ref: Array Manipulation-Footnote-1979046
-Node: Array Data Types979083
-Ref: Array Data Types-Footnote-1981738
-Node: Array Functions981830
-Node: Flattening Arrays985684
-Node: Creating Arrays992576
-Node: Extension API Variables997347
-Node: Extension Versioning997983
-Node: Extension API Informational Variables999884
-Node: Extension API Boilerplate1000949
-Node: Finding Extensions1004758
-Node: Extension Example1005318
-Node: Internal File Description1006090
-Node: Internal File Ops1010157
-Ref: Internal File Ops-Footnote-11021827
-Node: Using Internal File Ops1021967
-Ref: Using Internal File Ops-Footnote-11024350
-Node: Extension Samples1024623
-Node: Extension Sample File Functions1026149
-Node: Extension Sample Fnmatch1033787
-Node: Extension Sample Fork1035278
-Node: Extension Sample Inplace1036493
-Node: Extension Sample Ord1038168
-Node: Extension Sample Readdir1039004
-Ref: table-readdir-file-types1039880
-Node: Extension Sample Revout1040691
-Node: Extension Sample Rev2way1041281
-Node: Extension Sample Read write array1042021
-Node: Extension Sample Readfile1043961
-Node: Extension Sample Time1045056
-Node: Extension Sample API Tests1046405
-Node: gawkextlib1046896
-Node: Extension summary1049554
-Node: Extension Exercises1053243
-Node: Language History1053965
-Node: V7/SVR3.11055621
-Node: SVR41057802
-Node: POSIX1059247
-Node: BTL1060636
-Node: POSIX/GNU1061370
-Node: Feature History1066994
-Node: Common Extensions1080092
-Node: Ranges and Locales1081416
-Ref: Ranges and Locales-Footnote-11086034
-Ref: Ranges and Locales-Footnote-21086061
-Ref: Ranges and Locales-Footnote-31086295
-Node: Contributors1086516
-Node: History summary1092057
-Node: Installation1093427
-Node: Gawk Distribution1094373
-Node: Getting1094857
-Node: Extracting1095680
-Node: Distribution contents1097315
-Node: Unix Installation1103380
-Node: Quick Installation1104063
-Node: Shell Startup Files1106474
-Node: Additional Configuration Options1107553
-Node: Configuration Philosophy1109292
-Node: Non-Unix Installation1111661
-Node: PC Installation1112119
-Node: PC Binary Installation1113438
-Node: PC Compiling1115286
-Ref: PC Compiling-Footnote-11118307
-Node: PC Testing1118416
-Node: PC Using1119592
-Node: Cygwin1123707
-Node: MSYS1124530
-Node: VMS Installation1125030
-Node: VMS Compilation1125822
-Ref: VMS Compilation-Footnote-11127044
-Node: VMS Dynamic Extensions1127102
-Node: VMS Installation Details1128786
-Node: VMS Running1131038
-Node: VMS GNV1133874
-Node: VMS Old Gawk1134608
-Node: Bugs1135078
-Node: Other Versions1138961
-Node: Installation summary1145385
-Node: Notes1146441
-Node: Compatibility Mode1147306
-Node: Additions1148088
-Node: Accessing The Source1149013
-Node: Adding Code1150448
-Node: New Ports1156613
-Node: Derived Files1161095
-Ref: Derived Files-Footnote-11166570
-Ref: Derived Files-Footnote-21166604
-Ref: Derived Files-Footnote-31167200
-Node: Future Extensions1167314
-Node: Implementation Limitations1167920
-Node: Extension Design1169168
-Node: Old Extension Problems1170322
-Ref: Old Extension Problems-Footnote-11171839
-Node: Extension New Mechanism Goals1171896
-Ref: Extension New Mechanism Goals-Footnote-11175256
-Node: Extension Other Design Decisions1175445
-Node: Extension Future Growth1177553
-Node: Old Extension Mechanism1178389
-Node: Notes summary1180151
-Node: Basic Concepts1181337
-Node: Basic High Level1182018
-Ref: figure-general-flow1182290
-Ref: figure-process-flow1182889
-Ref: Basic High Level-Footnote-11186118
-Node: Basic Data Typing1186303
-Node: Glossary1189631
-Node: Copying1214789
-Node: GNU Free Documentation License1252345
-Node: Index1277481
+Node: Leftmost Longest175842
+Node: Computed Regexps177144
+Node: GNU Regexp Operators180573
+Node: Case-sensitivity184245
+Ref: Case-sensitivity-Footnote-1187130
+Ref: Case-sensitivity-Footnote-2187365
+Node: Regexp Summary187473
+Node: Reading Files188940
+Node: Records191033
+Node: awk split records191766
+Node: gawk split records196695
+Ref: gawk split records-Footnote-1201234
+Node: Fields201271
+Ref: Fields-Footnote-1204049
+Node: Nonconstant Fields204135
+Ref: Nonconstant Fields-Footnote-1206373
+Node: Changing Fields206576
+Node: Field Separators212507
+Node: Default Field Splitting215211
+Node: Regexp Field Splitting216328
+Node: Single Character Fields219678
+Node: Command Line Field Separator220737
+Node: Full Line Fields223954
+Ref: Full Line Fields-Footnote-1225475
+Ref: Full Line Fields-Footnote-2225521
+Node: Field Splitting Summary225622
+Node: Constant Size227696
+Node: Splitting By Content232279
+Ref: Splitting By Content-Footnote-1236244
+Node: Multiple Line236407
+Ref: Multiple Line-Footnote-1242288
+Node: Getline242467
+Node: Plain Getline244674
+Node: Getline/Variable247314
+Node: Getline/File248463
+Node: Getline/Variable/File249848
+Ref: Getline/Variable/File-Footnote-1251451
+Node: Getline/Pipe251538
+Node: Getline/Variable/Pipe254216
+Node: Getline/Coprocess255347
+Node: Getline/Variable/Coprocess256611
+Node: Getline Notes257350
+Node: Getline Summary260144
+Ref: table-getline-variants260556
+Node: Read Timeout261385
+Ref: Read Timeout-Footnote-1265222
+Node: Command-line directories265280
+Node: Input Summary266185
+Node: Input Exercises269570
+Node: Printing270298
+Node: Print272075
+Node: Print Examples273532
+Node: Output Separators276311
+Node: OFMT278329
+Node: Printf279684
+Node: Basic Printf280469
+Node: Control Letters282041
+Node: Format Modifiers286026
+Node: Printf Examples292036
+Node: Redirection294522
+Node: Special FD301360
+Ref: Special FD-Footnote-1304526
+Node: Special Files304600
+Node: Other Inherited Files305217
+Node: Special Network306217
+Node: Special Caveats307079
+Node: Close Files And Pipes308028
+Ref: Close Files And Pipes-Footnote-1315219
+Ref: Close Files And Pipes-Footnote-2315367
+Node: Output Summary315517
+Node: Output Exercises316515
+Node: Expressions317195
+Node: Values318380
+Node: Constants319058
+Node: Scalar Constants319749
+Ref: Scalar Constants-Footnote-1320608
+Node: Nondecimal-numbers320858
+Node: Regexp Constants323876
+Node: Using Constant Regexps324401
+Node: Variables327544
+Node: Using Variables328199
+Node: Assignment Options330110
+Node: Conversion331985
+Node: Strings And Numbers332509
+Ref: Strings And Numbers-Footnote-1335574
+Node: Locale influences conversions335683
+Ref: table-locale-affects338430
+Node: All Operators339018
+Node: Arithmetic Ops339648
+Node: Concatenation342153
+Ref: Concatenation-Footnote-1344972
+Node: Assignment Ops345078
+Ref: table-assign-ops350057
+Node: Increment Ops351329
+Node: Truth Values and Conditions354767
+Node: Truth Values355852
+Node: Typing and Comparison356901
+Node: Variable Typing357711
+Node: Comparison Operators361364
+Ref: table-relational-ops361774
+Node: POSIX String Comparison365269
+Ref: POSIX String Comparison-Footnote-1366341
+Node: Boolean Ops366479
+Ref: Boolean Ops-Footnote-1370958
+Node: Conditional Exp371049
+Node: Function Calls372776
+Node: Precedence376656
+Node: Locales380317
+Node: Expressions Summary381949
+Node: Patterns and Actions384509
+Node: Pattern Overview385629
+Node: Regexp Patterns387308
+Node: Expression Patterns387851
+Node: Ranges391561
+Node: BEGIN/END394667
+Node: Using BEGIN/END395428
+Ref: Using BEGIN/END-Footnote-1398162
+Node: I/O And BEGIN/END398268
+Node: BEGINFILE/ENDFILE400582
+Node: Empty403483
+Node: Using Shell Variables403800
+Node: Action Overview406073
+Node: Statements408399
+Node: If Statement410247
+Node: While Statement411742
+Node: Do Statement413771
+Node: For Statement414915
+Node: Switch Statement418072
+Node: Break Statement420454
+Node: Continue Statement422495
+Node: Next Statement424322
+Node: Nextfile Statement426703
+Node: Exit Statement429333
+Node: Built-in Variables431736
+Node: User-modified432869
+Ref: User-modified-Footnote-1440550
+Node: Auto-set440612
+Ref: Auto-set-Footnote-1454304
+Ref: Auto-set-Footnote-2454509
+Node: ARGC and ARGV454565
+Node: Pattern Action Summary458783
+Node: Arrays461210
+Node: Array Basics462539
+Node: Array Intro463383
+Ref: figure-array-elements465347
+Ref: Array Intro-Footnote-1467873
+Node: Reference to Elements468001
+Node: Assigning Elements470453
+Node: Array Example470944
+Node: Scanning an Array472702
+Node: Controlling Scanning475718
+Ref: Controlling Scanning-Footnote-1480914
+Node: Numeric Array Subscripts481230
+Node: Uninitialized Subscripts483415
+Node: Delete485032
+Ref: Delete-Footnote-1487775
+Node: Multidimensional487832
+Node: Multiscanning490929
+Node: Arrays of Arrays492518
+Node: Arrays Summary497277
+Node: Functions499369
+Node: Built-in500268
+Node: Calling Built-in501346
+Node: Numeric Functions503337
+Ref: Numeric Functions-Footnote-1508156
+Ref: Numeric Functions-Footnote-2508513
+Ref: Numeric Functions-Footnote-3508561
+Node: String Functions508833
+Ref: String Functions-Footnote-1532308
+Ref: String Functions-Footnote-2532437
+Ref: String Functions-Footnote-3532685
+Node: Gory Details532772
+Ref: table-sub-escapes534553
+Ref: table-sub-proposed536073
+Ref: table-posix-sub537437
+Ref: table-gensub-escapes538973
+Ref: Gory Details-Footnote-1539805
+Node: I/O Functions539956
+Ref: I/O Functions-Footnote-1547174
+Node: Time Functions547321
+Ref: Time Functions-Footnote-1557809
+Ref: Time Functions-Footnote-2557877
+Ref: Time Functions-Footnote-3558035
+Ref: Time Functions-Footnote-4558146
+Ref: Time Functions-Footnote-5558258
+Ref: Time Functions-Footnote-6558485
+Node: Bitwise Functions558751
+Ref: table-bitwise-ops559313
+Ref: Bitwise Functions-Footnote-1563622
+Node: Type Functions563791
+Node: I18N Functions564942
+Node: User-defined566587
+Node: Definition Syntax567392
+Ref: Definition Syntax-Footnote-1572799
+Node: Function Example572870
+Ref: Function Example-Footnote-1575789
+Node: Function Caveats575811
+Node: Calling A Function576329
+Node: Variable Scope577287
+Node: Pass By Value/Reference580275
+Node: Return Statement583770
+Node: Dynamic Typing586751
+Node: Indirect Calls587680
+Ref: Indirect Calls-Footnote-1598982
+Node: Functions Summary599110
+Node: Library Functions601812
+Ref: Library Functions-Footnote-1605421
+Ref: Library Functions-Footnote-2605564
+Node: Library Names605735
+Ref: Library Names-Footnote-1609189
+Ref: Library Names-Footnote-2609412
+Node: General Functions609498
+Node: Strtonum Function610601
+Node: Assert Function613623
+Node: Round Function616947
+Node: Cliff Random Function618488
+Node: Ordinal Functions619504
+Ref: Ordinal Functions-Footnote-1622567
+Ref: Ordinal Functions-Footnote-2622819
+Node: Join Function623030
+Ref: Join Function-Footnote-1624799
+Node: Getlocaltime Function624999
+Node: Readfile Function628743
+Node: Shell Quoting630713
+Node: Data File Management632114
+Node: Filetrans Function632746
+Node: Rewind Function636802
+Node: File Checking638189
+Ref: File Checking-Footnote-1639521
+Node: Empty Files639722
+Node: Ignoring Assigns641701
+Node: Getopt Function643252
+Ref: Getopt Function-Footnote-1654714
+Node: Passwd Functions654914
+Ref: Passwd Functions-Footnote-1663751
+Node: Group Functions663839
+Ref: Group Functions-Footnote-1671733
+Node: Walking Arrays671946
+Node: Library Functions Summary673549
+Node: Library Exercises674950
+Node: Sample Programs676230
+Node: Running Examples677000
+Node: Clones677728
+Node: Cut Program678952
+Node: Egrep Program688671
+Ref: Egrep Program-Footnote-1696169
+Node: Id Program696279
+Node: Split Program699924
+Ref: Split Program-Footnote-1703372
+Node: Tee Program703500
+Node: Uniq Program706289
+Node: Wc Program713708
+Ref: Wc Program-Footnote-1717958
+Node: Miscellaneous Programs718052
+Node: Dupword Program719265
+Node: Alarm Program721296
+Node: Translate Program726100
+Ref: Translate Program-Footnote-1730665
+Node: Labels Program730935
+Ref: Labels Program-Footnote-1734286
+Node: Word Sorting734370
+Node: History Sorting738441
+Node: Extract Program740277
+Node: Simple Sed747802
+Node: Igawk Program750870
+Ref: Igawk Program-Footnote-1765194
+Ref: Igawk Program-Footnote-2765395
+Ref: Igawk Program-Footnote-3765517
+Node: Anagram Program765632
+Node: Signature Program768689
+Node: Programs Summary769936
+Node: Programs Exercises771129
+Ref: Programs Exercises-Footnote-1775260
+Node: Advanced Features775351
+Node: Nondecimal Data777299
+Node: Array Sorting778889
+Node: Controlling Array Traversal779586
+Ref: Controlling Array Traversal-Footnote-1787919
+Node: Array Sorting Functions788037
+Ref: Array Sorting Functions-Footnote-1791926
+Node: Two-way I/O792122
+Ref: Two-way I/O-Footnote-1797067
+Ref: Two-way I/O-Footnote-2797253
+Node: TCP/IP Networking797335
+Node: Profiling800208
+Node: Advanced Features Summary808485
+Node: Internationalization810418
+Node: I18N and L10N811898
+Node: Explaining gettext812584
+Ref: Explaining gettext-Footnote-1817609
+Ref: Explaining gettext-Footnote-2817793
+Node: Programmer i18n817958
+Ref: Programmer i18n-Footnote-1822824
+Node: Translator i18n822873
+Node: String Extraction823667
+Ref: String Extraction-Footnote-1824798
+Node: Printf Ordering824884
+Ref: Printf Ordering-Footnote-1827670
+Node: I18N Portability827734
+Ref: I18N Portability-Footnote-1830189
+Node: I18N Example830252
+Ref: I18N Example-Footnote-1833055
+Node: Gawk I18N833127
+Node: I18N Summary833765
+Node: Debugger835104
+Node: Debugging836126
+Node: Debugging Concepts836567
+Node: Debugging Terms838420
+Node: Awk Debugging840992
+Node: Sample Debugging Session841886
+Node: Debugger Invocation842406
+Node: Finding The Bug843790
+Node: List of Debugger Commands850265
+Node: Breakpoint Control851598
+Node: Debugger Execution Control855294
+Node: Viewing And Changing Data858658
+Node: Execution Stack862036
+Node: Debugger Info863673
+Node: Miscellaneous Debugger Commands867690
+Node: Readline Support872719
+Node: Limitations873611
+Node: Debugging Summary875725
+Node: Arbitrary Precision Arithmetic876893
+Node: Computer Arithmetic878309
+Ref: table-numeric-ranges881907
+Ref: Computer Arithmetic-Footnote-1882766
+Node: Math Definitions882823
+Ref: table-ieee-formats886111
+Ref: Math Definitions-Footnote-1886715
+Node: MPFR features886820
+Node: FP Math Caution888491
+Ref: FP Math Caution-Footnote-1889541
+Node: Inexactness of computations889910
+Node: Inexact representation890869
+Node: Comparing FP Values892226
+Node: Errors accumulate893308
+Node: Getting Accuracy894741
+Node: Try To Round897403
+Node: Setting precision898302
+Ref: table-predefined-precision-strings898986
+Node: Setting the rounding mode900775
+Ref: table-gawk-rounding-modes901139
+Ref: Setting the rounding mode-Footnote-1904594
+Node: Arbitrary Precision Integers904773
+Ref: Arbitrary Precision Integers-Footnote-1909672
+Node: POSIX Floating Point Problems909821
+Ref: POSIX Floating Point Problems-Footnote-1913694
+Node: Floating point summary913732
+Node: Dynamic Extensions915926
+Node: Extension Intro917478
+Node: Plugin License918744
+Node: Extension Mechanism Outline919541
+Ref: figure-load-extension919969
+Ref: figure-register-new-function921449
+Ref: figure-call-new-function922453
+Node: Extension API Description924439
+Node: Extension API Functions Introduction925889
+Node: General Data Types930713
+Ref: General Data Types-Footnote-1936452
+Node: Memory Allocation Functions936751
+Ref: Memory Allocation Functions-Footnote-1939590
+Node: Constructor Functions939686
+Node: Registration Functions941420
+Node: Extension Functions942105
+Node: Exit Callback Functions944402
+Node: Extension Version String945650
+Node: Input Parsers946315
+Node: Output Wrappers956194
+Node: Two-way processors960709
+Node: Printing Messages962913
+Ref: Printing Messages-Footnote-1963989
+Node: Updating `ERRNO'964141
+Node: Requesting Values964881
+Ref: table-value-types-returned965609
+Node: Accessing Parameters966566
+Node: Symbol Table Access967797
+Node: Symbol table by name968311
+Node: Symbol table by cookie970292
+Ref: Symbol table by cookie-Footnote-1974436
+Node: Cached values974499
+Ref: Cached values-Footnote-1977998
+Node: Array Manipulation978089
+Ref: Array Manipulation-Footnote-1979187
+Node: Array Data Types979224
+Ref: Array Data Types-Footnote-1981879
+Node: Array Functions981971
+Node: Flattening Arrays985825
+Node: Creating Arrays992717
+Node: Extension API Variables997488
+Node: Extension Versioning998124
+Node: Extension API Informational Variables1000025
+Node: Extension API Boilerplate1001090
+Node: Finding Extensions1004899
+Node: Extension Example1005459
+Node: Internal File Description1006231
+Node: Internal File Ops1010298
+Ref: Internal File Ops-Footnote-11021968
+Node: Using Internal File Ops1022108
+Ref: Using Internal File Ops-Footnote-11024491
+Node: Extension Samples1024764
+Node: Extension Sample File Functions1026290
+Node: Extension Sample Fnmatch1033928
+Node: Extension Sample Fork1035419
+Node: Extension Sample Inplace1036634
+Node: Extension Sample Ord1038309
+Node: Extension Sample Readdir1039145
+Ref: table-readdir-file-types1040021
+Node: Extension Sample Revout1040832
+Node: Extension Sample Rev2way1041422
+Node: Extension Sample Read write array1042162
+Node: Extension Sample Readfile1044102
+Node: Extension Sample Time1045197
+Node: Extension Sample API Tests1046546
+Node: gawkextlib1047037
+Node: Extension summary1049695
+Node: Extension Exercises1053384
+Node: Language History1054106
+Node: V7/SVR3.11055762
+Node: SVR41057943
+Node: POSIX1059388
+Node: BTL1060777
+Node: POSIX/GNU1061511
+Node: Feature History1067135
+Node: Common Extensions1080233
+Node: Ranges and Locales1081557
+Ref: Ranges and Locales-Footnote-11086175
+Ref: Ranges and Locales-Footnote-21086202
+Ref: Ranges and Locales-Footnote-31086436
+Node: Contributors1086657
+Node: History summary1092198
+Node: Installation1093568
+Node: Gawk Distribution1094514
+Node: Getting1094998
+Node: Extracting1095821
+Node: Distribution contents1097456
+Node: Unix Installation1103521
+Node: Quick Installation1104204
+Node: Shell Startup Files1106615
+Node: Additional Configuration Options1107694
+Node: Configuration Philosophy1109433
+Node: Non-Unix Installation1111802
+Node: PC Installation1112260
+Node: PC Binary Installation1113579
+Node: PC Compiling1115427
+Ref: PC Compiling-Footnote-11118448
+Node: PC Testing1118557
+Node: PC Using1119733
+Node: Cygwin1123848
+Node: MSYS1124671
+Node: VMS Installation1125171
+Node: VMS Compilation1125963
+Ref: VMS Compilation-Footnote-11127185
+Node: VMS Dynamic Extensions1127243
+Node: VMS Installation Details1128927
+Node: VMS Running1131179
+Node: VMS GNV1134015
+Node: VMS Old Gawk1134749
+Node: Bugs1135219
+Node: Other Versions1139102
+Node: Installation summary1145526
+Node: Notes1146582
+Node: Compatibility Mode1147447
+Node: Additions1148229
+Node: Accessing The Source1149154
+Node: Adding Code1150589
+Node: New Ports1156746
+Node: Derived Files1161228
+Ref: Derived Files-Footnote-11166703
+Ref: Derived Files-Footnote-21166737
+Ref: Derived Files-Footnote-31167333
+Node: Future Extensions1167447
+Node: Implementation Limitations1168053
+Node: Extension Design1169301
+Node: Old Extension Problems1170455
+Ref: Old Extension Problems-Footnote-11171972
+Node: Extension New Mechanism Goals1172029
+Ref: Extension New Mechanism Goals-Footnote-11175389
+Node: Extension Other Design Decisions1175578
+Node: Extension Future Growth1177686
+Node: Old Extension Mechanism1178522
+Node: Notes summary1180284
+Node: Basic Concepts1181470
+Node: Basic High Level1182151
+Ref: figure-general-flow1182423
+Ref: figure-process-flow1183022
+Ref: Basic High Level-Footnote-11186251
+Node: Basic Data Typing1186436
+Node: Glossary1189764
+Node: Copying1214922
+Node: GNU Free Documentation License1252478
+Node: Index1277614

End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index a4567760..90f28a6b 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -5718,11 +5718,11 @@ and numeric characters in your character set.
@c Date: Tue, 01 Jul 2014 07:39:51 +0200
@c From: Hermann Peifer <peifer@gmx.eu>
Some utilities that match regular expressions provide a nonstandard
-@code{[:ascii:]} character class; @command{awk} does not. However, you
-can simulate such a construct using @code{[\x00-\x7F]}. This matches
+@samp{[:ascii:]} character class; @command{awk} does not. However, you
+can simulate such a construct using @samp{[\x00-\x7F]}. This matches
all values numerically between zero and 127, which is the defined
range of the ASCII character set. Use a complemented character list
-(@code{[^\x00-\x7F]}) to match any single-byte characters that are not
+(@samp{[^\x00-\x7F]}) to match any single-byte characters that are not
in the ASCII range.
@cindex bracket expressions, collating elements
@@ -5751,8 +5751,8 @@ Locale-specific names for a list of
characters that are equal. The name is enclosed between
@samp{[=} and @samp{=]}.
For example, the name @samp{e} might be used to represent all of
-``e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
-that matches any of @samp{e}, @samp{@'e}, or @samp{@`e}.
+``e,'' ``@^e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
+that matches any of @samp{e}, @samp{@^e}, @samp{@'e}, or @samp{@`e}.
@end table
These features are very valuable in non-English-speaking locales.
@@ -5781,7 +5781,7 @@ echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
This example uses the @code{sub()} function to make a change to the input
record. (@code{sub()} replaces the first instance of any text matched
by the first argument with the string provided as the second argument;
-@pxref{String Functions}). Here, the regexp @code{/a+/} indicates ``one
+@pxref{String Functions}.) Here, the regexp @code{/a+/} indicates ``one
or more @samp{a} characters,'' and the replacement text is @samp{<A>}.
The input contains four @samp{a} characters.
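The leftmost-longest rule this hunk describes is easy to check from a shell prompt; this is essentially the same pipeline the hunk quotes, shown here as a minimal sketch with POSIX `awk`:

```shell
# Leftmost-longest matching: /a+/ grabs all four 'a's in one match,
# so sub() replaces the whole run with a single "<A>".
echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
# prints: <A>bcd
```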
@@ -5835,14 +5835,14 @@ and tests whether the input record matches this regexp.
@quotation NOTE
When using the @samp{~} and @samp{!~}
-operators, there is a difference between a regexp constant
+operators, be aware that there is a difference between a regexp constant
enclosed in slashes and a string constant enclosed in double quotes.
If you are going to use a string constant, you have to understand that
the string is, in essence, scanned @emph{twice}: the first time when
@command{awk} reads your program, and the second time when it goes to
match the string on the lefthand side of the operator with the pattern
on the right. This is true of any string-valued expression (such as
-@code{digits_regexp}, shown previously), not just string constants.
+@code{digits_regexp}, shown in the previous example), not just string constants.
@end quotation
@cindex regexp constants, slashes vs.@: quotes
@@ -6042,7 +6042,7 @@ matches either @samp{ball} or @samp{balls}, as a separate word.
@item \B
Matches the empty string that occurs between two
word-constituent characters. For example,
-@code{/\Brat\B/} matches @samp{crate} but it does not match @samp{dirty rat}.
+@code{/\Brat\B/} matches @samp{crate}, but it does not match @samp{dirty rat}.
@samp{\B} is essentially the opposite of @samp{\y}.
@end table
@@ -6061,14 +6061,14 @@ The operators are:
@cindex backslash (@code{\}), @code{\`} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\`} operator (@command{gawk})
Matches the empty string at the
-beginning of a buffer (string).
+beginning of a buffer (string)
@c @cindex operators, @code{\'} (@command{gawk})
@cindex backslash (@code{\}), @code{\'} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\'} operator (@command{gawk})
@item \'
Matches the empty string at the
-end of a buffer (string).
+end of a buffer (string)
@end table
@cindex @code{^} (caret), regexp operator
@@ -6301,7 +6301,7 @@ This makes it more convenient for programs to work on the parts of a record.
@cindex @code{getline} command
On rare occasions, you may need to use the @code{getline} command.
-The @code{getline} command is valuable, both because it
+The @code{getline} command is valuable both because it
can do explicit input from any number of files, and because the files
used with it do not have to be named on the @command{awk} command line
(@pxref{Getline}).
@@ -6352,8 +6352,8 @@ never automatically reset to zero.
Records are separated by a character called the @dfn{record separator}.
By default, the record separator is the newline character.
This is why records are, by default, single lines.
-A different character can be used for the record separator by
-assigning the character to the predefined variable @code{RS}.
+To use a different character for the record separator,
+simply assign that character to the predefined variable @code{RS}.
@cindex newlines, as record separators
@cindex @code{RS} variable
@@ -6376,8 +6376,8 @@ awk 'BEGIN @{ RS = "u" @}
@noindent
changes the value of @code{RS} to @samp{u}, before reading any input.
-This is a string whose first character is the letter ``u''; as a result, records
-are separated by the letter ``u.'' Then the input file is read, and the second
+The new value is a string whose first character is the letter ``u''; as a result, records
+are separated by the letter ``u''. Then the input file is read, and the second
rule in the @command{awk} program (the action with no pattern) prints each
record. Because each @code{print} statement adds a newline at the end of
its output, this @command{awk} program copies the input
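A quick sketch of the `RS` behavior the hunk describes, using invented input (`auburn`) rather than the manual's data file:

```shell
# With RS set to "u", every 'u' in the input ends a record;
# each print statement then re-adds a trailing newline.
printf 'auburn' | awk 'BEGIN { RS = "u" } { print NR, $0 }'
# prints:
# 1 a
# 2 b
# 3 rn
```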
@@ -6438,8 +6438,8 @@ Bill 555-1675 bill.drowning@@hotmail.com A
@end example
@noindent
-It contains no @samp{u} so there is no reason to split the record,
-unlike the others which have one or more occurrences of the @samp{u}.
+It contains no @samp{u}, so there is no reason to split the record,
+unlike the others, which each have one or more occurrences of the @samp{u}.
In fact, this record is treated as part of the previous record;
the newline separating them in the output
is the original newline in the @value{DF}, not the one added by
@@ -6534,7 +6534,7 @@ contains the same single character. However, when @code{RS} is a
regular expression, @code{RT} contains
the actual input text that matched the regular expression.
-If the input file ended without any text that matches @code{RS},
+If the input file ends without any text matching @code{RS},
@command{gawk} sets @code{RT} to the null string.
The following example illustrates both of these features.
@@ -6715,11 +6715,11 @@ simple @command{awk} programs so powerful.
@cindex @code{$} (dollar sign), @code{$} field operator
@cindex dollar sign (@code{$}), @code{$} field operator
@cindex field operators@comma{} dollar sign as
-You use a dollar-sign (@samp{$})
+You use a dollar sign (@samp{$})
to refer to a field in an @command{awk} program,
followed by the number of the field you want. Thus, @code{$1}
refers to the first field, @code{$2} to the second, and so on.
-(Unlike the Unix shells, the field numbers are not limited to single digits.
+(Unlike in the Unix shells, the field numbers are not limited to single digits.
@code{$127} is the 127th field in the record.)
For example, suppose the following is a line of input:
@@ -6745,7 +6745,7 @@ If you try to reference a field beyond the last
one (such as @code{$8} when the record has only seven fields), you get
the empty string. (If used in a numeric operation, you get zero.)
-The use of @code{$0}, which looks like a reference to the ``zero-th'' field, is
+The use of @code{$0}, which looks like a reference to the ``zeroth'' field, is
a special case: it represents the whole input record. Use it
when you are not interested in specific fields.
Here are some more examples:
@@ -6800,13 +6800,13 @@ awk '@{ print $(2*2) @}' mail-list
@end example
@command{awk} evaluates the expression @samp{(2*2)} and uses
-its value as the number of the field to print. The @samp{*} sign
+its value as the number of the field to print. The @samp{*}
represents multiplication, so the expression @samp{2*2} evaluates to four.
The parentheses are used so that the multiplication is done before the
@samp{$} operation; they are necessary whenever there is a binary
operator@footnote{A @dfn{binary operator}, such as @samp{*} for
multiplication, is one that takes two operands. The distinction
-is required, because @command{awk} also has unary (one-operand)
+is required because @command{awk} also has unary (one-operand)
and ternary (three-operand) operators.}
in the field-number expression. This example, then, prints the
type of relationship (the fourth field) for every line of the file
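The field-number-expression point above can be tried directly; this sketch uses made-up words in place of the manual's `mail-list` file:

```shell
# The parentheses force 2*2 to be evaluated first, so $(2*2)
# refers to the fourth field of each input line.
echo 'one two three four five' | awk '{ print $(2*2) }'
# prints: four
```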
@@ -6986,7 +6986,7 @@ rebuild @code{$0} when @code{NF} is decremented.
Finally, there are times when it is convenient to force
@command{awk} to rebuild the entire record, using the current
-value of the fields and @code{OFS}. To do this, use the
+values of the fields and @code{OFS}. To do this, use the
seemingly innocuous assignment:
@example
@@ -7015,7 +7015,7 @@ such as @code{sub()} and @code{gsub()}
It is important to remember that @code{$0} is the @emph{full}
record, exactly as it was read from the input. This includes
any leading or trailing whitespace, and the exact whitespace (or other
-characters) that separate the fields.
+characters) that separates the fields.
It is a common error to try to change the field separators
in a record simply by setting @code{FS} and @code{OFS}, and then
@@ -7040,7 +7040,7 @@ with a statement such as @samp{$1 = $1}, as described earlier.
It is important to remember that @code{$0} is the @emph{full}
record, exactly as it was read from the input. This includes
any leading or trailing whitespace, and the exact whitespace (or other
-characters) that separate the fields.
+characters) that separates the fields.
It is a common error to try to change the field separators
in a record simply by setting @code{FS} and @code{OFS}, and then
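The "seemingly innocuous assignment" mentioned in these hunks can be sketched as follows (sample colon-separated data is invented):

```shell
# Assigning a field, even to itself, forces awk to rebuild $0
# from the current field values joined with OFS, so the colons
# in the input become dashes in the output.
echo 'a:b:c' | awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }'
# prints: a-b-c
```

Without the `$1 = $1` assignment, `print` would emit the record exactly as read, colons and all.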
@@ -7134,7 +7134,7 @@ John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
@end example
@noindent
-The same program would extract @samp{@bullet{}LXIX}, instead of
+The same program would extract @samp{@bullet{}LXIX} instead of
@samp{@bullet{}29@bullet{}Oak@bullet{}St.}.
If you were expecting the program to print the
address, you would be surprised. The moral is to choose your data layout and
@@ -7395,7 +7395,7 @@ choosing your field and record separators.
@cindex Unix @command{awk}, password files@comma{} field separators and
Perhaps the most common use of a single character as the field separator
occurs when processing the Unix system password file. On many Unix
-systems, each user has a separate entry in the system password file, one
+systems, each user has a separate entry in the system password file, with one
line per user. The information in these lines is separated by colons.
The first field is the user's login name and the second is the user's
encrypted or shadow password. (A shadow password is indicated by the
@@ -7441,7 +7441,7 @@ When you do this, @code{$1} is the same as @code{$0}.
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
-after a record is read, the value of the fields (i.e., how they were split)
+after a record is read, the values of the fields (i.e., how they were split)
should reflect the old value of @code{FS}, not the new one.
@cindex dark corner, field separators
@@ -7454,10 +7454,7 @@ using the @emph{current} value of @code{FS}!
@value{DARKCORNER}
This behavior can be difficult
to diagnose. The following example illustrates the difference
-between the two methods.
-(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
-Its behavior is also defined by the POSIX standard.}
-command prints just the first line of @file{/etc/passwd}.)
+between the two methods:
@example
sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
@@ -7478,6 +7475,10 @@ prints the full first line of the file, something like:
root:x:0:0:Root:/:
@end example
+(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
+
@docbook
</sidebar>
@end docbook
@@ -7494,7 +7495,7 @@ root:x:0:0:Root:/:
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
-after a record is read, the value of the fields (i.e., how they were split)
+after a record is read, the values of the fields (i.e., how they were split)
should reflect the old value of @code{FS}, not the new one.
@cindex dark corner, field separators
@@ -7507,10 +7508,7 @@ using the @emph{current} value of @code{FS}!
@value{DARKCORNER}
This behavior can be difficult
to diagnose. The following example illustrates the difference
-between the two methods.
-(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
-Its behavior is also defined by the POSIX standard.}
-command prints just the first line of @file{/etc/passwd}.)
+between the two methods:
@example
sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
@@ -7530,6 +7528,10 @@ prints the full first line of the file, something like:
@example
root:x:0:0:Root:/:
@end example
+
+(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
@end cartouche
@end ifnotdocbook
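The dark corner described in the sidebar can be reproduced without touching `/etc/passwd`; this sketch feeds two invented colon-separated lines instead:

```shell
# FS is consulted when the *next* record is read. Changing FS
# inside the rule does not re-split the line already in $0, so
# the first line is split with the default (whitespace) FS.
printf 'a:b\nc:d\n' | awk '{ FS = ":"; print $1 }'
# prints:
# a:b
# c
```

Setting `FS` on the command line with `-F:` (or in a `BEGIN` rule) splits every record, including the first, with the intended separator.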
@@ -7741,7 +7743,7 @@ In order to tell which kind of field splitting is in effect,
use @code{PROCINFO["FS"]}
(@pxref{Auto-set}).
The value is @code{"FS"} if regular field splitting is being used,
-or it is @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
+or @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
@example
if (PROCINFO["FS"] == "FS")
@@ -7777,14 +7779,14 @@ what they are, and not by what they are not.
The most notorious such case
is so-called @dfn{comma-separated values} (CSV) data. Many spreadsheet programs,
for example, can export their data into text files, where each record is
-terminated with a newline, and fields are separated by commas. If only
-commas separated the data, there wouldn't be an issue. The problem comes when
+terminated with a newline, and fields are separated by commas. If
+commas only separated the data, there wouldn't be an issue. The problem comes when
one of the fields contains an @emph{embedded} comma.
In such cases, most programs embed the field in double quotes.@footnote{The
CSV format lacked a formal standard definition for many years.
@uref{http://www.ietf.org/rfc/rfc4180.txt, RFC 4180}
standardizes the most common practices.}
-So we might have data like this:
+So, we might have data like this:
@example
@c file eg/misc/addresses.csv
@@ -7870,8 +7872,8 @@ of cases, and the @command{gawk} developers are satisfied with that.
@end quotation
As written, the regexp used for @code{FPAT} requires that each field
-have a least one character. A straightforward modification
-(changing changed the first @samp{+} to @samp{*}) allows fields to be empty:
+contain at least one character. A straightforward modification
+(changing the first @samp{+} to @samp{*}) allows fields to be empty:
@example
FPAT = "([^,]*)|(\"[^\"]+\")"
@@ -7881,9 +7883,9 @@ Finally, the @code{patsplit()} function makes the same functionality
available for splitting regular strings (@pxref{String Functions}).
To recap, @command{gawk} provides three independent methods
-to split input records into fields. @command{gawk} uses whichever
-mechanism was last chosen based on which of the three
-variables---@code{FS}, @code{FIELDWIDTHS}, and @code{FPAT}---was
+to split input records into fields.
+The mechanism used is based on which of the three
+variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
last assigned to.
@node Multiple Line
@@ -7926,7 +7928,7 @@ at the end of the record and one or more blank lines after the record.
In addition, a regular expression always matches the longest possible
sequence when there is a choice
(@pxref{Leftmost Longest}).
-So the next record doesn't start until
+So, the next record doesn't start until
the first nonblank line that follows---no matter how many blank lines
appear in a row, they are considered one record separator.
@@ -7941,10 +7943,10 @@ In the second case, this special processing is not done.
@cindex field separator, in multiline records
@cindex @code{FS}, in multiline records
Now that the input is separated into records, the second step is to
-separate the fields in the record. One way to do this is to divide each
+separate the fields in the records. One way to do this is to divide each
of the lines into fields in the normal manner. This happens by default
as the result of a special feature. When @code{RS} is set to the empty
-string, @emph{and} @code{FS} is set to a single character,
+string @emph{and} @code{FS} is set to a single character,
the newline character @emph{always} acts as a field separator.
This is in addition to whatever field separations result from
@code{FS}.@footnote{When @code{FS} is the null string (@code{""})
@@ -7959,7 +7961,7 @@ want the newline character to separate fields, because there is no way to
prevent it. However, you can work around this by using the @code{split()}
function to break up the record manually
(@pxref{String Functions}).
-If you have a single character field separator, you can work around
+If you have a single-character field separator, you can work around
the special feature in a different way, by making @code{FS} into a
regexp for that single character. For example, if the field
separator is a percent character, instead of
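The paragraph-mode special case these hunks discuss (newline always acting as a field separator when `RS` is empty) can be sketched with invented address data:

```shell
# RS = "" puts awk in paragraph mode: records are separated by
# blank lines, and newlines inside a record always separate
# fields in addition to the default FS.
printf 'Amelia 555-5553\nanne@example.com\n\nBecky 555-7685\n' |
awk 'BEGIN { RS = "" } { print NR, $1, $NF }'
# prints:
# 1 Amelia anne@example.com
# 2 Becky 555-7685
```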
@@ -7967,10 +7969,10 @@ separator is a percent character, instead of
Another way to separate fields is to
put each field on a separate line: to do this, just set the
-variable @code{FS} to the string @code{"\n"}. (This single
-character separator matches a single newline.)
+variable @code{FS} to the string @code{"\n"}.
+(This single-character separator matches a single newline.)
A practical example of a @value{DF} organized this way might be a mailing
-list, where each entry is separated by blank lines. Consider a mailing
+list, where blank lines separate the entries. Consider a mailing
list in a file named @file{addresses}, which looks like this:
@example
@@ -8066,7 +8068,7 @@ then @command{gawk} sets @code{RT} to the null string.
@cindex input, explicit
So far we have been getting our input data from @command{awk}'s main
input stream---either the standard input (usually your keyboard, sometimes
-the output from another program) or from the
+the output from another program) or the
files specified on the command line. The @command{awk} language has a
special built-in command called @code{getline} that
can be used to read input under your explicit control.
@@ -8250,7 +8252,7 @@ free
@end example
The @code{getline} command used in this way sets only the variables
-@code{NR}, @code{FNR}, and @code{RT} (and of course, @var{var}).
+@code{NR}, @code{FNR}, and @code{RT} (and, of course, @var{var}).
The record is not
split into fields, so the values of the fields (including @code{$0}) and
the value of @code{NF} do not change.
@@ -8265,7 +8267,7 @@ the value of @code{NF} do not change.
@cindex left angle bracket (@code{<}), @code{<} operator (I/O)
@cindex operators, input/output
Use @samp{getline < @var{file}} to read the next record from @var{file}.
-Here @var{file} is a string-valued expression that
+Here, @var{file} is a string-valued expression that
specifies the @value{FN}. @samp{< @var{file}} is called a @dfn{redirection}
because it directs input to come from a different place.
For example, the following
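A self-contained sketch of the `getline var < file` redirection described here; the temporary file and its contents are invented for the example:

```shell
# Read records from an explicitly named file, independent of the
# main input stream. getline returns 1 while records remain,
# 0 at end-of-file, so the loop below reads the whole file.
tmp=$(mktemp)
printf 'first\nsecond\n' > "$tmp"
awk -v f="$tmp" 'BEGIN {
    while ((getline line < f) > 0)
        print "got:", line
    close(f)   # release the file descriptor when done
}'
rm -f "$tmp"
# prints:
# got: first
# got: second
```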
@@ -8443,7 +8445,7 @@ of a construct like @samp{@w{"echo "} "date" | getline}.
Most versions, including the current version, treat it as
@samp{@w{("echo "} "date") | getline}.
(This is also how BWK @command{awk} behaves.)
-Some versions changed and treated it as
+Some versions instead treat it as
@samp{@w{"echo "} ("date" | getline)}.
(This is how @command{mawk} behaves.)
In short, @emph{always} use explicit parentheses, and then you won't
@@ -8491,7 +8493,7 @@ program to be portable to other @command{awk} implementations.
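The parenthesization advice above can be illustrated with a trivial command; `echo hello` here stands in for any real pipeline:

```shell
# Explicit parentheses leave no doubt about how the string
# concatenation groups with "| getline": the concatenated
# command ("echo hello") runs, and its output lands in greeting.
awk 'BEGIN { ("echo " "hello") | getline greeting; print greeting }'
# prints: hello
```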
@cindex operators, input/output
@cindex differences in @command{awk} and @command{gawk}, input/output operators
-Input into @code{getline} from a pipe is a one-way operation.
+Reading input into @code{getline} from a pipe is a one-way operation.
The command that is started with @samp{@var{command} | getline} only
sends data @emph{to} your @command{awk} program.
@@ -8501,7 +8503,7 @@ for processing and then read the results back.
communications are possible. This is done with the @samp{|&}
operator.
Typically, you write data to the coprocess first and then
-read results back, as shown in the following:
+read the results back, as shown in the following:
@example
print "@var{some query}" |& "db_server"
@@ -8584,7 +8586,7 @@ also @pxref{Auto-set}.)
@item
Using @code{FILENAME} with @code{getline}
(@samp{getline < FILENAME})
-is likely to be a source for
+is likely to be a source of
confusion. @command{awk} opens a separate input stream from the
current input file. However, by not using a variable, @code{$0}
and @code{NF} are still updated. If you're doing this, it's
@@ -8592,9 +8594,15 @@ probably by accident, and you should reconsider what it is you're
trying to accomplish.
@item
-@DBREF{Getline Summary} presents a table summarizing the
+@ifdocbook
+The next section
+@end ifdocbook
+@ifnotdocbook
+@ref{Getline Summary},
+@end ifnotdocbook
+presents a table summarizing the
@code{getline} variants and which variables they can affect.
-It is worth noting that those variants which do not use redirection
+It is worth noting that those variants that do not use redirection
can cause @code{FILENAME} to be updated if they cause
@command{awk} to start reading a new input file.
@@ -8603,7 +8611,7 @@ can cause @code{FILENAME} to be updated if they cause
If the variable being assigned is an expression with side effects,
different versions of @command{awk} behave differently upon encountering
end-of-file. Some versions don't evaluate the expression; many versions
-(including @command{gawk}) do. Here is an example, due to Duncan Moore:
+(including @command{gawk}) do. Here is an example, courtesy of Duncan Moore:
@ignore
Date: Sun, 01 Apr 2012 11:49:33 +0100
@@ -8620,7 +8628,7 @@ BEGIN @{
@noindent
Here, the side effect is the @samp{++c}. Is @code{c} incremented if
-end of file is encountered, before the element in @code{a} is assigned?
+end-of-file is encountered before the element in @code{a} is assigned?
@command{gawk} treats @code{getline} like a function call, and evaluates
the expression @samp{a[++c]} before attempting to read from @file{f}.
@@ -8662,8 +8670,8 @@ This @value{SECTION} describes a feature that is specific to @command{gawk}.
You may specify a timeout in milliseconds for reading input from the keyboard,
a pipe, or two-way communication, including TCP/IP sockets. This can be done
-on a per input, command, or connection basis, by setting a special element
-in the @code{PROCINFO} array (@pxref{Auto-set}):
+on a per-input, per-command, or per-connection basis, by setting a special
+element in the @code{PROCINFO} array (@pxref{Auto-set}):
@example
PROCINFO["input_name", "READ_TIMEOUT"] = @var{timeout in milliseconds}
@@ -8694,7 +8702,7 @@ while ((getline < "/dev/stdin") > 0)
@end example
@command{gawk} terminates the read operation if input does not
-arrive after waiting for the timeout period, returns failure
+arrive after waiting for the timeout period, returns failure,
and sets @code{ERRNO} to an appropriate string value.
A negative or zero value for the timeout is the same as specifying
no timeout at all.
@@ -8744,7 +8752,7 @@ If the @code{PROCINFO} element is not present and the
@command{gawk} uses its value to initialize the timeout value.
The exclusive use of the environment variable to specify timeout
has the disadvantage of not being able to control it
-on a per command or connection basis.
+on a per-command or per-connection basis.
@command{gawk} considers a timeout event to be an error even though
the attempt to read from the underlying device may
@@ -8810,7 +8818,7 @@ The possibilities are as follows:
@item
After splitting the input into records, @command{awk} further splits
-the record into individual fields, named @code{$1}, @code{$2}, and so
+the records into individual fields, named @code{$1}, @code{$2}, and so
on. @code{$0} is the whole record, and @code{NF} indicates how many
fields there are. The default way to split fields is between whitespace
characters.
@@ -8826,12 +8834,12 @@ thing. Decrementing @code{NF} throws away fields and rebuilds the record.
@item
Field splitting is more complicated than record splitting:
-@multitable @columnfractions .40 .45 .15
+@multitable @columnfractions .40 .40 .20
@headitem Field separator value @tab Fields are split @dots{} @tab @command{awk} / @command{gawk}
@item @code{FS == " "} @tab On runs of whitespace @tab @command{awk}
@item @code{FS == @var{any single character}} @tab On that character @tab @command{awk}
@item @code{FS == @var{regexp}} @tab On text matching the regexp @tab @command{awk}
-@item @code{FS == ""} @tab Each individual character is a separate field @tab @command{gawk}
+@item @code{FS == ""} @tab Such that each individual character is a separate field @tab @command{gawk}
@item @code{FIELDWIDTHS == @var{list of columns}} @tab Based on character position @tab @command{gawk}
@item @code{FPAT == @var{regexp}} @tab On the text surrounding text matching the regexp @tab @command{gawk}
@end multitable
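The portable rows of the table above can be sketched from the shell (a POSIX awk is assumed; the `gawk`-only cases such as `FS == ""` are omitted so the examples stay portable):

```shell
# Default FS (" "): fields are split on runs of whitespace.
echo 'a   b  c' | awk '{ print NF }'

# Single-character FS: split on each ":".
echo 'root:x:0' | awk -F: '{ print $1, $3 }'

# Regexp FS: split on runs of commas or semicolons.
echo 'a,b;;c' | awk -F'[,;]+' '{ print NF }'
```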
@@ -8848,11 +8856,11 @@ This can also be done using command-line variable assignment.
Use @code{PROCINFO["FS"]} to see how fields are being split.
@item
-Use @code{getline} in its various forms to read additional records,
+Use @code{getline} in its various forms to read additional records
from the default input stream, from a file, or from a pipe or coprocess.
@item
-Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to timeout
+Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to time out
for @var{file}.
@item
@@ -8961,7 +8969,7 @@ space is printed between any two items.
Note that the @code{print} statement is a statement and not an
expression---you can't use it in the pattern part of a
-@var{pattern}-@var{action} statement, for example.
+pattern--action statement, for example.
@node Print Examples
@section @code{print} Statement Examples
@@ -9152,7 +9160,7 @@ runs together on a single line.
@cindex numeric, output format
@cindex formats@comma{} numeric output
When printing numeric values with the @code{print} statement,
-@command{awk} internally converts the number to a string of characters
+@command{awk} internally converts each number to a string of characters
and prints that string. @command{awk} uses the @code{sprintf()} function
to do this conversion
(@pxref{String Functions}).
@@ -9223,7 +9231,7 @@ printf @var{format}, @var{item1}, @var{item2}, @dots{}
@noindent
As for @code{print}, the entire list of arguments may optionally be
enclosed in parentheses. Here too, the parentheses are necessary if any
-of the item expressions use the @samp{>} relational operator; otherwise,
+of the item expressions uses the @samp{>} relational operator; otherwise,
it can be confused with an output redirection (@pxref{Redirection}).
@cindex format specifiers
@@ -9254,7 +9262,7 @@ $ @kbd{awk 'BEGIN @{}
@end example
@noindent
-Here, neither the @samp{+} nor the @samp{OUCH!} appear in
+Here, neither the @samp{+} nor the @samp{OUCH!} appears in
the output message.
@node Control Letters
@@ -9301,8 +9309,8 @@ The two control letters are equivalent.
(The @samp{%i} specification is for compatibility with ISO C.)
@item @code{%e}, @code{%E}
-Print a number in scientific (exponential) notation;
-for example:
+Print a number in scientific (exponential) notation.
+For example:
@example
printf "%4.3e\n", 1950
@@ -9339,7 +9347,7 @@ The special ``not a number'' value formats as @samp{-nan} or @samp{nan}
(@pxref{Math Definitions}).
@item @code{%F}
-Like @samp{%f} but the infinity and ``not a number'' values are spelled
+Like @samp{%f}, but the infinity and ``not a number'' values are spelled
using uppercase letters.
The @samp{%F} format is a POSIX extension to ISO C; not all systems
@@ -9583,7 +9591,7 @@ printf "%" w "." p "s\n", s
@end example
@noindent
-This is not particularly easy to read but it does work.
+This is not particularly easy to read, but it does work.
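A runnable sketch of the dynamic-format trick discussed here (assuming a POSIX awk; the width, precision, and string values are arbitrary):

```shell
# Build the printf format string at runtime from the width
# and precision variables w and p, yielding "%10.4s".
awk 'BEGIN { w = 10; p = 4; s = "abcdefg"
             printf "%" w "." p "s\n", s }'
```

The result is the first four characters of `s`, right-justified in a ten-character field.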
@c @cindex lint checks
@cindex troubleshooting, fatal errors, @code{printf} format strings
@@ -9629,7 +9637,7 @@ $ @kbd{awk '@{ printf "%-10s %s\n", $1, $2 @}' mail-list}
@end example
In this case, the phone numbers had to be printed as strings because
-the numbers are separated by a dash. Printing the phone numbers as
+the numbers are separated by dashes. Printing the phone numbers as
numbers would have produced just the first three digits: @samp{555}.
This would have been pretty confusing.
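The same point can be tried directly (a POSIX awk is assumed, and the two records here are made-up stand-ins for the `mail-list` data):

```shell
# Left-justify the name in a 10-character field with %-10s,
# and print the dashed phone number as a string.
printf 'Amelia 555-5553\nBill 555-1675\n' |
awk '{ printf "%-10s %s\n", $1, $2 }'
```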
@@ -9689,7 +9697,7 @@ This is called @dfn{redirection}.
@quotation NOTE
When @option{--sandbox} is specified (@pxref{Options}),
-redirecting output to files, pipes and coprocesses is disabled.
+redirecting output to files, pipes, and coprocesses is disabled.
@end quotation
A redirection appears after the @code{print} or @code{printf} statement.
@@ -9742,7 +9750,7 @@ Each output file contains one name or number per line.
@cindex @code{>} (right angle bracket), @code{>>} operator (I/O)
@cindex right angle bracket (@code{>}), @code{>>} operator (I/O)
@item print @var{items} >> @var{output-file}
-This redirection prints the items into the pre-existing output file
+This redirection prints the items into the preexisting output file
named @var{output-file}. The difference between this and the
single-@samp{>} redirection is that the old contents (if any) of
@var{output-file} are not erased. Instead, the @command{awk} output is
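A small demonstration of the appending behavior (assuming a POSIX shell and awk; the temporary file and its contents are arbitrary):

```shell
# Create a file with existing content, then append to it
# from awk with ">>"; the old contents survive.
f=$(mktemp)
echo 'old line' > "$f"
echo 'new line' | awk -v out="$f" '{ print $0 >> out }'
cat "$f"
rm -f "$f"
```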
@@ -9781,7 +9789,7 @@ The unsorted list is written with an ordinary redirection, while
the sorted list is written by piping through the @command{sort} utility.
The next example uses redirection to mail a message to the mailing
-list @samp{bug-system}. This might be useful when trouble is encountered
+list @code{bug-system}. This might be useful when trouble is encountered
in an @command{awk} script run periodically for system maintenance:
@example
@@ -9812,15 +9820,23 @@ This redirection prints the items to the input of @var{command}.
The difference between this and the
single-@samp{|} redirection is that the output from @var{command}
can be read with @code{getline}.
-Thus @var{command} is a @dfn{coprocess}, which works together with,
-but subsidiary to, the @command{awk} program.
+Thus, @var{command} is a @dfn{coprocess}, which works together with
+but is subsidiary to the @command{awk} program.
This feature is a @command{gawk} extension, and is not available in
POSIX @command{awk}.
-@DBXREF{Getline/Coprocess}
+@ifnotdocbook
+@xref{Getline/Coprocess},
for a brief discussion.
-@DBXREF{Two-way I/O}
+@xref{Two-way I/O},
+for a more complete discussion.
+@end ifnotdocbook
+@ifdocbook
+@DBXREF{Getline/Coprocess}
+for a brief discussion and
+@DBREF{Two-way I/O}
for a more complete discussion.
+@end ifdocbook
@end table
Redirecting output using @samp{>}, @samp{>>}, @samp{|}, or @samp{|&}
@@ -9845,7 +9861,7 @@ This is indeed how redirections must be used from the shell. But in
@command{awk}, it isn't necessary. In this kind of case, a program should
use @samp{>} for all the @code{print} statements, because the output file
is only opened once. (It happens that if you mix @samp{>} and @samp{>>}
-that output is produced in the expected order. However, mixing the operators
+output is produced in the expected order. However, mixing the operators
for the same file is definitely poor style, and is confusing to readers
of your program.)
@@ -9938,7 +9954,7 @@ command lines to be fed to the shell.
@end ifnotdocbook
@node Special FD
-@section Special Files for Standard Pre-Opened Data Streams
+@section Special Files for Standard Preopened Data Streams
@cindex standard input
@cindex input, standard
@cindex standard output
@@ -9951,7 +9967,7 @@ command lines to be fed to the shell.
Running programs conventionally have three input and output streams
already available to them for reading and writing. These are known
as the @dfn{standard input}, @dfn{standard output}, and @dfn{standard
-error output}. These open streams (and any other open file or pipe)
+error output}. These open streams (and any other open files or pipes)
are often referred to by the technical term @dfn{file descriptors}.
These streams are, by default, connected to your keyboard and screen, but
@@ -9989,7 +10005,7 @@ that is connected to your keyboard and screen. It represents the
``terminal,''@footnote{The ``tty'' in @file{/dev/tty} stands for
``Teletype,'' a serial terminal.} which on modern systems is a keyboard
and screen, not a serial console.)
-This generally has the same effect but not always: although the
+This generally has the same effect, but not always: although the
standard error stream is usually the screen, it can be redirected; when
that happens, writing to the screen is not correct. In fact, if
@command{awk} is run from a background job, it may not have a
@@ -10034,7 +10050,7 @@ print "Serious error detected!" > "/dev/stderr"
@cindex troubleshooting, quotes with file names
Note the use of quotes around the @value{FN}.
-Like any other redirection, the value must be a string.
+Like with any other redirection, the value must be a string.
It is a common error to omit the quotes, which leads
to confusing results.
@@ -10060,7 +10076,7 @@ TCP/IP networking.
@end menu
@node Other Inherited Files
-@subsection Accessing Other Open Files With @command{gawk}
+@subsection Accessing Other Open Files with @command{gawk}
Besides the @code{/dev/stdin}, @code{/dev/stdout}, and @code{/dev/stderr}
special @value{FN}s mentioned earlier, @command{gawk} provides syntax
@@ -10117,7 +10133,7 @@ special @value{FN}s that @command{gawk} provides:
@cindex compatibility mode (@command{gawk}), file names
@cindex file names, in compatibility mode
@item
-Recognition of the @value{FN}s for the three standard pre-opened
+Recognition of the @value{FN}s for the three standard preopened
files is disabled only in POSIX mode.
@item
@@ -10130,7 +10146,7 @@ compatibility mode (either @option{--traditional} or @option{--posix};
interprets these special @value{FN}s.
For example, using @samp{/dev/fd/4}
for output actually writes on file descriptor 4, and not on a new
-file descriptor that is @code{dup()}'ed from file descriptor 4. Most of
+file descriptor that is @code{dup()}ed from file descriptor 4. Most of
the time this does not matter; however, it is important to @emph{not}
close any of the files related to file descriptors 0, 1, and 2.
Doing so results in unpredictable behavior.
@@ -10352,9 +10368,9 @@ This value is zero if the close succeeds, or @minus{}1 if
it fails.
The POSIX standard is very vague; it says that @code{close()}
-returns zero on success and nonzero otherwise. In general,
+returns zero on success and a nonzero value otherwise. In general,
different implementations vary in what they report when closing
-pipes; thus the return value cannot be used portably.
+pipes; thus, the return value cannot be used portably.
@value{DARKCORNER}
In POSIX mode (@pxref{Options}), @command{gawk} just returns zero
when closing a pipe.
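The file case, at least, is portable and can be checked directly (a POSIX awk is assumed; as the text warns, the return value for pipes varies between implementations, so only a file is closed here):

```shell
# Closing a successfully written file returns zero.
f=$(mktemp)
awk -v f="$f" 'BEGIN { print "data" > f
                       print "close returned:", close(f) }'
rm -f "$f"
```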
@@ -10409,9 +10425,9 @@ This value is zero if the close succeeds, or @minus{}1 if
it fails.
The POSIX standard is very vague; it says that @code{close()}
-returns zero on success and nonzero otherwise. In general,
+returns zero on success and a nonzero value otherwise. In general,
different implementations vary in what they report when closing
-pipes; thus the return value cannot be used portably.
+pipes; thus, the return value cannot be used portably.
@value{DARKCORNER}
In POSIX mode (@pxref{Options}), @command{gawk} just returns zero
when closing a pipe.
@@ -10431,8 +10447,8 @@ for numeric values for the @code{print} statement.
@item
The @code{printf} statement provides finer-grained control over output,
-with format control letters for different data types and various flags
-that modify the behavior of the format control letters.
+with format-control letters for different data types and various flags
+that modify the behavior of the format-control letters.
@item
Output from both @code{print} and @code{printf} may be redirected to
@@ -38402,7 +38418,7 @@ To get @command{awka}, go to @url{http://sourceforge.net/projects/awka}.
@c andrewsumner@@yahoo.net
The project seems to be frozen; no new code changes have been made
-since approximately 2003.
+since approximately 2001.
@cindex Beebe, Nelson H.F.@:
@cindex @command{pawk} (profiling version of Brian Kernighan's @command{awk})
@@ -38680,7 +38696,7 @@ for information on getting the latest version of @command{gawk}.)
@item
@ifnotinfo
-Follow the @uref{http://www.gnu.org/prep/standards/, @cite{GNU Coding Standards}}.
+Follow the @cite{GNU Coding Standards}.
@end ifnotinfo
@ifinfo
See @inforef{Top, , Version, standards, GNU Coding Standards}.
@@ -38689,7 +38705,7 @@ This document describes how GNU software should be written. If you haven't
read it, please do so, preferably @emph{before} starting to modify @command{gawk}.
(The @cite{GNU Coding Standards} are available from
the GNU Project's
-@uref{http://www.gnu.org/prep/standards_toc.html, website}.
+@uref{http://www.gnu.org/prep/standards/, website}.
Texinfo, Info, and DVI versions are also available.)
@cindex @command{gawk}, coding style in
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index 34c47270..61eca284 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -5546,11 +5546,11 @@ and numeric characters in your character set.
@c Date: Tue, 01 Jul 2014 07:39:51 +0200
@c From: Hermann Peifer <peifer@gmx.eu>
Some utilities that match regular expressions provide a nonstandard
-@code{[:ascii:]} character class; @command{awk} does not. However, you
-can simulate such a construct using @code{[\x00-\x7F]}. This matches
+@samp{[:ascii:]} character class; @command{awk} does not. However, you
+can simulate such a construct using @samp{[\x00-\x7F]}. This matches
all values numerically between zero and 127, which is the defined
range of the ASCII character set. Use a complemented character list
-(@code{[^\x00-\x7F]}) to match any single-byte characters that are not
+(@samp{[^\x00-\x7F]}) to match any single-byte characters that are not
in the ASCII range.
@cindex bracket expressions, collating elements
@@ -5579,8 +5579,8 @@ Locale-specific names for a list of
characters that are equal. The name is enclosed between
@samp{[=} and @samp{=]}.
For example, the name @samp{e} might be used to represent all of
-``e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
-that matches any of @samp{e}, @samp{@'e}, or @samp{@`e}.
+``e,'' ``@^e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
+that matches any of @samp{e}, @samp{@^e}, @samp{@'e}, or @samp{@`e}.
@end table
These features are very valuable in non-English-speaking locales.
@@ -5609,7 +5609,7 @@ echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
This example uses the @code{sub()} function to make a change to the input
record. (@code{sub()} replaces the first instance of any text matched
by the first argument with the string provided as the second argument;
-@pxref{String Functions}). Here, the regexp @code{/a+/} indicates ``one
+@pxref{String Functions}.) Here, the regexp @code{/a+/} indicates ``one
or more @samp{a} characters,'' and the replacement text is @samp{<A>}.
The input contains four @samp{a} characters.
@@ -5663,14 +5663,14 @@ and tests whether the input record matches this regexp.
@quotation NOTE
When using the @samp{~} and @samp{!~}
-operators, there is a difference between a regexp constant
+operators, be aware that there is a difference between a regexp constant
enclosed in slashes and a string constant enclosed in double quotes.
If you are going to use a string constant, you have to understand that
the string is, in essence, scanned @emph{twice}: the first time when
@command{awk} reads your program, and the second time when it goes to
match the string on the lefthand side of the operator with the pattern
on the right. This is true of any string-valued expression (such as
-@code{digits_regexp}, shown previously), not just string constants.
+@code{digits_regexp}, shown in the previous example), not just string constants.
@end quotation
@cindex regexp constants, slashes vs.@: quotes
@@ -5826,7 +5826,7 @@ matches either @samp{ball} or @samp{balls}, as a separate word.
@item \B
Matches the empty string that occurs between two
word-constituent characters. For example,
-@code{/\Brat\B/} matches @samp{crate} but it does not match @samp{dirty rat}.
+@code{/\Brat\B/} matches @samp{crate}, but it does not match @samp{dirty rat}.
@samp{\B} is essentially the opposite of @samp{\y}.
@end table
@@ -5845,14 +5845,14 @@ The operators are:
@cindex backslash (@code{\}), @code{\`} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\`} operator (@command{gawk})
Matches the empty string at the
-beginning of a buffer (string).
+beginning of a buffer (string)
@c @cindex operators, @code{\'} (@command{gawk})
@cindex backslash (@code{\}), @code{\'} operator (@command{gawk})
@cindex @code{\} (backslash), @code{\'} operator (@command{gawk})
@item \'
Matches the empty string at the
-end of a buffer (string).
+end of a buffer (string)
@end table
@cindex @code{^} (caret), regexp operator
@@ -6085,7 +6085,7 @@ This makes it more convenient for programs to work on the parts of a record.
@cindex @code{getline} command
On rare occasions, you may need to use the @code{getline} command.
-The @code{getline} command is valuable, both because it
+The @code{getline} command is valuable both because it
can do explicit input from any number of files, and because the files
used with it do not have to be named on the @command{awk} command line
(@pxref{Getline}).
@@ -6136,8 +6136,8 @@ never automatically reset to zero.
Records are separated by a character called the @dfn{record separator}.
By default, the record separator is the newline character.
This is why records are, by default, single lines.
-A different character can be used for the record separator by
-assigning the character to the predefined variable @code{RS}.
+To use a different character for the record separator,
+simply assign that character to the predefined variable @code{RS}.
@cindex newlines, as record separators
@cindex @code{RS} variable
@@ -6160,8 +6160,8 @@ awk 'BEGIN @{ RS = "u" @}
@noindent
changes the value of @code{RS} to @samp{u}, before reading any input.
-This is a string whose first character is the letter ``u''; as a result, records
-are separated by the letter ``u.'' Then the input file is read, and the second
+The new value is a string whose first character is the letter ``u''; as a result, records
+are separated by the letter ``u''. Then the input file is read, and the second
rule in the @command{awk} program (the action with no pattern) prints each
record. Because each @code{print} statement adds a newline at the end of
its output, this @command{awk} program copies the input
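A compressed version of the `RS = "u"` experiment described here (a POSIX awk is assumed; the input word is arbitrary):

```shell
# With RS set to "u", records are separated by the letter "u".
# The second record keeps the input's trailing newline, and
# print adds one more, producing a blank line after "OUGH".
printf 'THRuOUGH\n' | awk 'BEGIN { RS = "u" } { print }'
```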
@@ -6222,8 +6222,8 @@ Bill 555-1675 bill.drowning@@hotmail.com A
@end example
@noindent
-It contains no @samp{u} so there is no reason to split the record,
-unlike the others which have one or more occurrences of the @samp{u}.
+It contains no @samp{u}, so there is no reason to split the record,
+unlike the others, which each have one or more occurrences of the @samp{u}.
In fact, this record is treated as part of the previous record;
the newline separating them in the output
is the original newline in the @value{DF}, not the one added by
@@ -6318,7 +6318,7 @@ contains the same single character. However, when @code{RS} is a
regular expression, @code{RT} contains
the actual input text that matched the regular expression.
-If the input file ended without any text that matches @code{RS},
+If the input file ends without any text matching @code{RS},
@command{gawk} sets @code{RT} to the null string.
The following example illustrates both of these features.
@@ -6442,11 +6442,11 @@ simple @command{awk} programs so powerful.
@cindex @code{$} (dollar sign), @code{$} field operator
@cindex dollar sign (@code{$}), @code{$} field operator
@cindex field operators@comma{} dollar sign as
-You use a dollar-sign (@samp{$})
+You use a dollar sign (@samp{$})
to refer to a field in an @command{awk} program,
followed by the number of the field you want. Thus, @code{$1}
refers to the first field, @code{$2} to the second, and so on.
-(Unlike the Unix shells, the field numbers are not limited to single digits.
+(Unlike in the Unix shells, the field numbers are not limited to single digits.
@code{$127} is the 127th field in the record.)
For example, suppose the following is a line of input:
@@ -6472,7 +6472,7 @@ If you try to reference a field beyond the last
one (such as @code{$8} when the record has only seven fields), you get
the empty string. (If used in a numeric operation, you get zero.)
-The use of @code{$0}, which looks like a reference to the ``zero-th'' field, is
+The use of @code{$0}, which looks like a reference to the ``zeroth'' field, is
a special case: it represents the whole input record. Use it
when you are not interested in specific fields.
Here are some more examples:
@@ -6527,13 +6527,13 @@ awk '@{ print $(2*2) @}' mail-list
@end example
@command{awk} evaluates the expression @samp{(2*2)} and uses
-its value as the number of the field to print. The @samp{*} sign
+its value as the number of the field to print. The @samp{*}
represents multiplication, so the expression @samp{2*2} evaluates to four.
The parentheses are used so that the multiplication is done before the
@samp{$} operation; they are necessary whenever there is a binary
operator@footnote{A @dfn{binary operator}, such as @samp{*} for
multiplication, is one that takes two operands. The distinction
-is required, because @command{awk} also has unary (one-operand)
+is required because @command{awk} also has unary (one-operand)
and ternary (three-operand) operators.}
in the field-number expression. This example, then, prints the
type of relationship (the fourth field) for every line of the file
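The field-number expression can be tried on a one-line stand-in for the data file (a POSIX awk is assumed):

```shell
# The parentheses force the multiplication to happen first,
# so awk prints field number 4.
echo 'one two three four five' | awk '{ print $(2*2) }'
```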
@@ -6713,7 +6713,7 @@ rebuild @code{$0} when @code{NF} is decremented.
Finally, there are times when it is convenient to force
@command{awk} to rebuild the entire record, using the current
-value of the fields and @code{OFS}. To do this, use the
+values of the fields and @code{OFS}. To do this, use the
seemingly innocuous assignment:
@example
@@ -6737,7 +6737,7 @@ such as @code{sub()} and @code{gsub()}
It is important to remember that @code{$0} is the @emph{full}
record, exactly as it was read from the input. This includes
any leading or trailing whitespace, and the exact whitespace (or other
-characters) that separate the fields.
+characters) that separates the fields.
It is a common error to try to change the field separators
in a record simply by setting @code{FS} and @code{OFS}, and then
@@ -6830,7 +6830,7 @@ John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
@end example
@noindent
-The same program would extract @samp{@bullet{}LXIX}, instead of
+The same program would extract @samp{@bullet{}LXIX} instead of
@samp{@bullet{}29@bullet{}Oak@bullet{}St.}.
If you were expecting the program to print the
address, you would be surprised. The moral is to choose your data layout and
@@ -7091,7 +7091,7 @@ choosing your field and record separators.
@cindex Unix @command{awk}, password files@comma{} field separators and
Perhaps the most common use of a single character as the field separator
occurs when processing the Unix system password file. On many Unix
-systems, each user has a separate entry in the system password file, one
+systems, each user has a separate entry in the system password file, with one
line per user. The information in these lines is separated by colons.
The first field is the user's login name and the second is the user's
encrypted or shadow password. (A shadow password is indicated by the
@@ -7132,7 +7132,7 @@ When you do this, @code{$1} is the same as @code{$0}.
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
-after a record is read, the value of the fields (i.e., how they were split)
+after a record is read, the values of the fields (i.e., how they were split)
should reflect the old value of @code{FS}, not the new one.
@cindex dark corner, field separators
@@ -7145,10 +7145,7 @@ using the @emph{current} value of @code{FS}!
@value{DARKCORNER}
This behavior can be difficult
to diagnose. The following example illustrates the difference
-between the two methods.
-(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
-Its behavior is also defined by the POSIX standard.}
-command prints just the first line of @file{/etc/passwd}.)
+between the two methods:
@example
sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
@@ -7168,6 +7165,10 @@ prints the full first line of the file, something like:
@example
root:x:0:0:Root:/:
@end example
+
+(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
@end sidebar
@node Field Splitting Summary
@@ -7342,7 +7343,7 @@ In order to tell which kind of field splitting is in effect,
use @code{PROCINFO["FS"]}
(@pxref{Auto-set}).
The value is @code{"FS"} if regular field splitting is being used,
-or it is @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
+or @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
@example
if (PROCINFO["FS"] == "FS")
@@ -7378,14 +7379,14 @@ what they are, and not by what they are not.
The most notorious such case
is so-called @dfn{comma-separated values} (CSV) data. Many spreadsheet programs,
for example, can export their data into text files, where each record is
-terminated with a newline, and fields are separated by commas. If only
-commas separated the data, there wouldn't be an issue. The problem comes when
+terminated with a newline, and fields are separated by commas. If
+commas only separated the data, there wouldn't be an issue. The problem comes when
one of the fields contains an @emph{embedded} comma.
In such cases, most programs embed the field in double quotes.@footnote{The
CSV format lacked a formal standard definition for many years.
@uref{http://www.ietf.org/rfc/rfc4180.txt, RFC 4180}
standardizes the most common practices.}
-So we might have data like this:
+So, we might have data like this:
@example
@c file eg/misc/addresses.csv
@@ -7471,8 +7472,8 @@ of cases, and the @command{gawk} developers are satisfied with that.
@end quotation
As written, the regexp used for @code{FPAT} requires that each field
-have a least one character. A straightforward modification
-(changing changed the first @samp{+} to @samp{*}) allows fields to be empty:
+contain at least one character. A straightforward modification
+(changing the first @samp{+} to @samp{*}) allows fields to be empty:
@example
FPAT = "([^,]*)|(\"[^\"]+\")"
@@ -7482,9 +7483,9 @@ Finally, the @code{patsplit()} function makes the same functionality
available for splitting regular strings (@pxref{String Functions}).
To recap, @command{gawk} provides three independent methods
-to split input records into fields. @command{gawk} uses whichever
-mechanism was last chosen based on which of the three
-variables---@code{FS}, @code{FIELDWIDTHS}, and @code{FPAT}---was
+to split input records into fields.
+The mechanism used is based on which of the three
+variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
last assigned to.
@node Multiple Line
@@ -7527,7 +7528,7 @@ at the end of the record and one or more blank lines after the record.
In addition, a regular expression always matches the longest possible
sequence when there is a choice
(@pxref{Leftmost Longest}).
-So the next record doesn't start until
+So, the next record doesn't start until
the first nonblank line that follows---no matter how many blank lines
appear in a row, they are considered one record separator.
@@ -7542,10 +7543,10 @@ In the second case, this special processing is not done.
@cindex field separator, in multiline records
@cindex @code{FS}, in multiline records
Now that the input is separated into records, the second step is to
-separate the fields in the record. One way to do this is to divide each
+separate the fields in the records. One way to do this is to divide each
of the lines into fields in the normal manner. This happens by default
as the result of a special feature. When @code{RS} is set to the empty
-string, @emph{and} @code{FS} is set to a single character,
+string @emph{and} @code{FS} is set to a single character,
the newline character @emph{always} acts as a field separator.
This is in addition to whatever field separations result from
@code{FS}.@footnote{When @code{FS} is the null string (@code{""})
@@ -7560,7 +7561,7 @@ want the newline character to separate fields, because there is no way to
prevent it. However, you can work around this by using the @code{split()}
function to break up the record manually
(@pxref{String Functions}).
-If you have a single character field separator, you can work around
+If you have a single-character field separator, you can work around
the special feature in a different way, by making @code{FS} into a
regexp for that single character. For example, if the field
separator is a percent character, instead of
@@ -7568,10 +7569,10 @@ separator is a percent character, instead of
Another way to separate fields is to
put each field on a separate line: to do this, just set the
-variable @code{FS} to the string @code{"\n"}. (This single
-character separator matches a single newline.)
+variable @code{FS} to the string @code{"\n"}.
+(This single-character separator matches a single newline.)
A practical example of a @value{DF} organized this way might be a mailing
-list, where each entry is separated by blank lines. Consider a mailing
+list, where blank lines separate the entries. Consider a mailing
list in a file named @file{addresses}, which looks like this:
@example
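The paragraph-mode splitting described in this hunk can be sketched with any POSIX awk. The file name and addresses below are invented for illustration; they are not the manual's `addresses` data:

```shell
# Hypothetical data file: records separated by blank lines, one field per line.
cat > /tmp/addresses.demo <<'EOF'
Jane Doe
123 Main Street
Anytown

John Smith
456 Oak Avenue
Othertown
EOF
# RS = "" selects paragraph mode; FS = "\n" makes each line a separate field.
awk 'BEGIN { RS = ""; FS = "\n" }
     { print "record", NR, "has", NF, "fields:", $1 }' /tmp/addresses.demo
```

Each blank-line-separated entry becomes one record, and each line within it one field, which is exactly the mailing-list layout the text goes on to describe.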
@@ -7667,7 +7668,7 @@ then @command{gawk} sets @code{RT} to the null string.
@cindex input, explicit
So far we have been getting our input data from @command{awk}'s main
input stream---either the standard input (usually your keyboard, sometimes
-the output from another program) or from the
+the output from another program) or the
files specified on the command line. The @command{awk} language has a
special built-in command called @code{getline} that
can be used to read input under your explicit control.
@@ -7851,7 +7852,7 @@ free
@end example
The @code{getline} command used in this way sets only the variables
-@code{NR}, @code{FNR}, and @code{RT} (and of course, @var{var}).
+@code{NR}, @code{FNR}, and @code{RT} (and, of course, @var{var}).
The record is not
split into fields, so the values of the fields (including @code{$0}) and
the value of @code{NF} do not change.
@@ -7866,7 +7867,7 @@ the value of @code{NF} do not change.
@cindex left angle bracket (@code{<}), @code{<} operator (I/O)
@cindex operators, input/output
Use @samp{getline < @var{file}} to read the next record from @var{file}.
-Here @var{file} is a string-valued expression that
+Here, @var{file} is a string-valued expression that
specifies the @value{FN}. @samp{< @var{file}} is called a @dfn{redirection}
because it directs input to come from a different place.
For example, the following
@@ -8044,7 +8045,7 @@ of a construct like @samp{@w{"echo "} "date" | getline}.
Most versions, including the current version, treat it as
@samp{@w{("echo "} "date") | getline}.
(This is also how BWK @command{awk} behaves.)
-Some versions changed and treated it as
+Some versions instead treat it as
@samp{@w{"echo "} ("date" | getline)}.
(This is how @command{mawk} behaves.)
In short, @emph{always} use explicit parentheses, and then you won't
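The parenthesization advice in this hunk can be demonstrated directly; this sketch builds the command by string concatenation, then parenthesizes explicitly so every awk implementation parses it the same way:

```shell
awk 'BEGIN {
    # "echo " "hello" concatenates into the single command "echo hello".
    cmd = "echo " "hello"
    # Explicit parentheses: run cmd as a pipe and read one record from it.
    if ((cmd | getline result) > 0)
        print "got:", result
    close(cmd)
}'
```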
@@ -8092,7 +8093,7 @@ program to be portable to other @command{awk} implementations.
@cindex operators, input/output
@cindex differences in @command{awk} and @command{gawk}, input/output operators
-Input into @code{getline} from a pipe is a one-way operation.
+Reading input into @code{getline} from a pipe is a one-way operation.
The command that is started with @samp{@var{command} | getline} only
sends data @emph{to} your @command{awk} program.
@@ -8102,7 +8103,7 @@ for processing and then read the results back.
communications are possible. This is done with the @samp{|&}
operator.
Typically, you write data to the coprocess first and then
-read results back, as shown in the following:
+read the results back, as shown in the following:
@example
print "@var{some query}" |& "db_server"
@@ -8185,7 +8186,7 @@ also @pxref{Auto-set}.)
@item
Using @code{FILENAME} with @code{getline}
(@samp{getline < FILENAME})
-is likely to be a source for
+is likely to be a source of
confusion. @command{awk} opens a separate input stream from the
current input file. However, by not using a variable, @code{$0}
and @code{NF} are still updated. If you're doing this, it's
@@ -8193,9 +8194,15 @@ probably by accident, and you should reconsider what it is you're
trying to accomplish.
@item
-@DBREF{Getline Summary} presents a table summarizing the
+@ifdocbook
+The next section
+@end ifdocbook
+@ifnotdocbook
+@ref{Getline Summary},
+@end ifnotdocbook
+presents a table summarizing the
@code{getline} variants and which variables they can affect.
-It is worth noting that those variants which do not use redirection
+It is worth noting that those variants that do not use redirection
can cause @code{FILENAME} to be updated if they cause
@command{awk} to start reading a new input file.
@@ -8204,7 +8211,7 @@ can cause @code{FILENAME} to be updated if they cause
If the variable being assigned is an expression with side effects,
different versions of @command{awk} behave differently upon encountering
end-of-file. Some versions don't evaluate the expression; many versions
-(including @command{gawk}) do. Here is an example, due to Duncan Moore:
+(including @command{gawk}) do. Here is an example, courtesy of Duncan Moore:
@ignore
Date: Sun, 01 Apr 2012 11:49:33 +0100
@@ -8221,7 +8228,7 @@ BEGIN @{
@noindent
Here, the side effect is the @samp{++c}. Is @code{c} incremented if
-end of file is encountered, before the element in @code{a} is assigned?
+end-of-file is encountered before the element in @code{a} is assigned?
@command{gawk} treats @code{getline} like a function call, and evaluates
the expression @samp{a[++c]} before attempting to read from @file{f}.
@@ -8263,8 +8270,8 @@ This @value{SECTION} describes a feature that is specific to @command{gawk}.
You may specify a timeout in milliseconds for reading input from the keyboard,
a pipe, or two-way communication, including TCP/IP sockets. This can be done
-on a per input, command, or connection basis, by setting a special element
-in the @code{PROCINFO} array (@pxref{Auto-set}):
+on a per-input, per-command, or per-connection basis, by setting a special
+element in the @code{PROCINFO} array (@pxref{Auto-set}):
@example
PROCINFO["input_name", "READ_TIMEOUT"] = @var{timeout in milliseconds}
@@ -8295,7 +8302,7 @@ while ((getline < "/dev/stdin") > 0)
@end example
@command{gawk} terminates the read operation if input does not
-arrive after waiting for the timeout period, returns failure
+arrive after waiting for the timeout period, returns failure,
and sets @code{ERRNO} to an appropriate string value.
A negative or zero value for the timeout is the same as specifying
no timeout at all.
@@ -8345,7 +8352,7 @@ If the @code{PROCINFO} element is not present and the
@command{gawk} uses its value to initialize the timeout value.
The exclusive use of the environment variable to specify timeout
has the disadvantage of not being able to control it
-on a per command or connection basis.
+on a per-command or per-connection basis.
@command{gawk} considers a timeout event to be an error even though
the attempt to read from the underlying device may
@@ -8411,7 +8418,7 @@ The possibilities are as follows:
@item
After splitting the input into records, @command{awk} further splits
-the record into individual fields, named @code{$1}, @code{$2}, and so
+the records into individual fields, named @code{$1}, @code{$2}, and so
on. @code{$0} is the whole record, and @code{NF} indicates how many
fields there are. The default way to split fields is between whitespace
characters.
@@ -8427,12 +8434,12 @@ thing. Decrementing @code{NF} throws away fields and rebuilds the record.
@item
Field splitting is more complicated than record splitting:
-@multitable @columnfractions .40 .45 .15
+@multitable @columnfractions .40 .40 .20
@headitem Field separator value @tab Fields are split @dots{} @tab @command{awk} / @command{gawk}
@item @code{FS == " "} @tab On runs of whitespace @tab @command{awk}
@item @code{FS == @var{any single character}} @tab On that character @tab @command{awk}
@item @code{FS == @var{regexp}} @tab On text matching the regexp @tab @command{awk}
-@item @code{FS == ""} @tab Each individual character is a separate field @tab @command{gawk}
+@item @code{FS == ""} @tab Such that each individual character is a separate field @tab @command{gawk}
@item @code{FIELDWIDTHS == @var{list of columns}} @tab Based on character position @tab @command{gawk}
@item @code{FPAT == @var{regexp}} @tab On the text surrounding text matching the regexp @tab @command{gawk}
@end multitable
@@ -8449,11 +8456,11 @@ This can also be done using command-line variable assignment.
Use @code{PROCINFO["FS"]} to see how fields are being split.
@item
-Use @code{getline} in its various forms to read additional records,
+Use @code{getline} in its various forms to read additional records
from the default input stream, from a file, or from a pipe or coprocess.
@item
-Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to timeout
+Use @code{PROCINFO[@var{file}, "READ_TIMEOUT"]} to cause reads to time out
for @var{file}.
@item
@@ -8562,7 +8569,7 @@ space is printed between any two items.
Note that the @code{print} statement is a statement and not an
expression---you can't use it in the pattern part of a
-@var{pattern}-@var{action} statement, for example.
+pattern--action statement, for example.
@node Print Examples
@section @code{print} Statement Examples
@@ -8753,7 +8760,7 @@ runs together on a single line.
@cindex numeric, output format
@cindex formats@comma{} numeric output
When printing numeric values with the @code{print} statement,
-@command{awk} internally converts the number to a string of characters
+@command{awk} internally converts each number to a string of characters
and prints that string. @command{awk} uses the @code{sprintf()} function
to do this conversion
(@pxref{String Functions}).
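The number-to-string conversion this hunk refers to is controlled by `OFMT` for the `print` statement; a quick sketch:

```shell
awk 'BEGIN {
    x = 3.14159
    print x              # default OFMT is "%.6g"
    OFMT = "%.2f"        # change the conversion format ...
    print x              # ... and print now uses it for this value
}'
```

This only affects how non-integer numeric values are rendered by `print`; `printf` ignores `OFMT` and always uses its own format string.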
@@ -8824,7 +8831,7 @@ printf @var{format}, @var{item1}, @var{item2}, @dots{}
@noindent
As for @code{print}, the entire list of arguments may optionally be
enclosed in parentheses. Here too, the parentheses are necessary if any
-of the item expressions use the @samp{>} relational operator; otherwise,
+of the item expressions uses the @samp{>} relational operator; otherwise,
it can be confused with an output redirection (@pxref{Redirection}).
@cindex format specifiers
@@ -8855,7 +8862,7 @@ $ @kbd{awk 'BEGIN @{}
@end example
@noindent
-Here, neither the @samp{+} nor the @samp{OUCH!} appear in
+Here, neither the @samp{+} nor the @samp{OUCH!} appears in
the output message.
@node Control Letters
@@ -8902,8 +8909,8 @@ The two control letters are equivalent.
(The @samp{%i} specification is for compatibility with ISO C.)
@item @code{%e}, @code{%E}
-Print a number in scientific (exponential) notation;
-for example:
+Print a number in scientific (exponential) notation.
+For example:
@example
printf "%4.3e\n", 1950
@@ -8940,7 +8947,7 @@ The special ``not a number'' value formats as @samp{-nan} or @samp{nan}
(@pxref{Math Definitions}).
@item @code{%F}
-Like @samp{%f} but the infinity and ``not a number'' values are spelled
+Like @samp{%f}, but the infinity and ``not a number'' values are spelled
using uppercase letters.
The @samp{%F} format is a POSIX extension to ISO C; not all systems
@@ -9184,7 +9191,7 @@ printf "%" w "." p "s\n", s
@end example
@noindent
-This is not particularly easy to read but it does work.
+This is not particularly easy to read, but it does work.
@c @cindex lint checks
@cindex troubleshooting, fatal errors, @code{printf} format strings
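The run-together format string from this hunk works as shown; `w`, `p`, and `s` are arbitrary example values:

```shell
awk 'BEGIN {
    w = 10; p = 4; s = "abcdefg"
    # Concatenation builds the format string "%10.4s" at runtime:
    # width 10, precision 4, so "abcd" is right-aligned in 10 columns.
    printf "%" w "." p "s\n", s
}'
```

gawk also accepts the C-style dynamic specifiers, as in `printf "%*.*s\n", w, p, s`, which is easier to read when portability to gawk is assumed.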
@@ -9230,7 +9237,7 @@ $ @kbd{awk '@{ printf "%-10s %s\n", $1, $2 @}' mail-list}
@end example
In this case, the phone numbers had to be printed as strings because
-the numbers are separated by a dash. Printing the phone numbers as
+the numbers are separated by dashes. Printing the phone numbers as
numbers would have produced just the first three digits: @samp{555}.
This would have been pretty confusing.
@@ -9290,7 +9297,7 @@ This is called @dfn{redirection}.
@quotation NOTE
When @option{--sandbox} is specified (@pxref{Options}),
-redirecting output to files, pipes and coprocesses is disabled.
+redirecting output to files, pipes, and coprocesses is disabled.
@end quotation
A redirection appears after the @code{print} or @code{printf} statement.
@@ -9343,7 +9350,7 @@ Each output file contains one name or number per line.
@cindex @code{>} (right angle bracket), @code{>>} operator (I/O)
@cindex right angle bracket (@code{>}), @code{>>} operator (I/O)
@item print @var{items} >> @var{output-file}
-This redirection prints the items into the pre-existing output file
+This redirection prints the items into the preexisting output file
named @var{output-file}. The difference between this and the
single-@samp{>} redirection is that the old contents (if any) of
@var{output-file} are not erased. Instead, the @command{awk} output is
@@ -9382,7 +9389,7 @@ The unsorted list is written with an ordinary redirection, while
the sorted list is written by piping through the @command{sort} utility.
The next example uses redirection to mail a message to the mailing
-list @samp{bug-system}. This might be useful when trouble is encountered
+list @code{bug-system}. This might be useful when trouble is encountered
in an @command{awk} script run periodically for system maintenance:
@example
@@ -9413,15 +9420,23 @@ This redirection prints the items to the input of @var{command}.
The difference between this and the
single-@samp{|} redirection is that the output from @var{command}
can be read with @code{getline}.
-Thus @var{command} is a @dfn{coprocess}, which works together with,
-but subsidiary to, the @command{awk} program.
+Thus, @var{command} is a @dfn{coprocess}, which works together with
+but is subsidiary to the @command{awk} program.
This feature is a @command{gawk} extension, and is not available in
POSIX @command{awk}.
-@DBXREF{Getline/Coprocess}
+@ifnotdocbook
+@xref{Getline/Coprocess},
for a brief discussion.
-@DBXREF{Two-way I/O}
+@xref{Two-way I/O},
for a more complete discussion.
+@end ifnotdocbook
+@ifdocbook
+@DBXREF{Getline/Coprocess}
+for a brief discussion and
+@DBREF{Two-way I/O}
+for a more complete discussion.
+@end ifdocbook
@end table
Redirecting output using @samp{>}, @samp{>>}, @samp{|}, or @samp{|&}
@@ -9446,7 +9461,7 @@ This is indeed how redirections must be used from the shell. But in
@command{awk}, it isn't necessary. In this kind of case, a program should
use @samp{>} for all the @code{print} statements, because the output file
is only opened once. (It happens that if you mix @samp{>} and @samp{>>}
-that output is produced in the expected order. However, mixing the operators
+output is produced in the expected order. However, mixing the operators
for the same file is definitely poor style, and is confusing to readers
of your program.)
@@ -9498,7 +9513,7 @@ command lines to be fed to the shell.
@end sidebar
@node Special FD
-@section Special Files for Standard Pre-Opened Data Streams
+@section Special Files for Standard Preopened Data Streams
@cindex standard input
@cindex input, standard
@cindex standard output
@@ -9511,7 +9526,7 @@ command lines to be fed to the shell.
Running programs conventionally have three input and output streams
already available to them for reading and writing. These are known
as the @dfn{standard input}, @dfn{standard output}, and @dfn{standard
-error output}. These open streams (and any other open file or pipe)
+error output}. These open streams (and any other open files or pipes)
are often referred to by the technical term @dfn{file descriptors}.
These streams are, by default, connected to your keyboard and screen, but
@@ -9549,7 +9564,7 @@ that is connected to your keyboard and screen. It represents the
``terminal,''@footnote{The ``tty'' in @file{/dev/tty} stands for
``Teletype,'' a serial terminal.} which on modern systems is a keyboard
and screen, not a serial console.)
-This generally has the same effect but not always: although the
+This generally has the same effect, but not always: although the
standard error stream is usually the screen, it can be redirected; when
that happens, writing to the screen is not correct. In fact, if
@command{awk} is run from a background job, it may not have a
@@ -9594,7 +9609,7 @@ print "Serious error detected!" > "/dev/stderr"
@cindex troubleshooting, quotes with file names
Note the use of quotes around the @value{FN}.
-Like any other redirection, the value must be a string.
+Like with any other redirection, the value must be a string.
It is a common error to omit the quotes, which leads
to confusing results.
@@ -9620,7 +9635,7 @@ TCP/IP networking.
@end menu
@node Other Inherited Files
-@subsection Accessing Other Open Files With @command{gawk}
+@subsection Accessing Other Open Files with @command{gawk}
Besides the @code{/dev/stdin}, @code{/dev/stdout}, and @code{/dev/stderr}
special @value{FN}s mentioned earlier, @command{gawk} provides syntax
@@ -9677,7 +9692,7 @@ special @value{FN}s that @command{gawk} provides:
@cindex compatibility mode (@command{gawk}), file names
@cindex file names, in compatibility mode
@item
-Recognition of the @value{FN}s for the three standard pre-opened
+Recognition of the @value{FN}s for the three standard preopened
files is disabled only in POSIX mode.
@item
@@ -9690,7 +9705,7 @@ compatibility mode (either @option{--traditional} or @option{--posix};
interprets these special @value{FN}s.
For example, using @samp{/dev/fd/4}
for output actually writes on file descriptor 4, and not on a new
-file descriptor that is @code{dup()}'ed from file descriptor 4. Most of
+file descriptor that is @code{dup()}ed from file descriptor 4. Most of
the time this does not matter; however, it is important to @emph{not}
close any of the files related to file descriptors 0, 1, and 2.
Doing so results in unpredictable behavior.
@@ -9907,9 +9922,9 @@ This value is zero if the close succeeds, or @minus{}1 if
it fails.
The POSIX standard is very vague; it says that @code{close()}
-returns zero on success and nonzero otherwise. In general,
+returns zero on success and a nonzero value otherwise. In general,
different implementations vary in what they report when closing
-pipes; thus the return value cannot be used portably.
+pipes; thus, the return value cannot be used portably.
@value{DARKCORNER}
In POSIX mode (@pxref{Options}), @command{gawk} just returns zero
when closing a pipe.
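Because, as this hunk notes, the value `close()` returns for a pipe varies across implementations, a portable sketch should only distinguish success from failure rather than interpret the number:

```shell
awk 'BEGIN {
    cmd = "echo test"
    cmd | getline line
    ret = close(cmd)
    # Zero conventionally indicates success; nonzero values are
    # implementation-defined and should not be relied on portably.
    print (ret == 0 ? "close ok" : "close failed: " ret)
}'
```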
@@ -9928,8 +9943,8 @@ for numeric values for the @code{print} statement.
@item
The @code{printf} statement provides finer-grained control over output,
-with format control letters for different data types and various flags
-that modify the behavior of the format control letters.
+with format-control letters for different data types and various flags
+that modify the behavior of the format-control letters.
@item
Output from both @code{print} and @code{printf} may be redirected to
@@ -37495,7 +37510,7 @@ To get @command{awka}, go to @url{http://sourceforge.net/projects/awka}.
@c andrewsumner@@yahoo.net
The project seems to be frozen; no new code changes have been made
-since approximately 2003.
+since approximately 2001.
@cindex Beebe, Nelson H.F.@:
@cindex @command{pawk} (profiling version of Brian Kernighan's @command{awk})
@@ -37773,7 +37788,7 @@ for information on getting the latest version of @command{gawk}.)
@item
@ifnotinfo
-Follow the @uref{http://www.gnu.org/prep/standards/, @cite{GNU Coding Standards}}.
+Follow the @cite{GNU Coding Standards}.
@end ifnotinfo
@ifinfo
See @inforef{Top, , Version, standards, GNU Coding Standards}.
@@ -37782,7 +37797,7 @@ This document describes how GNU software should be written. If you haven't
read it, please do so, preferably @emph{before} starting to modify @command{gawk}.
(The @cite{GNU Coding Standards} are available from
the GNU Project's
-@uref{http://www.gnu.org/prep/standards_toc.html, website}.
+@uref{http://www.gnu.org/prep/standards/, website}.
Texinfo, Info, and DVI versions are also available.)
@cindex @command{gawk}, coding style in