author | Arnold D. Robbins <arnold@skeeve.com> | 2014-04-30 06:06:06 +0300 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2014-04-30 06:06:06 +0300 |
commit | 2535d8a18e8c0d328fe6d1d8ae015320eeec6b5d (patch) | |
tree | a5f30542da0864518a509e6683591df353e3a49c | |
parent | 4e4446794686a101e0c64ff7242a44a646c56d7e (diff) | |
Editing progress through chapter 5.
-rw-r--r-- | doc/ChangeLog | 4 |
-rw-r--r-- | doc/gawk.info | 1344 |
-rw-r--r-- | doc/gawk.texi | 179 |
-rw-r--r-- | doc/gawktexi.in | 166 |
4 files changed, 906 insertions, 787 deletions
diff --git a/doc/ChangeLog b/doc/ChangeLog index 972e19f8..59d31520 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,3 +1,7 @@ +2014-04-30 Arnold D. Robbins <arnold@skeeve.com> + + * gawktexi.in: Editing progress. Through Chapter 5. + 2014-04-29 Arnold D. Robbins <arnold@skeeve.com> * gawktexi.in: Editing progress. Through Chapter 3. diff --git a/doc/gawk.info b/doc/gawk.info index 1d5f496d..57fdc3d4 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -651,15 +651,15 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) * Basic High Level:: The high level view. * Basic Data Typing:: A very quick intro to data types. - To Miriam, for making me complete. + To my parents, for their love, and for the wonderful example they +set for me. - To Chana, for the joy you bring us. + To my wife Miriam, for making me complete. Thank you for building +your life together with me. - To Rivka, for the exponential increase. + To our children Chana, Rivka, Nachum and Malka, for enrichening our +lives in innumerable ways. - To Nachum, for the added dimension. - - To Malka, for the new beginning. File: gawk.info, Node: Foreword, Next: Preface, Prev: Top, Up: Top @@ -3939,8 +3939,19 @@ started. Another built-in variable, `NR', records the total number of input records read so far from all data files. It starts at zero, but is never automatically reset to zero. - Records are separated by a character called the "record separator". -By default, the record separator is the newline character. This is why +* Menu: + +* awk split records:: How standard `awk' splits records. +* gawk split records:: How `gawk' splits records. + + +File: gawk.info, Node: awk split records, Next: gawk split records, Up: Records + +4.1.1 Record Splitting With Standard `awk' +------------------------------------------ + +Records are separated by a character called the "record separator". By +default, the record separator is the newline character. This is why records are, by default, single lines. 
A different character can be used for the record separator by assigning the character to the built-in variable `RS'. @@ -4060,16 +4071,22 @@ affected. After the end of the record has been determined, `gawk' sets the variable `RT' to the text in the input that matched `RS'. - When using `gawk', the value of `RS' is not limited to a -one-character string. It can be any regular expression (*note -Regexp::). (c.e.) In general, each record ends at the next string that -matches the regular expression; the next record starts at the end of -the matching string. This general rule is actually at work in the -usual case, where `RS' contains just a newline: a record ends at the -beginning of the next matching string (the next newline in the input), -and the following record starts just after the end of this string (at -the first character of the following line). The newline, because it -matches `RS', is not part of either record. + +File: gawk.info, Node: gawk split records, Prev: awk split records, Up: Records + +4.1.2 Record Splitting With `gawk' +---------------------------------- + +When using `gawk', the value of `RS' is not limited to a one-character +string. It can be any regular expression (*note Regexp::). (c.e.) In +general, each record ends at the next string that matches the regular +expression; the next record starts at the end of the matching string. +This general rule is actually at work in the usual case, where `RS' +contains just a newline: a record ends at the beginning of the next +matching string (the next newline in the input), and the following +record starts just after the end of this string (at the first character +of the following line). The newline, because it matches `RS', is not +part of either record. When `RS' is a single character, `RT' contains the same single character. However, when `RS' is a regular expression, `RT' contains @@ -4132,8 +4149,10 @@ use for `RS' in this case: BEGIN { RS = "\0" } # whole file becomes one record? 
`gawk' in fact accepts this, and uses the NUL character for the -record separator. However, this usage is _not_ portable to most other -`awk' implementations. +record separator. This works for certain special files, such as +`/proc/environ' on GNU/Linux systems, where the NUL character is in +fact the record separator. However, this usage is _not_ portable to +most other `awk' implementations. Almost all other `awk' implementations(1) store strings internally as C-style strings. C strings use the NUL character as the string @@ -4144,10 +4163,9 @@ terminator. In effect, this means that `RS = "\0"' is the same as `RS as a record separator. However, this is a special case: `mawk' does not allow embedded NUL characters in strings. - The best way to treat a whole file as a single record is to simply -read the file in, one record at a time, concatenating each record onto -the end of the previous ones. - + *Note Readfile Function::, for an interesting, portable way to read +whole files. If you are using `gawk', see *note Extension Sample +Readfile::, for another option. ---------- Footnotes ---------- @@ -4172,7 +4190,7 @@ to these pieces of the record. You don't have to use them--you can operate on the whole record if you want--but fields are what make simple `awk' programs so powerful. - A dollar-sign (`$') is used to refer to a field in an `awk' program, + You use a dollar-sign (`$') to refer to a field in an `awk' program, followed by the number of the field you want. Thus, `$1' refers to the first field, `$2' to the second, and so on. (Unlike the Unix shells, the field numbers are not limited to single digits. `$127' is the one @@ -4195,8 +4213,9 @@ the last one (such as `$8' when the record has only seven fields), you get the empty string. (If used in a numeric operation, you get zero.) The use of `$0', which looks like a reference to the "zero-th" -field, is a special case: it represents the whole input record when you -are not interested in specific fields. 
Here are some more examples: +field, is a special case: it represents the whole input record. Use it +when you are not interested in specific fields. Here are some more +examples: $ awk '$1 ~ /li/ { print $0 }' mail-list -| Amelia 555-5553 amelia.zodiacusque@gmail.com F @@ -4228,11 +4247,11 @@ File: gawk.info, Node: Nonconstant Fields, Next: Changing Fields, Prev: Field 4.3 Nonconstant Field Numbers ============================= -The number of a field does not need to be a constant. Any expression in -the `awk' language can be used after a `$' to refer to a field. The -value of the expression specifies the field number. If the value is a -string, rather than a number, it is converted to a number. Consider -this example: +A field number need not be a constant. Any expression in the `awk' +language can be used after a `$' to refer to a field. The value of the +expression specifies the field number. If the value is a string, +rather than a number, it is converted to a number. Consider this +example: awk '{ print $NR }' @@ -4249,7 +4268,7 @@ another example of using expressions as field numbers: number of the field to print. The `*' sign represents multiplication, so the expression `2*2' evaluates to four. The parentheses are used so that the multiplication is done before the `$' operation; they are -necessary whenever there is a binary operator in the field-number +necessary whenever there is a binary operator(1) in the field-number expression. This example, then, prints the type of relationship (the fourth field) for every line of the file `mail-list'. (All of the `awk' operators are listed, in order of decreasing precedence, in *note @@ -4268,6 +4287,12 @@ Variables::). The expression `$NF' is not a special feature--it is the direct consequence of evaluating `NF' and using its value as a field number. + ---------- Footnotes ---------- + + (1) A "binary operator", such as `*' for multiplication, is one that +takes two operands. 
The distinction is required, since `awk' also has +unary (one-operand) and ternary (three-operand) operators. + File: gawk.info, Node: Changing Fields, Next: Field Separators, Prev: Nonconstant Fields, Up: Reading Files @@ -4293,11 +4318,11 @@ three minus ten: `$3 - 10'. (*Note Arithmetic Ops::.) Then it prints the original and new values for field three. (Someone in the warehouse made a consistent mistake while inventorying the red boxes.) - For this to work, the text in field `$3' must make sense as a -number; the string of characters must be converted to a number for the -computer to do arithmetic on it. The number resulting from the -subtraction is converted back to a string of characters that then -becomes field three. *Note Conversion::. + For this to work, the text in `$3' must make sense as a number; the +string of characters must be converted to a number for the computer to +do arithmetic on it. The number resulting from the subtraction is +converted back to a string of characters that then becomes field three. +*Note Conversion::. When the value of a field is changed (as perceived by `awk'), the text of the input record is recalculated to contain the new field where @@ -4362,7 +4387,7 @@ even when you assign the empty string to a field. For example: -| a::c:d -| 4 -The field is still there; it just has an empty value, denoted by the +The field is still there; it just has an empty value, delimited by the two colons between `a' and `c'. This example shows what happens if you create a new field: @@ -4987,7 +5012,7 @@ affects field splitting with `FPAT'. deal with this. Since there is no formal specification for CSV data, there isn't much more to be done; the `FPAT' mechanism provides an elegant solution for the majority of cases, and the - `gawk' maintainer is satisfied with that. + `gawk' developers are satisfied with that. As written, the regexp used for `FPAT' requires that each field have a least one character. 
A straightforward modification (changing @@ -5037,8 +5062,8 @@ doesn't start until the first nonblank line that follows--no matter how many blank lines appear in a row, they are considered one record separator. - There is an important difference between `RS = ""' and `RS = -"\n\n+"'. In the first case, leading newlines in the input data file + However, there is an important difference between `RS = ""' and `RS += "\n\n+"'. In the first case, leading newlines in the input data file are ignored, and if a file ends without extra blank lines after the last record, the final newline is removed from the record. In the second case, this special processing is not done. (d.c.) @@ -5310,9 +5335,9 @@ are changed, resulting in a new value of `NF'. `RT' is also set. According to POSIX, `getline < EXPRESSION' is ambiguous if EXPRESSION contains unparenthesized operators other than `$'; for example, `getline < dir "/" file' is ambiguous because the -concatenation operator is not parenthesized. You should write it as -`getline < (dir "/" file)' if you want your program to be portable to -all `awk' implementations. +concatenation operator (not discussed yet; *note Concatenation::) is +not parenthesized. You should write it as `getline < (dir "/" file)' if +you want your program to be portable to all `awk' implementations. File: gawk.info, Node: Getline/Variable/File, Next: Getline/Pipe, Prev: Getline/File, Up: Getline @@ -5517,10 +5542,10 @@ in mind: testing the new record against every pattern. However, the new record is tested against any subsequent rules. - * Many `awk' implementations limit the number of pipelines that an - `awk' program may have open to just one. In `gawk', there is no - such limit. You can open as many pipelines (and coprocesses) as - the underlying operating system permits. + * Some very old `awk' implementations limit the number of pipelines + that an `awk' program may have open to just one. In `gawk', there + is no such limit. 
You can open as many pipelines (and + coprocesses) as the underlying operating system permits. * An interesting side effect occurs if you use `getline' without a redirection inside a `BEGIN' rule. Because an unredirected @@ -5559,9 +5584,9 @@ in mind: file is encountered, before the element in `a' is assigned? `gawk' treats `getline' like a function call, and evaluates the - expression `a[++c]' before attempting to read from `f'. Other - versions of `awk' only evaluate the expression once they know that - there is a string value to be assigned. Caveat Emptor. + expression `a[++c]' before attempting to read from `f'. However, + some versions of `awk' only evaluate the expression once they know + that there is a string value to be assigned. Caveat Emptor. File: gawk.info, Node: Getline Summary, Prev: Getline Notes, Up: Getline @@ -5597,10 +5622,12 @@ File: gawk.info, Node: Read Timeout, Next: Command line directories, Prev: Ge 4.10 Reading Input With A Timeout ================================= -You may specify a timeout in milliseconds for reading input from the -keyboard, pipe or two-way communication including, TCP/IP sockets. This -can be done on a per input, command or connection basis, by setting a -special element in the `PROCINFO' array: +This minor node describes a feature that is specific to `gawk'. + + You may specify a timeout in milliseconds for reading input from the +keyboard, a pipe, or two-way communication, including TCP/IP sockets. +This can be done on a per input, command or connection basis, by +setting a special element in the `PROCINFO' (*note Auto-set::) array: PROCINFO["input_name", "READ_TIMEOUT"] = TIMEOUT IN MILLISECONDS @@ -5623,10 +5650,10 @@ for more than five seconds: while ((getline < "/dev/stdin") > 0) print $0 - `gawk' will terminate the read operation if input does not arrive -after waiting for the timeout period, return failure and set the -`ERRNO' variable to an appropriate string value. 
A negative or zero -value for the timeout is the same as specifying no timeout at all. + `gawk' terminates the read operation if input does not arrive after +waiting for the timeout period, returns failure and sets the `ERRNO' +variable to an appropriate string value. A negative or zero value for +the timeout is the same as specifying no timeout at all. A timeout can also be set for reading from the keyboard in the implicit loop that reads input records and matches them against @@ -5690,14 +5717,21 @@ File: gawk.info, Node: Command line directories, Prev: Read Timeout, Up: Read ==================================== According to the POSIX standard, files named on the `awk' command line -must be text files. It is a fatal error if they are not. Most -versions of `awk' treat a directory on the command line as a fatal -error. +must be text files; it is a fatal error if they are not. Most versions +of `awk' treat a directory on the command line as a fatal error. By default, `gawk' produces a warning for a directory on the command -line, but otherwise ignores it. If either of the `--posix' or -`--traditional' options is given, then `gawk' reverts to treating a -directory on the command line as a fatal error. +line, but otherwise ignores it. This makes it easier to use shell +wildcards with your `awk' program: + + $ gawk -f whizprog.awk * Directories could kill this progam + + If either of the `--posix' or `--traditional' options is given, then +`gawk' reverts to treating a directory on the command line as a fatal +error. + + *Note Extension Sample Readdir::, for a way to treat directories as +usable data from an `awk' program. File: gawk.info, Node: Printing, Next: Expressions, Prev: Reading Files, Up: Top @@ -5741,9 +5775,9 @@ File: gawk.info, Node: Print, Next: Print Examples, Up: Printing ========================= The `print' statement is used for producing output with simple, -standardized formatting. 
Specify only the strings or numbers to print, -in a list separated by commas. They are output, separated by single -spaces, followed by a newline. The statement looks like this: +standardized formatting. You specify only the strings or numbers to +print, in a list separated by commas. They are output, separated by +single spaces, followed by a newline. The statement looks like this: print ITEM1, ITEM2, ... @@ -5810,8 +5844,8 @@ Here is the same program, without the comma: To someone unfamiliar with the `inventory-shipped' file, neither example's output makes much sense. A heading line at the beginning would make it clearer. Let's add some headings to our table of months -(`$1') and green crates shipped (`$2'). We do this using the `BEGIN' -pattern (*note BEGIN/END::) so that the headings are only printed once: +(`$1') and green crates shipped (`$2'). We do this using a `BEGIN' +rule (*note BEGIN/END::) so that the headings are only printed once: awk 'BEGIN { print "Month Crates" print "----- ------" } @@ -6050,7 +6084,8 @@ width. Here is a list of the format-control letters: On systems supporting IEEE 754 floating point format, values representing negative infinity are formatted as `-inf' or `-infinity', and positive infinity as `inf' and `infinity'. The - special "not a number" value formats as `-nan' or `nan'. + special "not a number" value formats as `-nan' or `nan' (*note + General Arithmetic::). `%F' Like `%f' but the infinity and "not a number" values are spelled @@ -6250,11 +6285,12 @@ string, like so: This is not particularly easy to read but it does work. - C programmers may be used to supplying additional `l', `L', and `h' -modifiers in `printf' format strings. These are not valid in `awk'. -Most `awk' implementations silently ignore them. If `--lint' is -provided on the command line (*note Options::), `gawk' warns about -their use. If `--posix' is supplied, their use is a fatal error. 
+ C programmers may be used to supplying additional modifiers (`h', +`j', `l', `L', `t', and `z') in `printf' format strings. These are not +valid in `awk'. Most `awk' implementations silently ignore them. If +`--lint' is provided on the command line (*note Options::), `gawk' +warns about their use. If `--posix' is supplied, their use is a fatal +error. File: gawk.info, Node: Printf Examples, Prev: Format Modifiers, Up: Printf @@ -6295,7 +6331,7 @@ they are last on their lines. They don't need to have spaces after them. The table could be made to look even nicer by adding headings to the -tops of the columns. This is done using the `BEGIN' pattern (*note +tops of the columns. This is done using a `BEGIN' rule (*note BEGIN/END::) so that the headers are only printed once, at the beginning of the `awk' program: @@ -6346,7 +6382,7 @@ commands, except that they are written inside the `awk' program. There are four forms of output redirection: output to a file, output appended to a file, output through a pipe to another command, and output -to a coprocess. They are all shown for the `print' statement, but they +to a coprocess. We show them all for the `print' statement, but they work identically for `printf': `print ITEMS > OUTPUT-FILE' @@ -6427,7 +6463,7 @@ work identically for `printf': FILE or COMMAND--it is not necessary to always use a string constant. Using a variable is generally a good idea, because (if you mean to refer to that same file or command) `awk' requires - that the string value be spelled identically every time. + that the string value be written identically every time. `print ITEMS |& COMMAND' This redirection prints the items to the input of COMMAND. The @@ -6539,7 +6575,7 @@ run from a background job, it may not have a terminal at all. Then opening `/dev/tty' fails. `gawk' provides special file names for accessing the three standard -streams. (c.e.). It also provides syntax for accessing any other +streams. (c.e.) 
It also provides syntax for accessing any other inherited open files. If the file name matches one of these special names when `gawk' redirects input or output, then it directly uses the stream that the file name stands for. These special file names work @@ -6732,14 +6768,15 @@ end-of-file return status from `getline'), the child process is not terminated;(1) more importantly, the file descriptor for the pipe is not closed and released until `close()' is called or `awk' exits. - `close()' will silently do nothing if given an argument that does -not represent a file, pipe or coprocess that was opened with a -redirection. + `close()' silently does nothing if given an argument that does not +represent a file, pipe or coprocess that was opened with a redirection. +In such a case, it returns a negative value, indicating an error. In +addition, `gawk' sets `ERRNO' to a string indicating the error. Note also that `close(FILENAME)' has no "magic" effects on the implicit loop that reads through the files named on the command line. -It is, more likely, a close of a file that was never opened, so `awk' -silently does nothing. +It is, more likely, a close of a file that was never opened with a +redirection, so `awk' silently does nothing. When using the `|&' operator to communicate with a coprocess, it is occasionally useful to be able to close one end of the two-way pipe @@ -6753,9 +6790,9 @@ I/O::, which discusses it in more detail and gives an example. Using `close()''s Return Value - In many versions of Unix `awk', the `close()' function is actually a -statement. It is a syntax error to try and use the return value from -`close()': (d.c.) + In many older versions of Unix `awk', the `close()' function is +actually a statement. It is a syntax error to try and use the return +value from `close()': (d.c.) command = "..." command | getline info @@ -30772,7 +30809,7 @@ Index * close() function, portability: Close Files And Pipes. 
(line 81) * close() function, return value: Close Files And Pipes. - (line 130) + (line 131) * close() function, two-way pipes and: Two-way I/O. (line 77) * Close, Diane <1>: Contributors. (line 20) * Close, Diane: Manual History. (line 41) @@ -30817,7 +30854,7 @@ Index * common extensions, func keyword: Definition Syntax. (line 83) * common extensions, length() applied to an array: String Functions. (line 194) -* common extensions, RS as a regexp: Records. (line 135) +* common extensions, RS as a regexp: gawk split records. (line 6) * common extensions, single character fields: Single Character Fields. (line 6) * comp.lang.awk newsgroup: Bugs. (line 38) @@ -30918,7 +30955,7 @@ Index (line 43) * dark corner, break statement: Break Statement. (line 51) * dark corner, close() function: Close Files And Pipes. - (line 130) + (line 131) * dark corner, command-line arguments: Assignment Options. (line 43) * dark corner, continue statement: Continue Statement. (line 43) * dark corner, CONVFMT variable: Conversion. (line 40) @@ -30934,7 +30971,7 @@ Index * dark corner, format-control characters: Control Letters. (line 18) * dark corner, FS as null string: Single Character Fields. (line 20) -* dark corner, input files: Records. (line 118) +* dark corner, input files: awk split records. (line 110) * dark corner, invoking awk: Command Line. (line 16) * dark corner, length() function: String Functions. (line 180) * dark corner, locale's decimal point character: Conversion. (line 77) @@ -30948,7 +30985,7 @@ Index * dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps. (line 43) * dark corner, split() function: String Functions. (line 359) -* dark corner, strings, storing: Records. (line 210) +* dark corner, strings, storing: gawk split records. (line 83) * dark corner, value of ARGV[0]: Auto-set. (line 35) * data, fixed-width: Constant Size. (line 10) * data-driven languages: Basic High Level. 
(line 85) @@ -31143,19 +31180,23 @@ Index * differences in awk and gawk, print/printf statements: Format Modifiers. (line 13) * differences in awk and gawk, PROCINFO array: Auto-set. (line 133) -* differences in awk and gawk, record separators: Records. (line 132) +* differences in awk and gawk, read timeouts: Read Timeout. (line 6) +* differences in awk and gawk, record separators: awk split records. + (line 124) * differences in awk and gawk, regexp constants: Using Constant Regexps. (line 43) * differences in awk and gawk, regular expressions: Case-sensitivity. (line 26) -* differences in awk and gawk, RS/RT variables: Records. (line 187) +* differences in awk and gawk, RS/RT variables: gawk split records. + (line 58) * differences in awk and gawk, RT variable: Auto-set. (line 266) * differences in awk and gawk, single-character fields: Single Character Fields. (line 6) * differences in awk and gawk, split() function: String Functions. (line 347) * differences in awk and gawk, strings: Scalar Constants. (line 20) -* differences in awk and gawk, strings, storing: Records. (line 206) +* differences in awk and gawk, strings, storing: gawk split records. + (line 77) * differences in awk and gawk, SYMTAB variable: Auto-set. (line 274) * differences in awk and gawk, TEXTDOMAIN variable: User-modified. (line 162) @@ -31214,7 +31255,7 @@ Index * empty array elements: Reference to Elements. (line 18) * empty pattern: Empty. (line 6) -* empty strings: Records. (line 122) +* empty strings: awk split records. (line 114) * empty strings, See null strings: Regexp Field Splitting. (line 43) * enable breakpoint: Breakpoint Control. (line 73) @@ -31256,7 +31297,7 @@ Index * ERRNO variable: Auto-set. (line 73) * ERRNO variable, with BEGINFILE pattern: BEGINFILE/ENDFILE. (line 26) * ERRNO variable, with close() function: Close Files And Pipes. - (line 138) + (line 139) * ERRNO variable, with getline command: Getline. (line 19) * error handling: Special FD. 
(line 16) * error handling, ERRNO variable and: Auto-set. (line 73) @@ -31335,7 +31376,7 @@ Index * extensions, common, func keyword: Definition Syntax. (line 83) * extensions, common, length() applied to an array: String Functions. (line 194) -* extensions, common, RS as a regexp: Records. (line 135) +* extensions, common, RS as a regexp: gawk split records. (line 6) * extensions, common, single character fields: Single Character Fields. (line 6) * extensions, in gawk, not in POSIX awk: POSIX/GNU. (line 6) @@ -31349,7 +31390,6 @@ Index * FDL (Free Documentation License): GNU Free Documentation License. (line 7) * features, adding to gawk: Adding Code. (line 6) -* features, advanced, See advanced features: Obsolete. (line 6) * features, deprecated: Obsolete. (line 6) * features, undocumented: Undocumented. (line 6) * Fenlason, Jay <1>: Contributors. (line 18) @@ -31589,7 +31629,7 @@ Index * gawk, ERRNO variable in <2>: Auto-set. (line 73) * gawk, ERRNO variable in <3>: BEGINFILE/ENDFILE. (line 26) * gawk, ERRNO variable in <4>: Close Files And Pipes. - (line 138) + (line 139) * gawk, ERRNO variable in: Getline. (line 19) * gawk, escape sequences: Escape Sequences. (line 124) * gawk, extensions, disabling: Options. (line 254) @@ -31644,7 +31684,7 @@ Index * gawk, regular expressions, precedence: Regexp Operators. (line 162) * gawk, RT variable in <1>: Auto-set. (line 266) * gawk, RT variable in <2>: Multiple Line. (line 129) -* gawk, RT variable in: Records. (line 132) +* gawk, RT variable in: awk split records. (line 124) * gawk, See Also awk: Preface. (line 36) * gawk, source code, obtaining: Getting. (line 6) * gawk, splitting fields and: Constant Size. (line 88) @@ -32047,6 +32087,7 @@ Index * mktime: Time Functions. (line 25) * modifiers, in format specifiers: Format Modifiers. (line 6) * monetary information, localization: Explaining gettext. (line 103) +* Moore, Duncan: Getline Notes. (line 40) * MPFR: Gawk and MPFR. (line 6) * msgfmt utility: I18N Example. 
(line 62) * multiple precision: Arbitrary Precision Arithmetic. @@ -32071,7 +32112,7 @@ Index * newlines: Statements/Lines. (line 6) * newlines, as field separators: Default Field Splitting. (line 6) -* newlines, as record separators: Records. (line 20) +* newlines, as record separators: awk split records. (line 12) * newlines, in dynamic regexps: Computed Regexps. (line 59) * newlines, in regexp constants: Computed Regexps. (line 69) * newlines, printing: Print Examples. (line 12) @@ -32112,7 +32153,7 @@ Index * null strings <2>: Truth Values. (line 6) * null strings <3>: Regexp Field Splitting. (line 43) -* null strings: Records. (line 122) +* null strings: awk split records. (line 114) * null strings in gawk arguments, quoting and: Quoting. (line 79) * null strings, and deleting array elements: Delete. (line 27) * null strings, as array subscripts: Uninitialized Subscripts. @@ -32274,7 +32315,8 @@ Index (line 112) * portability, close() function and: Close Files And Pipes. (line 81) -* portability, data files as single record: Records. (line 194) +* portability, data files as single record: gawk split records. + (line 65) * portability, deleting array elements: Delete. (line 56) * portability, example programs: Library Functions. (line 42) * portability, functions, defining: Definition Syntax. (line 99) @@ -32473,17 +32515,18 @@ Index * reading input files: Reading Files. (line 6) * recipe for a programming language: History. (line 6) * record separators <1>: User-modified. (line 143) -* record separators: Records. (line 14) -* record separators, changing: Records. (line 93) -* record separators, regular expressions as: Records. (line 132) +* record separators: awk split records. (line 6) +* record separators, changing: awk split records. (line 85) +* record separators, regular expressions as: awk split records. + (line 124) * record separators, with multiline records: Multiple Line. (line 10) * records <1>: Basic High Level. (line 73) * records: Reading Files. 
(line 14) * records, multiline: Multiple Line. (line 6) * records, printing: Print. (line 22) * records, splitting input into: Records. (line 6) -* records, terminating: Records. (line 132) -* records, treating files as: Records. (line 219) +* records, terminating: awk split records. (line 124) +* records, treating files as: gawk split records. (line 92) * recursive functions: Definition Syntax. (line 73) * redirect gawk output, in debugger: Debugger Info. (line 72) * redirection of input: Getline/File. (line 6) @@ -32510,7 +32553,8 @@ Index (line 6) * regular expressions, as patterns <1>: Regexp Patterns. (line 6) * regular expressions, as patterns: Regexp Usage. (line 6) -* regular expressions, as record separators: Records. (line 132) +* regular expressions, as record separators: awk split records. + (line 124) * regular expressions, case sensitivity <1>: User-modified. (line 82) * regular expressions, case sensitivity: Case-sensitivity. (line 6) * regular expressions, computed: Computed Regexps. (line 6) @@ -32542,7 +32586,7 @@ Index (line 54) * return statement, user-defined functions: Return Statement. (line 6) * return value, close() function: Close Files And Pipes. - (line 130) + (line 131) * rev() user-defined function: Function Example. (line 53) * revoutput extension: Extension Sample Revout. (line 11) @@ -32587,14 +32631,14 @@ Index (line 6) * ROUNDMODE variable: User-modified. (line 138) * RS variable <1>: User-modified. (line 143) -* RS variable: Records. (line 20) +* RS variable: awk split records. (line 12) * RS variable, multiline records and: Multiple Line. (line 17) * rshift: Bitwise Functions. (line 52) * RSTART variable: Auto-set. (line 259) * RSTART variable, match() function and: String Functions. (line 221) * RT variable <1>: Auto-set. (line 266) * RT variable <2>: Multiple Line. (line 129) -* RT variable: Records. (line 132) +* RT variable: awk split records. (line 124) * Rubin, Paul <1>: Contributors. (line 15) * Rubin, Paul: History. 
(line 30) * rule, definition of: Getting Started. (line 21) @@ -32644,8 +32688,9 @@ Index * separators, field, FPAT variable and: User-modified. (line 45) * separators, field, POSIX and: Fields. (line 6) * separators, for records <1>: User-modified. (line 143) -* separators, for records: Records. (line 14) -* separators, for records, regular expressions as: Records. (line 132) +* separators, for records: awk split records. (line 6) +* separators, for records, regular expressions as: awk split records. + (line 124) * separators, for statements in actions: Action Overview. (line 19) * separators, subscript: User-modified. (line 156) * set breakpoint: Breakpoint Control. (line 11) @@ -32712,7 +32757,7 @@ Index * sidebar, Piping into sh: Redirection. (line 140) * sidebar, Portability Issues with #!: Executable Scripts. (line 31) * sidebar, Recipe For A Programming Language: History. (line 6) -* sidebar, RS = "\0" Is Not Portable: Records. (line 192) +* sidebar, RS = "\0" Is Not Portable: gawk split records. (line 63) * sidebar, So Why Does gawk have BEGINFILE and ENDFILE?: Filetrans Function. (line 83) * sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops. @@ -32721,7 +32766,7 @@ Index * sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps. (line 57) * sidebar, Using close()'s Return Value: Close Files And Pipes. - (line 128) + (line 129) * SIGHUP signal, for dynamic profiling: Profiling. (line 211) * SIGINT signal (MS-Windows): Profiling. (line 214) * signals, HUP/SIGHUP, for profiling: Profiling. (line 211) @@ -32829,7 +32874,7 @@ Index * strings, converting: Conversion. (line 6) * strings, converting letter case: String Functions. (line 520) * strings, converting, numbers to: User-modified. (line 28) -* strings, empty, See null strings: Records. (line 122) +* strings, empty, See null strings: awk split records. (line 114) * strings, extracting: String Extraction. 
(line 6) * strings, for localization: Programmer i18n. (line 14) * strings, length limitations: Scalar Constants. (line 20) @@ -32875,7 +32920,7 @@ Index * tee utility: Tee Program. (line 6) * tee.awk program: Tee Program. (line 26) * temporary breakpoint: Breakpoint Control. (line 90) -* terminating records: Records. (line 132) +* terminating records: awk split records. (line 124) * testbits.awk program: Bitwise Functions. (line 70) * testext extension: Extension Sample API Tests. (line 6) @@ -32923,7 +32968,7 @@ Index * traceback, display in debugger: Execution Stack. (line 13) * translate string: I18N Functions. (line 22) * translate.awk program: Translate Program. (line 55) -* treating files, as single records: Records. (line 219) +* treating files, as single records: gawk split records. (line 92) * troubleshooting, --non-decimal-data option: Options. (line 211) * troubleshooting, == operator: Comparison Operators. (line 37) @@ -32988,7 +33033,7 @@ Index * Unix awk, backslashes in escape sequences: Escape Sequences. (line 124) * Unix awk, close() function and: Close Files And Pipes. - (line 130) + (line 131) * Unix awk, password files, field separators and: Command Line Field Separator. (line 64) * Unix, awk scripts and: Executable Scripts. (line 6) @@ -33116,7 +33161,7 @@ Index * | (vertical bar), |& operator (I/O) <3>: Redirection. (line 102) * | (vertical bar), |& operator (I/O): Getline/Coprocess. (line 6) * | (vertical bar), |& operator (I/O), pipes, closing: Close Files And Pipes. - (line 118) + (line 119) * | (vertical bar), || operator <1>: Precedence. (line 89) * | (vertical bar), || operator: Boolean Ops. (line 57) * ~ (tilde), ~ operator <1>: Expression Patterns. 
(line 24) @@ -33132,529 +33177,532 @@ Index Tag Table: Node: Top1292 -Node: Foreword40825 -Node: Preface45170 -Ref: Preface-Footnote-148303 -Ref: Preface-Footnote-248410 -Node: History48642 -Node: Names51016 -Ref: Names-Footnote-152480 -Node: This Manual52553 -Ref: This Manual-Footnote-158327 -Node: Conventions58427 -Node: Manual History60583 -Ref: Manual History-Footnote-164013 -Ref: Manual History-Footnote-264054 -Node: How To Contribute64128 -Node: Acknowledgments65367 -Node: Getting Started69561 -Node: Running gawk71940 -Node: One-shot73130 -Node: Read Terminal74355 -Ref: Read Terminal-Footnote-176005 -Ref: Read Terminal-Footnote-276281 -Node: Long76452 -Node: Executable Scripts77828 -Ref: Executable Scripts-Footnote-179661 -Ref: Executable Scripts-Footnote-279763 -Node: Comments80310 -Node: Quoting82777 -Node: DOS Quoting88093 -Node: Sample Data Files88768 -Node: Very Simple91283 -Node: Two Rules95933 -Node: More Complex97828 -Ref: More Complex-Footnote-1100760 -Node: Statements/Lines100845 -Ref: Statements/Lines-Footnote-1105300 -Node: Other Features105565 -Node: When106493 -Node: Invoking Gawk108641 -Node: Command Line110104 -Node: Options110887 -Ref: Options-Footnote-1126699 -Node: Other Arguments126724 -Node: Naming Standard Input129386 -Node: Environment Variables130480 -Node: AWKPATH Variable131038 -Ref: AWKPATH Variable-Footnote-1133816 -Ref: AWKPATH Variable-Footnote-2133861 -Node: AWKLIBPATH Variable134121 -Node: Other Environment Variables134880 -Node: Exit Status138045 -Node: Include Files138720 -Node: Loading Shared Libraries142298 -Node: Obsolete143681 -Node: Undocumented144378 -Node: Regexp144620 -Node: Regexp Usage146009 -Node: Escape Sequences148042 -Node: Regexp Operators153709 -Ref: Regexp Operators-Footnote-1161189 -Ref: Regexp Operators-Footnote-2161336 -Node: Bracket Expressions161434 -Ref: table-char-classes163324 -Node: GNU Regexp Operators165847 -Node: Case-sensitivity169570 -Ref: Case-sensitivity-Footnote-1172462 -Ref: 
Case-sensitivity-Footnote-2172697 -Node: Leftmost Longest172805 -Node: Computed Regexps174006 -Node: Reading Files177355 -Node: Records179357 -Ref: Records-Footnote-1188880 -Node: Fields188917 -Ref: Fields-Footnote-1191873 -Node: Nonconstant Fields191959 -Node: Changing Fields194165 -Node: Field Separators200124 -Node: Default Field Splitting202826 -Node: Regexp Field Splitting203943 -Node: Single Character Fields207284 -Node: Command Line Field Separator208343 -Node: Full Line Fields211685 -Ref: Full Line Fields-Footnote-1212193 -Node: Field Splitting Summary212239 -Ref: Field Splitting Summary-Footnote-1215338 -Node: Constant Size215439 -Node: Splitting By Content220046 -Ref: Splitting By Content-Footnote-1223795 -Node: Multiple Line223835 -Ref: Multiple Line-Footnote-1229682 -Node: Getline229861 -Node: Plain Getline232077 -Node: Getline/Variable234172 -Node: Getline/File235319 -Node: Getline/Variable/File236660 -Ref: Getline/Variable/File-Footnote-1238259 -Node: Getline/Pipe238346 -Node: Getline/Variable/Pipe241045 -Node: Getline/Coprocess242152 -Node: Getline/Variable/Coprocess243404 -Node: Getline Notes244141 -Node: Getline Summary246928 -Ref: table-getline-variants247336 -Node: Read Timeout248248 -Ref: Read Timeout-Footnote-1251987 -Node: Command line directories252045 -Node: Printing252675 -Node: Print254306 -Node: Print Examples255643 -Node: Output Separators258427 -Node: OFMT260443 -Node: Printf261801 -Node: Basic Printf262707 -Node: Control Letters264246 -Node: Format Modifiers268066 -Node: Printf Examples274075 -Node: Redirection276787 -Node: Special Files283761 -Node: Special FD284294 -Ref: Special FD-Footnote-1287919 -Node: Special Network287993 -Node: Special Caveats288843 -Node: Close Files And Pipes289639 -Ref: Close Files And Pipes-Footnote-1296622 -Ref: Close Files And Pipes-Footnote-2296770 -Node: Expressions296920 -Node: Values298052 -Node: Constants298728 -Node: Scalar Constants299408 -Ref: Scalar Constants-Footnote-1300267 -Node: 
Nondecimal-numbers300449 -Node: Regexp Constants303449 -Node: Using Constant Regexps303924 -Node: Variables306979 -Node: Using Variables307634 -Node: Assignment Options309358 -Node: Conversion311233 -Ref: table-locale-affects316733 -Ref: Conversion-Footnote-1317357 -Node: All Operators317466 -Node: Arithmetic Ops318096 -Node: Concatenation320601 -Ref: Concatenation-Footnote-1323389 -Node: Assignment Ops323509 -Ref: table-assign-ops328497 -Node: Increment Ops329828 -Node: Truth Values and Conditions333262 -Node: Truth Values334345 -Node: Typing and Comparison335394 -Node: Variable Typing336187 -Ref: Variable Typing-Footnote-1340084 -Node: Comparison Operators340206 -Ref: table-relational-ops340616 -Node: POSIX String Comparison344164 -Ref: POSIX String Comparison-Footnote-1345120 -Node: Boolean Ops345258 -Ref: Boolean Ops-Footnote-1349328 -Node: Conditional Exp349419 -Node: Function Calls351151 -Node: Precedence354745 -Node: Locales358414 -Node: Patterns and Actions359503 -Node: Pattern Overview360557 -Node: Regexp Patterns362226 -Node: Expression Patterns362769 -Node: Ranges366550 -Node: BEGIN/END369654 -Node: Using BEGIN/END370416 -Ref: Using BEGIN/END-Footnote-1373152 -Node: I/O And BEGIN/END373258 -Node: BEGINFILE/ENDFILE375540 -Node: Empty378454 -Node: Using Shell Variables378771 -Node: Action Overview381056 -Node: Statements383413 -Node: If Statement385267 -Node: While Statement386766 -Node: Do Statement388810 -Node: For Statement389966 -Node: Switch Statement393118 -Node: Break Statement395272 -Node: Continue Statement397262 -Node: Next Statement399055 -Node: Nextfile Statement401445 -Node: Exit Statement404100 -Node: Built-in Variables406516 -Node: User-modified407611 -Ref: User-modified-Footnote-1415969 -Node: Auto-set416031 -Ref: Auto-set-Footnote-1429098 -Ref: Auto-set-Footnote-2429303 -Node: ARGC and ARGV429359 -Node: Arrays433213 -Node: Array Basics434718 -Node: Array Intro435544 -Node: Reference to Elements439861 -Node: Assigning Elements442131 -Node: 
Array Example442622 -Node: Scanning an Array444354 -Node: Controlling Scanning446668 -Ref: Controlling Scanning-Footnote-1451755 -Node: Delete452071 -Ref: Delete-Footnote-1454836 -Node: Numeric Array Subscripts454893 -Node: Uninitialized Subscripts457076 -Node: Multidimensional458703 -Node: Multiscanning461796 -Node: Arrays of Arrays463385 -Node: Functions468025 -Node: Built-in468844 -Node: Calling Built-in469922 -Node: Numeric Functions471910 -Ref: Numeric Functions-Footnote-1475744 -Ref: Numeric Functions-Footnote-2476101 -Ref: Numeric Functions-Footnote-3476149 -Node: String Functions476418 -Ref: String Functions-Footnote-1499421 -Ref: String Functions-Footnote-2499550 -Ref: String Functions-Footnote-3499798 -Node: Gory Details499885 -Ref: table-sub-escapes501564 -Ref: table-sub-posix-92502918 -Ref: table-sub-proposed504269 -Ref: table-posix-sub505623 -Ref: table-gensub-escapes507168 -Ref: Gory Details-Footnote-1508344 -Ref: Gory Details-Footnote-2508395 -Node: I/O Functions508546 -Ref: I/O Functions-Footnote-1515542 -Node: Time Functions515689 -Ref: Time Functions-Footnote-1526682 -Ref: Time Functions-Footnote-2526750 -Ref: Time Functions-Footnote-3526908 -Ref: Time Functions-Footnote-4527019 -Ref: Time Functions-Footnote-5527131 -Ref: Time Functions-Footnote-6527358 -Node: Bitwise Functions527624 -Ref: table-bitwise-ops528186 -Ref: Bitwise Functions-Footnote-1532431 -Node: Type Functions532615 -Node: I18N Functions533766 -Node: User-defined535418 -Node: Definition Syntax536222 -Ref: Definition Syntax-Footnote-1541136 -Node: Function Example541205 -Ref: Function Example-Footnote-1543854 -Node: Function Caveats543876 -Node: Calling A Function544394 -Node: Variable Scope545349 -Node: Pass By Value/Reference548312 -Node: Return Statement551820 -Node: Dynamic Typing554801 -Node: Indirect Calls555732 -Node: Library Functions565419 -Ref: Library Functions-Footnote-1568932 -Ref: Library Functions-Footnote-2569075 -Node: Library Names569246 -Ref: Library 
Names-Footnote-1572719 -Ref: Library Names-Footnote-2572939 -Node: General Functions573025 -Node: Strtonum Function574053 -Node: Assert Function576983 -Node: Round Function580309 -Node: Cliff Random Function581850 -Node: Ordinal Functions582866 -Ref: Ordinal Functions-Footnote-1585943 -Ref: Ordinal Functions-Footnote-2586195 -Node: Join Function586406 -Ref: Join Function-Footnote-1588177 -Node: Getlocaltime Function588377 -Node: Readfile Function592118 -Node: Data File Management593957 -Node: Filetrans Function594589 -Node: Rewind Function598658 -Node: File Checking600045 -Node: Empty Files601139 -Node: Ignoring Assigns603369 -Node: Getopt Function604923 -Ref: Getopt Function-Footnote-1616226 -Node: Passwd Functions616429 -Ref: Passwd Functions-Footnote-1625407 -Node: Group Functions625495 -Node: Walking Arrays633579 -Node: Sample Programs635715 -Node: Running Examples636389 -Node: Clones637117 -Node: Cut Program638341 -Node: Egrep Program648192 -Ref: Egrep Program-Footnote-1655965 -Node: Id Program656075 -Node: Split Program659724 -Ref: Split Program-Footnote-1663243 -Node: Tee Program663371 -Node: Uniq Program666174 -Node: Wc Program673603 -Ref: Wc Program-Footnote-1677869 -Ref: Wc Program-Footnote-2678069 -Node: Miscellaneous Programs678161 -Node: Dupword Program679349 -Node: Alarm Program681380 -Node: Translate Program686187 -Ref: Translate Program-Footnote-1690574 -Ref: Translate Program-Footnote-2690822 -Node: Labels Program690956 -Ref: Labels Program-Footnote-1694327 -Node: Word Sorting694411 -Node: History Sorting698295 -Node: Extract Program700134 -Ref: Extract Program-Footnote-1707637 -Node: Simple Sed707765 -Node: Igawk Program710827 -Ref: Igawk Program-Footnote-1725998 -Ref: Igawk Program-Footnote-2726199 -Node: Anagram Program726337 -Node: Signature Program729405 -Node: Advanced Features730505 -Node: Nondecimal Data732391 -Node: Array Sorting733974 -Node: Controlling Array Traversal734671 -Node: Array Sorting Functions742955 -Ref: Array Sorting 
Functions-Footnote-1746824 -Node: Two-way I/O747018 -Ref: Two-way I/O-Footnote-1752450 -Node: TCP/IP Networking752532 -Node: Profiling755376 -Node: Internationalization762879 -Node: I18N and L10N764304 -Node: Explaining gettext764990 -Ref: Explaining gettext-Footnote-1770058 -Ref: Explaining gettext-Footnote-2770242 -Node: Programmer i18n770407 -Node: Translator i18n774634 -Node: String Extraction775428 -Ref: String Extraction-Footnote-1776389 -Node: Printf Ordering776475 -Ref: Printf Ordering-Footnote-1779257 -Node: I18N Portability779321 -Ref: I18N Portability-Footnote-1781770 -Node: I18N Example781833 -Ref: I18N Example-Footnote-1784471 -Node: Gawk I18N784543 -Node: Debugger785164 -Node: Debugging786135 -Node: Debugging Concepts786568 -Node: Debugging Terms788424 -Node: Awk Debugging791021 -Node: Sample Debugging Session791913 -Node: Debugger Invocation792433 -Node: Finding The Bug793766 -Node: List of Debugger Commands800253 -Node: Breakpoint Control801587 -Node: Debugger Execution Control805251 -Node: Viewing And Changing Data808611 -Node: Execution Stack811967 -Node: Debugger Info813434 -Node: Miscellaneous Debugger Commands817428 -Node: Readline Support822606 -Node: Limitations823437 -Node: Arbitrary Precision Arithmetic825689 -Ref: Arbitrary Precision Arithmetic-Footnote-1827338 -Node: General Arithmetic827486 -Node: Floating Point Issues829206 -Node: String Conversion Precision830087 -Ref: String Conversion Precision-Footnote-1831792 -Node: Unexpected Results831901 -Node: POSIX Floating Point Problems834054 -Ref: POSIX Floating Point Problems-Footnote-1837879 -Node: Integer Programming837917 -Node: Floating-point Programming839656 -Ref: Floating-point Programming-Footnote-1845987 -Ref: Floating-point Programming-Footnote-2846257 -Node: Floating-point Representation846521 -Node: Floating-point Context847686 -Ref: table-ieee-formats848525 -Node: Rounding Mode849909 -Ref: table-rounding-modes850388 -Ref: Rounding Mode-Footnote-1853403 -Node: Gawk and 
MPFR853582 -Node: Arbitrary Precision Floats854991 -Ref: Arbitrary Precision Floats-Footnote-1857434 -Node: Setting Precision857750 -Ref: table-predefined-precision-strings858436 -Node: Setting Rounding Mode860581 -Ref: table-gawk-rounding-modes860985 -Node: Floating-point Constants862172 -Node: Changing Precision863601 -Ref: Changing Precision-Footnote-1864998 -Node: Exact Arithmetic865172 -Node: Arbitrary Precision Integers868310 -Ref: Arbitrary Precision Integers-Footnote-1871325 -Node: Dynamic Extensions871472 -Node: Extension Intro872930 -Node: Plugin License874195 -Node: Extension Mechanism Outline874880 -Ref: load-extension875297 -Ref: load-new-function876775 -Ref: call-new-function877770 -Node: Extension API Description879785 -Node: Extension API Functions Introduction881072 -Node: General Data Types885999 -Ref: General Data Types-Footnote-1891694 -Node: Requesting Values891993 -Ref: table-value-types-returned892730 -Node: Memory Allocation Functions893684 -Ref: Memory Allocation Functions-Footnote-1896430 -Node: Constructor Functions896526 -Node: Registration Functions898284 -Node: Extension Functions898969 -Node: Exit Callback Functions901271 -Node: Extension Version String902520 -Node: Input Parsers903170 -Node: Output Wrappers912927 -Node: Two-way processors917437 -Node: Printing Messages919645 -Ref: Printing Messages-Footnote-1920722 -Node: Updating `ERRNO'920874 -Node: Accessing Parameters921613 -Node: Symbol Table Access922843 -Node: Symbol table by name923357 -Node: Symbol table by cookie925333 -Ref: Symbol table by cookie-Footnote-1929465 -Node: Cached values929528 -Ref: Cached values-Footnote-1933018 -Node: Array Manipulation933109 -Ref: Array Manipulation-Footnote-1934207 -Node: Array Data Types934246 -Ref: Array Data Types-Footnote-1936949 -Node: Array Functions937041 -Node: Flattening Arrays940877 -Node: Creating Arrays947729 -Node: Extension API Variables952454 -Node: Extension Versioning953090 -Node: Extension API Informational 
Variables954991 -Node: Extension API Boilerplate956077 -Node: Finding Extensions959881 -Node: Extension Example960441 -Node: Internal File Description961171 -Node: Internal File Ops965262 -Ref: Internal File Ops-Footnote-1976771 -Node: Using Internal File Ops976911 -Ref: Using Internal File Ops-Footnote-1979258 -Node: Extension Samples979524 -Node: Extension Sample File Functions981048 -Node: Extension Sample Fnmatch989535 -Node: Extension Sample Fork991304 -Node: Extension Sample Inplace992517 -Node: Extension Sample Ord994295 -Node: Extension Sample Readdir995131 -Node: Extension Sample Revout996663 -Node: Extension Sample Rev2way997256 -Node: Extension Sample Read write array997946 -Node: Extension Sample Readfile999829 -Node: Extension Sample API Tests1000929 -Node: Extension Sample Time1001454 -Node: gawkextlib1002818 -Node: Language History1005599 -Node: V7/SVR3.11007192 -Node: SVR41009512 -Node: POSIX1010954 -Node: BTL1012340 -Node: POSIX/GNU1013074 -Node: Feature History1018673 -Node: Common Extensions1031649 -Node: Ranges and Locales1032961 -Ref: Ranges and Locales-Footnote-11037578 -Ref: Ranges and Locales-Footnote-21037605 -Ref: Ranges and Locales-Footnote-31037839 -Node: Contributors1038060 -Node: Installation1043441 -Node: Gawk Distribution1044335 -Node: Getting1044819 -Node: Extracting1045645 -Node: Distribution contents1047337 -Node: Unix Installation1053058 -Node: Quick Installation1053675 -Node: Additional Configuration Options1056121 -Node: Configuration Philosophy1057857 -Node: Non-Unix Installation1060211 -Node: PC Installation1060669 -Node: PC Binary Installation1061968 -Node: PC Compiling1063816 -Node: PC Testing1066760 -Node: PC Using1067936 -Node: Cygwin1072104 -Node: MSYS1072913 -Node: VMS Installation1073427 -Node: VMS Compilation1074223 -Ref: VMS Compilation-Footnote-11075475 -Node: VMS Dynamic Extensions1075533 -Node: VMS Installation Details1076906 -Node: VMS Running1079157 -Node: VMS GNV1081991 -Node: VMS Old Gawk1082714 -Node: 
Bugs1083184 -Node: Other Versions1087102 -Node: Notes1093186 -Node: Compatibility Mode1093986 -Node: Additions1094769 -Node: Accessing The Source1095696 -Node: Adding Code1097136 -Node: New Ports1103181 -Node: Derived Files1107316 -Ref: Derived Files-Footnote-11112637 -Ref: Derived Files-Footnote-21112671 -Ref: Derived Files-Footnote-31113271 -Node: Future Extensions1113369 -Node: Implementation Limitations1113952 -Node: Extension Design1115204 -Node: Old Extension Problems1116358 -Ref: Old Extension Problems-Footnote-11117866 -Node: Extension New Mechanism Goals1117923 -Ref: Extension New Mechanism Goals-Footnote-11121288 -Node: Extension Other Design Decisions1121474 -Node: Extension Future Growth1123580 -Node: Old Extension Mechanism1124416 -Node: Basic Concepts1126156 -Node: Basic High Level1126837 -Ref: figure-general-flow1127109 -Ref: figure-process-flow1127708 -Ref: Basic High Level-Footnote-11130937 -Node: Basic Data Typing1131122 -Node: Glossary1134477 -Node: Copying1159708 -Node: GNU Free Documentation License1197264 -Node: Index1222400 +Node: Foreword40832 +Node: Preface45177 +Ref: Preface-Footnote-148310 +Ref: Preface-Footnote-248417 +Node: History48649 +Node: Names51023 +Ref: Names-Footnote-152487 +Node: This Manual52560 +Ref: This Manual-Footnote-158334 +Node: Conventions58434 +Node: Manual History60590 +Ref: Manual History-Footnote-164020 +Ref: Manual History-Footnote-264061 +Node: How To Contribute64135 +Node: Acknowledgments65374 +Node: Getting Started69568 +Node: Running gawk71947 +Node: One-shot73137 +Node: Read Terminal74362 +Ref: Read Terminal-Footnote-176012 +Ref: Read Terminal-Footnote-276288 +Node: Long76459 +Node: Executable Scripts77835 +Ref: Executable Scripts-Footnote-179668 +Ref: Executable Scripts-Footnote-279770 +Node: Comments80317 +Node: Quoting82784 +Node: DOS Quoting88100 +Node: Sample Data Files88775 +Node: Very Simple91290 +Node: Two Rules95940 +Node: More Complex97835 +Ref: More Complex-Footnote-1100767 +Node: 
Statements/Lines100852 +Ref: Statements/Lines-Footnote-1105307 +Node: Other Features105572 +Node: When106500 +Node: Invoking Gawk108648 +Node: Command Line110111 +Node: Options110894 +Ref: Options-Footnote-1126706 +Node: Other Arguments126731 +Node: Naming Standard Input129393 +Node: Environment Variables130487 +Node: AWKPATH Variable131045 +Ref: AWKPATH Variable-Footnote-1133823 +Ref: AWKPATH Variable-Footnote-2133868 +Node: AWKLIBPATH Variable134128 +Node: Other Environment Variables134887 +Node: Exit Status138052 +Node: Include Files138727 +Node: Loading Shared Libraries142305 +Node: Obsolete143688 +Node: Undocumented144385 +Node: Regexp144627 +Node: Regexp Usage146016 +Node: Escape Sequences148049 +Node: Regexp Operators153716 +Ref: Regexp Operators-Footnote-1161196 +Ref: Regexp Operators-Footnote-2161343 +Node: Bracket Expressions161441 +Ref: table-char-classes163331 +Node: GNU Regexp Operators165854 +Node: Case-sensitivity169577 +Ref: Case-sensitivity-Footnote-1172469 +Ref: Case-sensitivity-Footnote-2172704 +Node: Leftmost Longest172812 +Node: Computed Regexps174013 +Node: Reading Files177362 +Node: Records179364 +Node: awk split records180099 +Node: gawk split records184957 +Ref: gawk split records-Footnote-1189478 +Node: Fields189515 +Ref: Fields-Footnote-1192479 +Node: Nonconstant Fields192565 +Ref: Nonconstant Fields-Footnote-1194795 +Node: Changing Fields194997 +Node: Field Separators200951 +Node: Default Field Splitting203653 +Node: Regexp Field Splitting204770 +Node: Single Character Fields208111 +Node: Command Line Field Separator209170 +Node: Full Line Fields212512 +Ref: Full Line Fields-Footnote-1213020 +Node: Field Splitting Summary213066 +Ref: Field Splitting Summary-Footnote-1216165 +Node: Constant Size216266 +Node: Splitting By Content220873 +Ref: Splitting By Content-Footnote-1224623 +Node: Multiple Line224663 +Ref: Multiple Line-Footnote-1230519 +Node: Getline230698 +Node: Plain Getline232914 +Node: Getline/Variable235009 +Node: 
Getline/File236156 +Node: Getline/Variable/File237540 +Ref: Getline/Variable/File-Footnote-1239139 +Node: Getline/Pipe239226 +Node: Getline/Variable/Pipe241925 +Node: Getline/Coprocess243032 +Node: Getline/Variable/Coprocess244284 +Node: Getline Notes245021 +Node: Getline Summary247825 +Ref: table-getline-variants248233 +Node: Read Timeout249145 +Ref: Read Timeout-Footnote-1252972 +Node: Command line directories253030 +Node: Printing253912 +Node: Print255543 +Node: Print Examples256884 +Node: Output Separators259663 +Node: OFMT261679 +Node: Printf263037 +Node: Basic Printf263943 +Node: Control Letters265482 +Node: Format Modifiers269336 +Node: Printf Examples275363 +Node: Redirection278070 +Node: Special Files285042 +Node: Special FD285575 +Ref: Special FD-Footnote-1289199 +Node: Special Network289273 +Node: Special Caveats290123 +Node: Close Files And Pipes290919 +Ref: Close Files And Pipes-Footnote-1298057 +Ref: Close Files And Pipes-Footnote-2298205 +Node: Expressions298355 +Node: Values299487 +Node: Constants300163 +Node: Scalar Constants300843 +Ref: Scalar Constants-Footnote-1301702 +Node: Nondecimal-numbers301884 +Node: Regexp Constants304884 +Node: Using Constant Regexps305359 +Node: Variables308414 +Node: Using Variables309069 +Node: Assignment Options310793 +Node: Conversion312668 +Ref: table-locale-affects318168 +Ref: Conversion-Footnote-1318792 +Node: All Operators318901 +Node: Arithmetic Ops319531 +Node: Concatenation322036 +Ref: Concatenation-Footnote-1324824 +Node: Assignment Ops324944 +Ref: table-assign-ops329932 +Node: Increment Ops331263 +Node: Truth Values and Conditions334697 +Node: Truth Values335780 +Node: Typing and Comparison336829 +Node: Variable Typing337622 +Ref: Variable Typing-Footnote-1341519 +Node: Comparison Operators341641 +Ref: table-relational-ops342051 +Node: POSIX String Comparison345599 +Ref: POSIX String Comparison-Footnote-1346555 +Node: Boolean Ops346693 +Ref: Boolean Ops-Footnote-1350763 +Node: Conditional Exp350854 +Node: 
Function Calls352586 +Node: Precedence356180 +Node: Locales359849 +Node: Patterns and Actions360938 +Node: Pattern Overview361992 +Node: Regexp Patterns363661 +Node: Expression Patterns364204 +Node: Ranges367985 +Node: BEGIN/END371089 +Node: Using BEGIN/END371851 +Ref: Using BEGIN/END-Footnote-1374587 +Node: I/O And BEGIN/END374693 +Node: BEGINFILE/ENDFILE376975 +Node: Empty379889 +Node: Using Shell Variables380206 +Node: Action Overview382491 +Node: Statements384848 +Node: If Statement386702 +Node: While Statement388201 +Node: Do Statement390245 +Node: For Statement391401 +Node: Switch Statement394553 +Node: Break Statement396707 +Node: Continue Statement398697 +Node: Next Statement400490 +Node: Nextfile Statement402880 +Node: Exit Statement405535 +Node: Built-in Variables407951 +Node: User-modified409046 +Ref: User-modified-Footnote-1417404 +Node: Auto-set417466 +Ref: Auto-set-Footnote-1430533 +Ref: Auto-set-Footnote-2430738 +Node: ARGC and ARGV430794 +Node: Arrays434648 +Node: Array Basics436153 +Node: Array Intro436979 +Node: Reference to Elements441296 +Node: Assigning Elements443566 +Node: Array Example444057 +Node: Scanning an Array445789 +Node: Controlling Scanning448103 +Ref: Controlling Scanning-Footnote-1453190 +Node: Delete453506 +Ref: Delete-Footnote-1456271 +Node: Numeric Array Subscripts456328 +Node: Uninitialized Subscripts458511 +Node: Multidimensional460138 +Node: Multiscanning463231 +Node: Arrays of Arrays464820 +Node: Functions469460 +Node: Built-in470279 +Node: Calling Built-in471357 +Node: Numeric Functions473345 +Ref: Numeric Functions-Footnote-1477179 +Ref: Numeric Functions-Footnote-2477536 +Ref: Numeric Functions-Footnote-3477584 +Node: String Functions477853 +Ref: String Functions-Footnote-1500856 +Ref: String Functions-Footnote-2500985 +Ref: String Functions-Footnote-3501233 +Node: Gory Details501320 +Ref: table-sub-escapes502999 +Ref: table-sub-posix-92504353 +Ref: table-sub-proposed505704 +Ref: table-posix-sub507058 +Ref: 
table-gensub-escapes508603 +Ref: Gory Details-Footnote-1509779 +Ref: Gory Details-Footnote-2509830 +Node: I/O Functions509981 +Ref: I/O Functions-Footnote-1516977 +Node: Time Functions517124 +Ref: Time Functions-Footnote-1528117 +Ref: Time Functions-Footnote-2528185 +Ref: Time Functions-Footnote-3528343 +Ref: Time Functions-Footnote-4528454 +Ref: Time Functions-Footnote-5528566 +Ref: Time Functions-Footnote-6528793 +Node: Bitwise Functions529059 +Ref: table-bitwise-ops529621 +Ref: Bitwise Functions-Footnote-1533866 +Node: Type Functions534050 +Node: I18N Functions535201 +Node: User-defined536853 +Node: Definition Syntax537657 +Ref: Definition Syntax-Footnote-1542571 +Node: Function Example542640 +Ref: Function Example-Footnote-1545289 +Node: Function Caveats545311 +Node: Calling A Function545829 +Node: Variable Scope546784 +Node: Pass By Value/Reference549747 +Node: Return Statement553255 +Node: Dynamic Typing556236 +Node: Indirect Calls557167 +Node: Library Functions566854 +Ref: Library Functions-Footnote-1570367 +Ref: Library Functions-Footnote-2570510 +Node: Library Names570681 +Ref: Library Names-Footnote-1574154 +Ref: Library Names-Footnote-2574374 +Node: General Functions574460 +Node: Strtonum Function575488 +Node: Assert Function578418 +Node: Round Function581744 +Node: Cliff Random Function583285 +Node: Ordinal Functions584301 +Ref: Ordinal Functions-Footnote-1587378 +Ref: Ordinal Functions-Footnote-2587630 +Node: Join Function587841 +Ref: Join Function-Footnote-1589612 +Node: Getlocaltime Function589812 +Node: Readfile Function593553 +Node: Data File Management595392 +Node: Filetrans Function596024 +Node: Rewind Function600093 +Node: File Checking601480 +Node: Empty Files602574 +Node: Ignoring Assigns604804 +Node: Getopt Function606358 +Ref: Getopt Function-Footnote-1617661 +Node: Passwd Functions617864 +Ref: Passwd Functions-Footnote-1626842 +Node: Group Functions626930 +Node: Walking Arrays635014 +Node: Sample Programs637150 +Node: Running Examples637824 
+Node: Clones638552 +Node: Cut Program639776 +Node: Egrep Program649627 +Ref: Egrep Program-Footnote-1657400 +Node: Id Program657510 +Node: Split Program661159 +Ref: Split Program-Footnote-1664678 +Node: Tee Program664806 +Node: Uniq Program667609 +Node: Wc Program675038 +Ref: Wc Program-Footnote-1679304 +Ref: Wc Program-Footnote-2679504 +Node: Miscellaneous Programs679596 +Node: Dupword Program680784 +Node: Alarm Program682815 +Node: Translate Program687622 +Ref: Translate Program-Footnote-1692009 +Ref: Translate Program-Footnote-2692257 +Node: Labels Program692391 +Ref: Labels Program-Footnote-1695762 +Node: Word Sorting695846 +Node: History Sorting699730 +Node: Extract Program701569 +Ref: Extract Program-Footnote-1709072 +Node: Simple Sed709200 +Node: Igawk Program712262 +Ref: Igawk Program-Footnote-1727433 +Ref: Igawk Program-Footnote-2727634 +Node: Anagram Program727772 +Node: Signature Program730840 +Node: Advanced Features731940 +Node: Nondecimal Data733826 +Node: Array Sorting735409 +Node: Controlling Array Traversal736106 +Node: Array Sorting Functions744390 +Ref: Array Sorting Functions-Footnote-1748259 +Node: Two-way I/O748453 +Ref: Two-way I/O-Footnote-1753885 +Node: TCP/IP Networking753967 +Node: Profiling756811 +Node: Internationalization764314 +Node: I18N and L10N765739 +Node: Explaining gettext766425 +Ref: Explaining gettext-Footnote-1771493 +Ref: Explaining gettext-Footnote-2771677 +Node: Programmer i18n771842 +Node: Translator i18n776069 +Node: String Extraction776863 +Ref: String Extraction-Footnote-1777824 +Node: Printf Ordering777910 +Ref: Printf Ordering-Footnote-1780692 +Node: I18N Portability780756 +Ref: I18N Portability-Footnote-1783205 +Node: I18N Example783268 +Ref: I18N Example-Footnote-1785906 +Node: Gawk I18N785978 +Node: Debugger786599 +Node: Debugging787570 +Node: Debugging Concepts788003 +Node: Debugging Terms789859 +Node: Awk Debugging792456 +Node: Sample Debugging Session793348 +Node: Debugger Invocation793868 +Node: Finding The 
Bug795201 +Node: List of Debugger Commands801688 +Node: Breakpoint Control803022 +Node: Debugger Execution Control806686 +Node: Viewing And Changing Data810046 +Node: Execution Stack813402 +Node: Debugger Info814869 +Node: Miscellaneous Debugger Commands818863 +Node: Readline Support824041 +Node: Limitations824872 +Node: Arbitrary Precision Arithmetic827124 +Ref: Arbitrary Precision Arithmetic-Footnote-1828773 +Node: General Arithmetic828921 +Node: Floating Point Issues830641 +Node: String Conversion Precision831522 +Ref: String Conversion Precision-Footnote-1833227 +Node: Unexpected Results833336 +Node: POSIX Floating Point Problems835489 +Ref: POSIX Floating Point Problems-Footnote-1839314 +Node: Integer Programming839352 +Node: Floating-point Programming841091 +Ref: Floating-point Programming-Footnote-1847422 +Ref: Floating-point Programming-Footnote-2847692 +Node: Floating-point Representation847956 +Node: Floating-point Context849121 +Ref: table-ieee-formats849960 +Node: Rounding Mode851344 +Ref: table-rounding-modes851823 +Ref: Rounding Mode-Footnote-1854838 +Node: Gawk and MPFR855017 +Node: Arbitrary Precision Floats856426 +Ref: Arbitrary Precision Floats-Footnote-1858869 +Node: Setting Precision859185 +Ref: table-predefined-precision-strings859871 +Node: Setting Rounding Mode862016 +Ref: table-gawk-rounding-modes862420 +Node: Floating-point Constants863607 +Node: Changing Precision865036 +Ref: Changing Precision-Footnote-1866433 +Node: Exact Arithmetic866607 +Node: Arbitrary Precision Integers869745 +Ref: Arbitrary Precision Integers-Footnote-1872760 +Node: Dynamic Extensions872907 +Node: Extension Intro874365 +Node: Plugin License875630 +Node: Extension Mechanism Outline876315 +Ref: load-extension876732 +Ref: load-new-function878210 +Ref: call-new-function879205 +Node: Extension API Description881220 +Node: Extension API Functions Introduction882507 +Node: General Data Types887434 +Ref: General Data Types-Footnote-1893129 +Node: Requesting Values893428 
+Ref: table-value-types-returned894165 +Node: Memory Allocation Functions895119 +Ref: Memory Allocation Functions-Footnote-1897865 +Node: Constructor Functions897961 +Node: Registration Functions899719 +Node: Extension Functions900404 +Node: Exit Callback Functions902706 +Node: Extension Version String903955 +Node: Input Parsers904605 +Node: Output Wrappers914362 +Node: Two-way processors918872 +Node: Printing Messages921080 +Ref: Printing Messages-Footnote-1922157 +Node: Updating `ERRNO'922309 +Node: Accessing Parameters923048 +Node: Symbol Table Access924278 +Node: Symbol table by name924792 +Node: Symbol table by cookie926768 +Ref: Symbol table by cookie-Footnote-1930900 +Node: Cached values930963 +Ref: Cached values-Footnote-1934453 +Node: Array Manipulation934544 +Ref: Array Manipulation-Footnote-1935642 +Node: Array Data Types935681 +Ref: Array Data Types-Footnote-1938384 +Node: Array Functions938476 +Node: Flattening Arrays942312 +Node: Creating Arrays949164 +Node: Extension API Variables953889 +Node: Extension Versioning954525 +Node: Extension API Informational Variables956426 +Node: Extension API Boilerplate957512 +Node: Finding Extensions961316 +Node: Extension Example961876 +Node: Internal File Description962606 +Node: Internal File Ops966697 +Ref: Internal File Ops-Footnote-1978206 +Node: Using Internal File Ops978346 +Ref: Using Internal File Ops-Footnote-1980693 +Node: Extension Samples980959 +Node: Extension Sample File Functions982483 +Node: Extension Sample Fnmatch990970 +Node: Extension Sample Fork992739 +Node: Extension Sample Inplace993952 +Node: Extension Sample Ord995730 +Node: Extension Sample Readdir996566 +Node: Extension Sample Revout998098 +Node: Extension Sample Rev2way998691 +Node: Extension Sample Read write array999381 +Node: Extension Sample Readfile1001264 +Node: Extension Sample API Tests1002364 +Node: Extension Sample Time1002889 +Node: gawkextlib1004253 +Node: Language History1007034 +Node: V7/SVR3.11008627 +Node: SVR41010947 
+Node: POSIX1012389 +Node: BTL1013775 +Node: POSIX/GNU1014509 +Node: Feature History1020108 +Node: Common Extensions1033084 +Node: Ranges and Locales1034396 +Ref: Ranges and Locales-Footnote-11039013 +Ref: Ranges and Locales-Footnote-21039040 +Ref: Ranges and Locales-Footnote-31039274 +Node: Contributors1039495 +Node: Installation1044876 +Node: Gawk Distribution1045770 +Node: Getting1046254 +Node: Extracting1047080 +Node: Distribution contents1048772 +Node: Unix Installation1054493 +Node: Quick Installation1055110 +Node: Additional Configuration Options1057556 +Node: Configuration Philosophy1059292 +Node: Non-Unix Installation1061646 +Node: PC Installation1062104 +Node: PC Binary Installation1063403 +Node: PC Compiling1065251 +Node: PC Testing1068195 +Node: PC Using1069371 +Node: Cygwin1073539 +Node: MSYS1074348 +Node: VMS Installation1074862 +Node: VMS Compilation1075658 +Ref: VMS Compilation-Footnote-11076910 +Node: VMS Dynamic Extensions1076968 +Node: VMS Installation Details1078341 +Node: VMS Running1080592 +Node: VMS GNV1083426 +Node: VMS Old Gawk1084149 +Node: Bugs1084619 +Node: Other Versions1088537 +Node: Notes1094621 +Node: Compatibility Mode1095421 +Node: Additions1096204 +Node: Accessing The Source1097131 +Node: Adding Code1098571 +Node: New Ports1104616 +Node: Derived Files1108751 +Ref: Derived Files-Footnote-11114072 +Ref: Derived Files-Footnote-21114106 +Ref: Derived Files-Footnote-31114706 +Node: Future Extensions1114804 +Node: Implementation Limitations1115387 +Node: Extension Design1116639 +Node: Old Extension Problems1117793 +Ref: Old Extension Problems-Footnote-11119301 +Node: Extension New Mechanism Goals1119358 +Ref: Extension New Mechanism Goals-Footnote-11122723 +Node: Extension Other Design Decisions1122909 +Node: Extension Future Growth1125015 +Node: Old Extension Mechanism1125851 +Node: Basic Concepts1127591 +Node: Basic High Level1128272 +Ref: figure-general-flow1128544 +Ref: figure-process-flow1129143 +Ref: Basic High 
Level-Footnote-11132372 +Node: Basic Data Typing1132557 +Node: Glossary1135912 +Node: Copying1161143 +Node: GNU Free Documentation License1198699 +Node: Index1223835 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 872263d4..24cd006b 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -947,15 +947,14 @@ particular records in a file and perform operations upon them. @c dedication for Info file @ifinfo -@center To Miriam, for making me complete. +To my parents, for their love, and for the wonderful +example they set for me. @sp 1 -@center To Chana, for the joy you bring us. +To my wife Miriam, for making me complete. +Thank you for building your life together with me. @sp 1 -@center To Rivka, for the exponential increase. -@sp 1 -@center To Nachum, for the added dimension. -@sp 1 -@center To Malka, for the new beginning. +To our children Chana, Rivka, Nachum and Malka, +for enrichening our lives in innumerable ways. @end ifinfo @summarycontents @@ -4374,7 +4373,6 @@ that can be loaded with either @code{@@load} or the @option{-l} option. @node Obsolete @section Obsolete Options and/or Features -@cindex features, advanced, See advanced features @cindex options, deprecated @cindex features, deprecated @cindex obsolete features @@ -5814,6 +5812,14 @@ file is started. Another built-in variable, @code{NR}, records the total number of input records read so far from all data files. It starts at zero, but is never automatically reset to zero. +@menu +* awk split records:: How standard @command{awk} splits records. +* gawk split records:: How @command{gawk} splits records. +@end menu + +@node awk split records +@subsection Record Splitting With Standard @command{awk} + @cindex separators, for records @cindex record separators Records are separated by a character called the @dfn{record separator}. @@ -5977,6 +5983,9 @@ After the end of the record has been determined, @command{gawk} sets the variable @code{RT} to the text in the input that matched @code{RS}. 
+@node gawk split records +@subsection Record Splitting With @command{gawk} + @cindex common extensions, @code{RS} as a regexp @cindex extensions, common@comma{} @code{RS} as a regexp When using @command{gawk}, @@ -6060,7 +6069,6 @@ single record. The only way to make this happen is to give @code{RS} a value that you know doesn't occur in the input file. This is hard to do in a general way, such that a program always works for arbitrary input files. -@c can you say `understatement' boys and girls? You might think that for text files, the @sc{nul} character, which consists of a character with all bits equal to zero, is a good @@ -6073,6 +6081,8 @@ BEGIN @{ RS = "\0" @} # whole file becomes one record? @cindex differences in @command{awk} and @command{gawk}, strings, storing @command{gawk} in fact accepts this, and uses the @sc{nul} character for the record separator. +This works for certain special files, such as @file{/proc/environ} on +GNU/Linux systems, where the @sc{nul} character is in fact the record separator. However, this usage is @emph{not} portable to most other @command{awk} implementations. @@ -6089,11 +6099,9 @@ character as a record separator. However, this is a special case: @cindex records, treating files as @cindex treating files, as single records -The best way to treat a whole file as a single record is to -simply read the file in, one record at a time, concatenating each -record onto the end of the previous ones. - -@c @strong{FIXME}: Using @sc{nul} is good for @file{/proc/environ} etc. +@xref{Readfile Function}, for an interesting, portable way to read +whole files. If you are using @command{gawk}, see @ref{Extension Sample +Readfile}, for another option. @docbook </sidebar> @@ -6111,7 +6119,6 @@ single record. The only way to make this happen is to give @code{RS} a value that you know doesn't occur in the input file. This is hard to do in a general way, such that a program always works for arbitrary input files. 
-@c can you say `understatement' boys and girls? You might think that for text files, the @sc{nul} character, which consists of a character with all bits equal to zero, is a good @@ -6124,6 +6131,8 @@ BEGIN @{ RS = "\0" @} # whole file becomes one record? @cindex differences in @command{awk} and @command{gawk}, strings, storing @command{gawk} in fact accepts this, and uses the @sc{nul} character for the record separator. +This works for certain special files, such as @file{/proc/environ} on +GNU/Linux systems, where the @sc{nul} character is in fact the record separator. However, this usage is @emph{not} portable to most other @command{awk} implementations. @@ -6140,11 +6149,9 @@ character as a record separator. However, this is a special case: @cindex records, treating files as @cindex treating files, as single records -The best way to treat a whole file as a single record is to -simply read the file in, one record at a time, concatenating each -record onto the end of the previous ones. - -@c @strong{FIXME}: Using @sc{nul} is good for @file{/proc/environ} etc. +@xref{Readfile Function}, for an interesting, portable way to read +whole files. If you are using @command{gawk}, see @ref{Extension Sample +Readfile}, for another option. @end cartouche @end ifnotdocbook @c ENDOFRANGE inspl @@ -6181,7 +6188,7 @@ simple @command{awk} programs so powerful. @cindex @code{$} (dollar sign), @code{$} field operator @cindex dollar sign (@code{$}), @code{$} field operator @cindex field operators@comma{} dollar sign as -A dollar-sign (@samp{$}) is used +You use a dollar-sign (@samp{$}) to refer to a field in an @command{awk} program, followed by the number of the field you want. Thus, @code{$1} refers to the first field, @code{$2} to the second, and so on. @@ -6212,7 +6219,7 @@ one (such as @code{$8} when the record has only seven fields), you get the empty string. (If used in a numeric operation, you get zero.) 
The use of @code{$0}, which looks like a reference to the ``zero-th'' field, is -a special case: it represents the whole input record +a special case: it represents the whole input record. Use it when you are not interested in specific fields. Here are some more examples: @@ -6248,7 +6255,7 @@ $ @kbd{awk '/li/ @{ print $1, $NF @}' mail-list} @cindex fields, numbers @cindex field numbers -The number of a field does not need to be a constant. Any expression in +A field number need not be a constant. Any expression in the @command{awk} language can be used after a @samp{$} to refer to a field. The value of the expression specifies the field number. If the value is a string, rather than a number, it is converted to a number. @@ -6275,7 +6282,11 @@ its value as the number of the field to print. The @samp{*} sign represents multiplication, so the expression @samp{2*2} evaluates to four. The parentheses are used so that the multiplication is done before the @samp{$} operation; they are necessary whenever there is a binary -operator in the field-number expression. This example, then, prints the +operator@footnote{A @dfn{binary operator}, such as @samp{*} for +multiplication, is one that takes two operands. The distinction +is required, since @command{awk} also has unary (one-operand) +and ternary (three-operand) operators.} +in the field-number expression. This example, then, prints the type of relationship (the fourth field) for every line of the file @file{mail-list}. (All of the @command{awk} operators are listed, in order of decreasing precedence, in @@ -6325,7 +6336,7 @@ Then it prints the original and new values for field three. (Someone in the warehouse made a consistent mistake while inventorying the red boxes.) -For this to work, the text in field @code{$3} must make sense +For this to work, the text in @code{$3} must make sense as a number; the string of characters must be converted to a number for the computer to do arithmetic on it. 
The number resulting from the subtraction is converted back to a string of characters that @@ -6416,7 +6427,7 @@ $ @kbd{echo a b c d | awk '@{ OFS = ":"; $2 = ""} @end example @noindent -The field is still there; it just has an empty value, denoted by +The field is still there; it just has an empty value, delimited by the two colons between @samp{a} and @samp{c}. This example shows what happens if you create a new field: @@ -7234,7 +7245,7 @@ if (PROCINFO["FS"] == "FS") else if (PROCINFO["FS"] == "FIELDWIDTHS") @var{fixed-width field splitting} @dots{} else - @var{content-based field splitting} @dots{} (see next @value{SECTION}) + @var{content-based field splitting} @dots{} @ii{(see next @value{SECTION})} @end example This information is useful when writing a function @@ -7348,7 +7359,7 @@ the double quotes. @command{gawk} provides no way to deal with this. Since there is no formal specification for CSV data, there isn't much more to be done; the @code{FPAT} mechanism provides an elegant solution for the majority -of cases, and the @command{gawk} maintainer is satisfied with that. +of cases, and the @command{gawk} developers are satisfied with that. @end quotation As written, the regexp used for @code{FPAT} requires that each field @@ -7410,7 +7421,7 @@ the first nonblank line that follows---no matter how many blank lines appear in a row, they are considered one record separator. @cindex dark corner, multiline records -There is an important difference between @samp{RS = ""} and +However, there is an important difference between @samp{RS = ""} and @samp{RS = "\n\n+"}. In the first case, leading newlines in the input data file are ignored, and if a file ends without extra blank lines after the last record, the final newline is removed from the record. @@ -7563,7 +7574,19 @@ The @code{getline} command is used in several different ways and should The examples that follow the explanation of the @code{getline} command include material that has not been covered yet. 
Therefore, come back and study the @code{getline} command @emph{after} you have reviewed the -rest of this @value{DOCUMENT} and have a good knowledge of how @command{awk} works. +rest of +@ifinfo +this @value{DOCUMENT} +@end ifinfo +@ifhtml +this @value{DOCUMENT} +@end ifhtml +@ifnotinfo +@ifnothtml +Parts I and II +@end ifnothtml +@end ifnotinfo +and have a good knowledge of how @command{awk} works. @cindex @command{gawk}, @code{ERRNO} variable in @cindex @code{ERRNO} variable, with @command{getline} command @@ -7750,9 +7773,9 @@ changed, resulting in a new value of @code{NF}. According to POSIX, @samp{getline < @var{expression}} is ambiguous if @var{expression} contains unparenthesized operators other than @samp{$}; for example, @samp{getline < dir "/" file} is ambiguous -because the concatenation operator is not parenthesized. You should -write it as @samp{getline < (dir "/" file)} if you want your program -to be portable to all @command{awk} implementations. +because the concatenation operator (not discussed yet; @pxref{Concatenation}) +is not parenthesized. You should write it as @samp{getline < (dir "/" file)} if +you want your program to be portable to all @command{awk} implementations. @node Getline/Variable/File @subsection Using @code{getline} into a Variable from a File @@ -8015,7 +8038,7 @@ However, the new record is tested against any subsequent rules. @cindex @command{awk}, implementations, limits @cindex @command{gawk}, implementation issues, limits @item -Many @command{awk} implementations limit the number of pipelines that an @command{awk} +Some very old @command{awk} implementations limit the number of pipelines that an @command{awk} program may have open to just one. In @command{gawk}, there is no such limit. You can open as many pipelines (and coprocesses) as the underlying operating system permits. @@ -8054,6 +8077,7 @@ can cause @code{FILENAME} to be updated if they cause @command{awk} to start reading a new input file. 
@item +@cindex Moore, Duncan If the variable being assigned is an expression with side effects, different versions of @command{awk} behave differently upon encountering end-of-file. Some versions don't evaluate the expression; many versions @@ -8078,7 +8102,7 @@ end of file is encountered, before the element in @code{a} is assigned? @command{gawk} treats @code{getline} like a function call, and evaluates the expression @samp{a[++c]} before attempting to read from @file{f}. -Other versions of @command{awk} only evaluate the expression once they +However, some versions of @command{awk} only evaluate the expression once they know that there is a string value to be assigned. Caveat Emptor. @end itemize @@ -8114,10 +8138,13 @@ Note: for each variant, @command{gawk} sets the @code{RT} built-in variable. @section Reading Input With A Timeout @cindex timeout, reading input +@cindex differences in @command{awk} and @command{gawk}, read timeouts +This @value{SECTION} describes a feature that is specific to @command{gawk}. + You may specify a timeout in milliseconds for reading input from the keyboard, -pipe or two-way communication including, TCP/IP sockets. This can be done +a pipe, or two-way communication, including TCP/IP sockets. This can be done on a per input, command or connection basis, by setting a special element -in the @code{PROCINFO} array: +in the @code{PROCINFO} (@pxref{Auto-set}) array: @example PROCINFO["input_name", "READ_TIMEOUT"] = @var{timeout in milliseconds} @@ -8147,9 +8174,9 @@ while ((getline < "/dev/stdin") > 0) print $0 @end example -@command{gawk} will terminate the read operation if input does not -arrive after waiting for the timeout period, return failure -and set the @code{ERRNO} variable to an appropriate string value. +@command{gawk} terminates the read operation if input does not +arrive after waiting for the timeout period, returns failure +and sets the @code{ERRNO} variable to an appropriate string value. 
A negative or zero value for the timeout is the same as specifying no timeout at all. @@ -8221,15 +8248,25 @@ indefinitely until some other process opens it for writing. @cindex command line, directories on According to the POSIX standard, files named on the @command{awk} -command line must be text files. It is a fatal error if they are not. +command line must be text files; it is a fatal error if they are not. Most versions of @command{awk} treat a directory on the command line as a fatal error. By default, @command{gawk} produces a warning for a directory on the -command line, but otherwise ignores it. If either of the @option{--posix} +command line, but otherwise ignores it. This makes it easier to use +shell wildcards with your @command{awk} program: + +@example +$ @kbd{gawk -f whizprog.awk *} @ii{Directories could kill this progam} +@end example + +If either of the @option{--posix} or @option{--traditional} options is given, then @command{gawk} reverts to treating a directory on the command line as a fatal error. +@xref{Extension Sample Readdir}, for a way to treat directories +as usable data from an @command{awk} program. + @node Printing @chapter Printing Output @@ -8275,7 +8312,7 @@ and discusses the @code{close()} built-in function. @section The @code{print} Statement The @code{print} statement is used for producing output with simple, standardized -formatting. Specify only the strings or numbers to print, in a +formatting. You specify only the strings or numbers to print, in a list separated by commas. They are output, separated by single spaces, followed by a newline. The statement looks like this: @@ -8358,10 +8395,9 @@ $ @kbd{awk '@{ print $1 $2 @}' inventory-shipped} To someone unfamiliar with the @file{inventory-shipped} file, neither example's output makes much sense. A heading line at the beginning would make it clearer. Let's add some headings to our table of months -(@code{$1}) and green crates shipped (@code{$2}). 
We do this using the -@code{BEGIN} pattern -(@pxref{BEGIN/END}) -so that the headings are only printed once: +(@code{$1}) and green crates shipped (@code{$2}). We do this using +a @code{BEGIN} rule (@pxref{BEGIN/END}) so that the headings are only +printed once: @example awk 'BEGIN @{ print "Month Crates" @@ -8687,7 +8723,8 @@ infinity are formatted as @samp{-inf} or @samp{-infinity}, and positive infinity as @samp{inf} and @samp{infinity}. -The special ``not a number'' value formats as @samp{-nan} or @samp{nan}. +The special ``not a number'' value formats as @samp{-nan} or @samp{nan} +(@pxref{General Arithmetic}). @item @code{%F} Like @samp{%f} but the infinity and ``not a number'' values are spelled @@ -8830,7 +8867,7 @@ For example: $ @kbd{cat thousands.awk} @ii{Show source program} @print{} BEGIN @{ printf "%'d\n", 1234567 @} $ @kbd{LC_ALL=C gawk -f thousands.awk} -@print{} 1234567 @ii{Results in "C" locale} +@print{} 1234567 @ii{Results in} "C" @ii{locale} $ @kbd{LC_ALL=en_US.UTF-8 gawk -f thousands.awk} @print{} 1,234,567 @ii{Results in US English UTF locale} @end example @@ -8940,14 +8977,12 @@ This is not particularly easy to read but it does work. @c @cindex lint checks @cindex troubleshooting, fatal errors, @code{printf} format strings @cindex POSIX @command{awk}, @code{printf} format strings and -C programmers may be used to supplying additional -@samp{l}, @samp{L}, and @samp{h} -modifiers in @code{printf} format strings. These are not valid in @command{awk}. -Most @command{awk} implementations silently ignore them. -If @option{--lint} is provided on the command line -(@pxref{Options}), -@command{gawk} warns about their use. If @option{--posix} is supplied, -their use is a fatal error. +C programmers may be used to supplying additional modifiers (@samp{h}, +@samp{j}, @samp{l}, @samp{L}, @samp{t}, and @samp{z}) in @code{printf} +format strings. These are not valid in @command{awk}. Most @command{awk} +implementations silently ignore them. 
If @option{--lint} is provided +on the command line (@pxref{Options}), @command{gawk} warns about their +use. If @option{--posix} is supplied, their use is a fatal error. @c ENDOFRANGE pfm @node Printf Examples @@ -8993,7 +9028,7 @@ they are last on their lines. They don't need to have spaces after them. The table could be made to look even nicer by adding headings to the -tops of the columns. This is done using the @code{BEGIN} pattern +tops of the columns. This is done using a @code{BEGIN} rule (@pxref{BEGIN/END}) so that the headers are only printed once, at the beginning of the @command{awk} program: @@ -9065,7 +9100,7 @@ commands, except that they are written inside the @command{awk} program. @cindex @code{printf} statement, See Also redirection@comma{} of output There are four forms of output redirection: output to a file, output appended to a file, output through a pipe to another command, and output -to a coprocess. They are all shown for the @code{print} statement, +to a coprocess. We show them all for the @code{print} statement, but they work identically for @code{printf}: @table @code @@ -9170,7 +9205,7 @@ This example also illustrates the use of a variable to represent a @var{file} or @var{command}---it is not necessary to always use a string constant. Using a variable is generally a good idea, because (if you mean to refer to that same file or command) -@command{awk} requires that the string value be spelled identically +@command{awk} requires that the string value be written identically every time. @cindex coprocesses @@ -9372,7 +9407,7 @@ terminal at all. Then opening @file{/dev/tty} fails. @command{gawk} provides special file names for accessing the three standard -streams. @value{COMMONEXT}. It also provides syntax for accessing +streams. @value{COMMONEXT} It also provides syntax for accessing any other inherited open files. 
If the file name matches one of these special names when @command{gawk} redirects input or output, then it directly uses the stream that the file name stands for. @@ -9628,15 +9663,16 @@ more importantly, the file descriptor for the pipe is not closed and released until @code{close()} is called or @command{awk} exits. -@code{close()} will silently do nothing if given an argument that +@code{close()} silently does nothing if given an argument that does not represent a file, pipe or coprocess that was opened with -a redirection. +a redirection. In such a case, it returns a negative value, +indicating an error. In addition, @command{gawk} sets @code{ERRNO} +to a string indicating the error. -Note also that @samp{close(FILENAME)} has no -``magic'' effects on the implicit loop that reads through the -files named on the command line. It is, more likely, a close -of a file that was never opened, so @command{awk} silently -does nothing. +Note also that @samp{close(FILENAME)} has no ``magic'' effects on the +implicit loop that reads through the files named on the command line. +It is, more likely, a close of a file that was never opened with a +redirection, so @command{awk} silently does nothing. @cindex @code{|} (vertical bar), @code{|&} operator (I/O), pipes@comma{} closing When using the @samp{|&} operator to communicate with a coprocess, @@ -9665,7 +9701,7 @@ which discusses it in more detail and gives an example. @cindex differences in @command{awk} and @command{gawk}, @code{close()} function @cindex Unix @command{awk}, @code{close()} function and -In many versions of Unix @command{awk}, the @code{close()} function +In many older versions of Unix @command{awk}, the @code{close()} function is actually a statement. It is a syntax error to try and use the return value from @code{close()}: @value{DARKCORNER} @@ -9721,7 +9757,7 @@ when closing a pipe. 
@cindex differences in @command{awk} and @command{gawk}, @code{close()} function @cindex Unix @command{awk}, @code{close()} function and -In many versions of Unix @command{awk}, the @code{close()} function +In many older versions of Unix @command{awk}, the @code{close()} function is actually a statement. It is a syntax error to try and use the return value from @code{close()}: @value{DARKCORNER} @@ -25320,9 +25356,6 @@ It contains the following chapters: @node Advanced Features @chapter Advanced Features of @command{gawk} -@ifset WITH_NETWORK_CHAPTER -@cindex advanced features, network connections, See Also networks@comma{} connections -@end ifset @c STARTOFRANGE gawadv @cindex @command{gawk}, features, advanced @c STARTOFRANGE advgaw diff --git a/doc/gawktexi.in b/doc/gawktexi.in index 440d641b..48ed6bc6 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -942,15 +942,14 @@ particular records in a file and perform operations upon them. @c dedication for Info file @ifinfo -@center To Miriam, for making me complete. +To my parents, for their love, and for the wonderful +example they set for me. @sp 1 -@center To Chana, for the joy you bring us. +To my wife Miriam, for making me complete. +Thank you for building your life together with me. @sp 1 -@center To Rivka, for the exponential increase. -@sp 1 -@center To Nachum, for the added dimension. -@sp 1 -@center To Malka, for the new beginning. +To our children Chana, Rivka, Nachum and Malka, +for enrichening our lives in innumerable ways. @end ifinfo @summarycontents @@ -4302,7 +4301,6 @@ that can be loaded with either @code{@@load} or the @option{-l} option. @node Obsolete @section Obsolete Options and/or Features -@cindex features, advanced, See advanced features @cindex options, deprecated @cindex features, deprecated @cindex obsolete features @@ -5615,6 +5613,14 @@ file is started. Another built-in variable, @code{NR}, records the total number of input records read so far from all data files. 
It starts at zero, but is never automatically reset to zero. +@menu +* awk split records:: How standard @command{awk} splits records. +* gawk split records:: How @command{gawk} splits records. +@end menu + +@node awk split records +@subsection Record Splitting With Standard @command{awk} + @cindex separators, for records @cindex record separators Records are separated by a character called the @dfn{record separator}. @@ -5778,6 +5784,9 @@ After the end of the record has been determined, @command{gawk} sets the variable @code{RT} to the text in the input that matched @code{RS}. +@node gawk split records +@subsection Record Splitting With @command{gawk} + @cindex common extensions, @code{RS} as a regexp @cindex extensions, common@comma{} @code{RS} as a regexp When using @command{gawk}, @@ -5856,7 +5865,6 @@ single record. The only way to make this happen is to give @code{RS} a value that you know doesn't occur in the input file. This is hard to do in a general way, such that a program always works for arbitrary input files. -@c can you say `understatement' boys and girls? You might think that for text files, the @sc{nul} character, which consists of a character with all bits equal to zero, is a good @@ -5869,6 +5877,8 @@ BEGIN @{ RS = "\0" @} # whole file becomes one record? @cindex differences in @command{awk} and @command{gawk}, strings, storing @command{gawk} in fact accepts this, and uses the @sc{nul} character for the record separator. +This works for certain special files, such as @file{/proc/environ} on +GNU/Linux systems, where the @sc{nul} character is in fact the record separator. However, this usage is @emph{not} portable to most other @command{awk} implementations. @@ -5885,11 +5895,9 @@ character as a record separator. 
However, this is a special case: @cindex records, treating files as @cindex treating files, as single records -The best way to treat a whole file as a single record is to -simply read the file in, one record at a time, concatenating each -record onto the end of the previous ones. - -@c @strong{FIXME}: Using @sc{nul} is good for @file{/proc/environ} etc. +@xref{Readfile Function}, for an interesting, portable way to read +whole files. If you are using @command{gawk}, see @ref{Extension Sample +Readfile}, for another option. @end sidebar @c ENDOFRANGE inspl @c ENDOFRANGE recspl @@ -5925,7 +5933,7 @@ simple @command{awk} programs so powerful. @cindex @code{$} (dollar sign), @code{$} field operator @cindex dollar sign (@code{$}), @code{$} field operator @cindex field operators@comma{} dollar sign as -A dollar-sign (@samp{$}) is used +You use a dollar-sign (@samp{$}) to refer to a field in an @command{awk} program, followed by the number of the field you want. Thus, @code{$1} refers to the first field, @code{$2} to the second, and so on. @@ -5956,7 +5964,7 @@ one (such as @code{$8} when the record has only seven fields), you get the empty string. (If used in a numeric operation, you get zero.) The use of @code{$0}, which looks like a reference to the ``zero-th'' field, is -a special case: it represents the whole input record +a special case: it represents the whole input record. Use it when you are not interested in specific fields. Here are some more examples: @@ -5992,7 +6000,7 @@ $ @kbd{awk '/li/ @{ print $1, $NF @}' mail-list} @cindex fields, numbers @cindex field numbers -The number of a field does not need to be a constant. Any expression in +A field number need not be a constant. Any expression in the @command{awk} language can be used after a @samp{$} to refer to a field. The value of the expression specifies the field number. If the value is a string, rather than a number, it is converted to a number. 
@@ -6019,7 +6027,11 @@ its value as the number of the field to print. The @samp{*} sign represents multiplication, so the expression @samp{2*2} evaluates to four. The parentheses are used so that the multiplication is done before the @samp{$} operation; they are necessary whenever there is a binary -operator in the field-number expression. This example, then, prints the +operator@footnote{A @dfn{binary operator}, such as @samp{*} for +multiplication, is one that takes two operands. The distinction +is required, since @command{awk} also has unary (one-operand) +and ternary (three-operand) operators.} +in the field-number expression. This example, then, prints the type of relationship (the fourth field) for every line of the file @file{mail-list}. (All of the @command{awk} operators are listed, in order of decreasing precedence, in @@ -6069,7 +6081,7 @@ Then it prints the original and new values for field three. (Someone in the warehouse made a consistent mistake while inventorying the red boxes.) -For this to work, the text in field @code{$3} must make sense +For this to work, the text in @code{$3} must make sense as a number; the string of characters must be converted to a number for the computer to do arithmetic on it. The number resulting from the subtraction is converted back to a string of characters that @@ -6160,7 +6172,7 @@ $ @kbd{echo a b c d | awk '@{ OFS = ":"; $2 = ""} @end example @noindent -The field is still there; it just has an empty value, denoted by +The field is still there; it just has an empty value, delimited by the two colons between @samp{a} and @samp{c}. 
This example shows what happens if you create a new field: @@ -6852,7 +6864,7 @@ if (PROCINFO["FS"] == "FS") else if (PROCINFO["FS"] == "FIELDWIDTHS") @var{fixed-width field splitting} @dots{} else - @var{content-based field splitting} @dots{} (see next @value{SECTION}) + @var{content-based field splitting} @dots{} @ii{(see next @value{SECTION})} @end example This information is useful when writing a function @@ -6966,7 +6978,7 @@ the double quotes. @command{gawk} provides no way to deal with this. Since there is no formal specification for CSV data, there isn't much more to be done; the @code{FPAT} mechanism provides an elegant solution for the majority -of cases, and the @command{gawk} maintainer is satisfied with that. +of cases, and the @command{gawk} developers are satisfied with that. @end quotation As written, the regexp used for @code{FPAT} requires that each field @@ -7028,7 +7040,7 @@ the first nonblank line that follows---no matter how many blank lines appear in a row, they are considered one record separator. @cindex dark corner, multiline records -There is an important difference between @samp{RS = ""} and +However, there is an important difference between @samp{RS = ""} and @samp{RS = "\n\n+"}. In the first case, leading newlines in the input data file are ignored, and if a file ends without extra blank lines after the last record, the final newline is removed from the record. @@ -7181,7 +7193,19 @@ The @code{getline} command is used in several different ways and should The examples that follow the explanation of the @code{getline} command include material that has not been covered yet. Therefore, come back and study the @code{getline} command @emph{after} you have reviewed the -rest of this @value{DOCUMENT} and have a good knowledge of how @command{awk} works. 
+rest of +@ifinfo +this @value{DOCUMENT} +@end ifinfo +@ifhtml +this @value{DOCUMENT} +@end ifhtml +@ifnotinfo +@ifnothtml +Parts I and II +@end ifnothtml +@end ifnotinfo +and have a good knowledge of how @command{awk} works. @cindex @command{gawk}, @code{ERRNO} variable in @cindex @code{ERRNO} variable, with @command{getline} command @@ -7368,9 +7392,9 @@ changed, resulting in a new value of @code{NF}. According to POSIX, @samp{getline < @var{expression}} is ambiguous if @var{expression} contains unparenthesized operators other than @samp{$}; for example, @samp{getline < dir "/" file} is ambiguous -because the concatenation operator is not parenthesized. You should -write it as @samp{getline < (dir "/" file)} if you want your program -to be portable to all @command{awk} implementations. +because the concatenation operator (not discussed yet; @pxref{Concatenation}) +is not parenthesized. You should write it as @samp{getline < (dir "/" file)} if +you want your program to be portable to all @command{awk} implementations. @node Getline/Variable/File @subsection Using @code{getline} into a Variable from a File @@ -7633,7 +7657,7 @@ However, the new record is tested against any subsequent rules. @cindex @command{awk}, implementations, limits @cindex @command{gawk}, implementation issues, limits @item -Many @command{awk} implementations limit the number of pipelines that an @command{awk} +Some very old @command{awk} implementations limit the number of pipelines that an @command{awk} program may have open to just one. In @command{gawk}, there is no such limit. You can open as many pipelines (and coprocesses) as the underlying operating system permits. @@ -7672,6 +7696,7 @@ can cause @code{FILENAME} to be updated if they cause @command{awk} to start reading a new input file. @item +@cindex Moore, Duncan If the variable being assigned is an expression with side effects, different versions of @command{awk} behave differently upon encountering end-of-file. 
Some versions don't evaluate the expression; many versions @@ -7696,7 +7721,7 @@ end of file is encountered, before the element in @code{a} is assigned? @command{gawk} treats @code{getline} like a function call, and evaluates the expression @samp{a[++c]} before attempting to read from @file{f}. -Other versions of @command{awk} only evaluate the expression once they +However, some versions of @command{awk} only evaluate the expression once they know that there is a string value to be assigned. Caveat Emptor. @end itemize @@ -7732,10 +7757,13 @@ Note: for each variant, @command{gawk} sets the @code{RT} built-in variable. @section Reading Input With A Timeout @cindex timeout, reading input +@cindex differences in @command{awk} and @command{gawk}, read timeouts +This @value{SECTION} describes a feature that is specific to @command{gawk}. + You may specify a timeout in milliseconds for reading input from the keyboard, -pipe or two-way communication including, TCP/IP sockets. This can be done +a pipe, or two-way communication, including TCP/IP sockets. This can be done on a per input, command or connection basis, by setting a special element -in the @code{PROCINFO} array: +in the @code{PROCINFO} (@pxref{Auto-set}) array: @example PROCINFO["input_name", "READ_TIMEOUT"] = @var{timeout in milliseconds} @@ -7765,9 +7793,9 @@ while ((getline < "/dev/stdin") > 0) print $0 @end example -@command{gawk} will terminate the read operation if input does not -arrive after waiting for the timeout period, return failure -and set the @code{ERRNO} variable to an appropriate string value. +@command{gawk} terminates the read operation if input does not +arrive after waiting for the timeout period, returns failure +and sets the @code{ERRNO} variable to an appropriate string value. A negative or zero value for the timeout is the same as specifying no timeout at all. @@ -7839,15 +7867,25 @@ indefinitely until some other process opens it for writing. 
@cindex command line, directories on According to the POSIX standard, files named on the @command{awk} -command line must be text files. It is a fatal error if they are not. +command line must be text files; it is a fatal error if they are not. Most versions of @command{awk} treat a directory on the command line as a fatal error. By default, @command{gawk} produces a warning for a directory on the -command line, but otherwise ignores it. If either of the @option{--posix} +command line, but otherwise ignores it. This makes it easier to use +shell wildcards with your @command{awk} program: + +@example +$ @kbd{gawk -f whizprog.awk *} @ii{Directories could kill this progam} +@end example + +If either of the @option{--posix} or @option{--traditional} options is given, then @command{gawk} reverts to treating a directory on the command line as a fatal error. +@xref{Extension Sample Readdir}, for a way to treat directories +as usable data from an @command{awk} program. + @node Printing @chapter Printing Output @@ -7893,7 +7931,7 @@ and discusses the @code{close()} built-in function. @section The @code{print} Statement The @code{print} statement is used for producing output with simple, standardized -formatting. Specify only the strings or numbers to print, in a +formatting. You specify only the strings or numbers to print, in a list separated by commas. They are output, separated by single spaces, followed by a newline. The statement looks like this: @@ -7976,10 +8014,9 @@ $ @kbd{awk '@{ print $1 $2 @}' inventory-shipped} To someone unfamiliar with the @file{inventory-shipped} file, neither example's output makes much sense. A heading line at the beginning would make it clearer. Let's add some headings to our table of months -(@code{$1}) and green crates shipped (@code{$2}). We do this using the -@code{BEGIN} pattern -(@pxref{BEGIN/END}) -so that the headings are only printed once: +(@code{$1}) and green crates shipped (@code{$2}). 
We do this using +a @code{BEGIN} rule (@pxref{BEGIN/END}) so that the headings are only +printed once: @example awk 'BEGIN @{ print "Month Crates" @@ -8305,7 +8342,8 @@ infinity are formatted as @samp{-inf} or @samp{-infinity}, and positive infinity as @samp{inf} and @samp{infinity}. -The special ``not a number'' value formats as @samp{-nan} or @samp{nan}. +The special ``not a number'' value formats as @samp{-nan} or @samp{nan} +(@pxref{General Arithmetic}). @item @code{%F} Like @samp{%f} but the infinity and ``not a number'' values are spelled @@ -8448,7 +8486,7 @@ For example: $ @kbd{cat thousands.awk} @ii{Show source program} @print{} BEGIN @{ printf "%'d\n", 1234567 @} $ @kbd{LC_ALL=C gawk -f thousands.awk} -@print{} 1234567 @ii{Results in "C" locale} +@print{} 1234567 @ii{Results in} "C" @ii{locale} $ @kbd{LC_ALL=en_US.UTF-8 gawk -f thousands.awk} @print{} 1,234,567 @ii{Results in US English UTF locale} @end example @@ -8558,14 +8596,12 @@ This is not particularly easy to read but it does work. @c @cindex lint checks @cindex troubleshooting, fatal errors, @code{printf} format strings @cindex POSIX @command{awk}, @code{printf} format strings and -C programmers may be used to supplying additional -@samp{l}, @samp{L}, and @samp{h} -modifiers in @code{printf} format strings. These are not valid in @command{awk}. -Most @command{awk} implementations silently ignore them. -If @option{--lint} is provided on the command line -(@pxref{Options}), -@command{gawk} warns about their use. If @option{--posix} is supplied, -their use is a fatal error. +C programmers may be used to supplying additional modifiers (@samp{h}, +@samp{j}, @samp{l}, @samp{L}, @samp{t}, and @samp{z}) in @code{printf} +format strings. These are not valid in @command{awk}. Most @command{awk} +implementations silently ignore them. If @option{--lint} is provided +on the command line (@pxref{Options}), @command{gawk} warns about their +use. If @option{--posix} is supplied, their use is a fatal error. 
@c ENDOFRANGE pfm @node Printf Examples @@ -8611,7 +8647,7 @@ they are last on their lines. They don't need to have spaces after them. The table could be made to look even nicer by adding headings to the -tops of the columns. This is done using the @code{BEGIN} pattern +tops of the columns. This is done using a @code{BEGIN} rule (@pxref{BEGIN/END}) so that the headers are only printed once, at the beginning of the @command{awk} program: @@ -8683,7 +8719,7 @@ commands, except that they are written inside the @command{awk} program. @cindex @code{printf} statement, See Also redirection@comma{} of output There are four forms of output redirection: output to a file, output appended to a file, output through a pipe to another command, and output -to a coprocess. They are all shown for the @code{print} statement, +to a coprocess. We show them all for the @code{print} statement, but they work identically for @code{printf}: @table @code @@ -8788,7 +8824,7 @@ This example also illustrates the use of a variable to represent a @var{file} or @var{command}---it is not necessary to always use a string constant. Using a variable is generally a good idea, because (if you mean to refer to that same file or command) -@command{awk} requires that the string value be spelled identically +@command{awk} requires that the string value be written identically every time. @cindex coprocesses @@ -8952,7 +8988,7 @@ terminal at all. Then opening @file{/dev/tty} fails. @command{gawk} provides special file names for accessing the three standard -streams. @value{COMMONEXT}. It also provides syntax for accessing +streams. @value{COMMONEXT} It also provides syntax for accessing any other inherited open files. If the file name matches one of these special names when @command{gawk} redirects input or output, then it directly uses the stream that the file name stands for. 
@@ -9208,15 +9244,16 @@ more importantly, the file descriptor for the pipe is not closed and released until @code{close()} is called or @command{awk} exits. -@code{close()} will silently do nothing if given an argument that +@code{close()} silently does nothing if given an argument that does not represent a file, pipe or coprocess that was opened with -a redirection. +a redirection. In such a case, it returns a negative value, +indicating an error. In addition, @command{gawk} sets @code{ERRNO} +to a string indicating the error. -Note also that @samp{close(FILENAME)} has no -``magic'' effects on the implicit loop that reads through the -files named on the command line. It is, more likely, a close -of a file that was never opened, so @command{awk} silently -does nothing. +Note also that @samp{close(FILENAME)} has no ``magic'' effects on the +implicit loop that reads through the files named on the command line. +It is, more likely, a close of a file that was never opened with a +redirection, so @command{awk} silently does nothing. @cindex @code{|} (vertical bar), @code{|&} operator (I/O), pipes@comma{} closing When using the @samp{|&} operator to communicate with a coprocess, @@ -9240,7 +9277,7 @@ which discusses it in more detail and gives an example. @cindex differences in @command{awk} and @command{gawk}, @code{close()} function @cindex Unix @command{awk}, @code{close()} function and -In many versions of Unix @command{awk}, the @code{close()} function +In many older versions of Unix @command{awk}, the @code{close()} function is actually a statement. 
It is a syntax error to try and use the return value from @code{close()}: @value{DARKCORNER} @@ -24461,9 +24498,6 @@ It contains the following chapters: @node Advanced Features @chapter Advanced Features of @command{gawk} -@ifset WITH_NETWORK_CHAPTER -@cindex advanced features, network connections, See Also networks@comma{} connections -@end ifset @c STARTOFRANGE gawadv @cindex @command{gawk}, features, advanced @c STARTOFRANGE advgaw |