-rw-r--r-- | awklib/eg/lib/ctime.awk | 3 |
-rw-r--r-- | awklib/eg/lib/ftrans.awk | 2 |
-rw-r--r-- | awklib/eg/lib/gettime.awk | 2 |
-rw-r--r-- | awklib/eg/lib/quicksort.awk | 2 |
-rw-r--r-- | awklib/eg/lib/strtonum.awk | 2 |
-rw-r--r-- | awklib/eg/misc/arraymax.awk | 10 |
-rw-r--r-- | awklib/eg/misc/findpat.awk | 13 |
-rw-r--r-- | doc/ChangeLog | 4 |
-rw-r--r-- | doc/gawk.info | 1734 |
-rw-r--r-- | doc/gawk.texi | 704 |
-rw-r--r-- | doc/gawktexi.in | 636 |
11 files changed, 1597 insertions, 1515 deletions
diff --git a/awklib/eg/lib/ctime.awk b/awklib/eg/lib/ctime.awk index ca750370..cea25b7a 100644 --- a/awklib/eg/lib/ctime.awk +++ b/awklib/eg/lib/ctime.awk @@ -4,7 +4,8 @@ function ctime(ts, format) { - format = PROCINFO["strftime"] + format = "%a %b %e %H:%M:%S %Z %Y" + if (ts == 0) ts = systime() # use current time as default return strftime(format, ts) diff --git a/awklib/eg/lib/ftrans.awk b/awklib/eg/lib/ftrans.awk index 1709ac82..2fec27ef 100644 --- a/awklib/eg/lib/ftrans.awk +++ b/awklib/eg/lib/ftrans.awk @@ -12,4 +12,4 @@ FNR == 1 { beginfile(FILENAME) } -END { endfile(_filename_) } +END { endfile(_filename_) } diff --git a/awklib/eg/lib/gettime.awk b/awklib/eg/lib/gettime.awk index 3da9c8ab..4cb56330 100644 --- a/awklib/eg/lib/gettime.awk +++ b/awklib/eg/lib/gettime.awk @@ -31,7 +31,7 @@ function getlocaltime(time, ret, now, i) now = systime() # return date(1)-style output - ret = strftime(PROCINFO["strftime"], now) + ret = strftime("%a %b %e %H:%M:%S %Z %Y", now) # clear out target array delete time diff --git a/awklib/eg/lib/quicksort.awk b/awklib/eg/lib/quicksort.awk index 43357ac6..3ba2d6e3 100644 --- a/awklib/eg/lib/quicksort.awk +++ b/awklib/eg/lib/quicksort.awk @@ -26,7 +26,7 @@ function quicksort(data, left, right, less_than, i, last) # quicksort_swap --- helper function for quicksort, should really be inline -function quicksort_swap(data, i, j, temp) +function quicksort_swap(data, i, j, temp) { temp = data[i] data[i] = data[j] diff --git a/awklib/eg/lib/strtonum.awk b/awklib/eg/lib/strtonum.awk index f82c89c5..cd56a449 100644 --- a/awklib/eg/lib/strtonum.awk +++ b/awklib/eg/lib/strtonum.awk @@ -51,7 +51,7 @@ function mystrtonum(str, ret, n, i, k, c) # a[5] = "123.45" # a[6] = "1.e3" # a[7] = "1.32" -# a[7] = "1.32E2" +# a[8] = "1.32E2" # # for (i = 1; i in a; i++) # print a[i], strtonum(a[i]), mystrtonum(a[i]) diff --git a/awklib/eg/misc/arraymax.awk b/awklib/eg/misc/arraymax.awk index 20dd1768..64197f56 100644 --- a/awklib/eg/misc/arraymax.awk +++ 
b/awklib/eg/misc/arraymax.awk @@ -1,10 +1,10 @@ { - if ($1 > max) - max = $1 - arr[$1] = $0 + if ($1 > max) + max = $1 + arr[$1] = $0 } END { - for (x = 1; x <= max; x++) - print arr[x] + for (x = 1; x <= max; x++) + print arr[x] } diff --git a/awklib/eg/misc/findpat.awk b/awklib/eg/misc/findpat.awk index e9bef9ea..9d799434 100644 --- a/awklib/eg/misc/findpat.awk +++ b/awklib/eg/misc/findpat.awk @@ -1,10 +1,9 @@ { - if ($1 == "FIND") - regex = $2 - else { - where = match($0, regex) - if (where != 0) - print "Match of", regex, "found at", - where, "in", $0 + if ($1 == "FIND") + regex = $2 + else { + where = match($0, regex) + if (where != 0) + print "Match of", regex, "found at", where, "in", $0 } } diff --git a/doc/ChangeLog b/doc/ChangeLog index 01f9b3ed..0c68e072 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,3 +1,7 @@ +2014-09-22 Arnold D. Robbins <arnold@skeeve.com> + + * gawktex.in: Continue fixes after reading through the MS. + 2014-09-21 Arnold D. Robbins <arnold@skeeve.com> * gawktex.in: Start on fixes after reading through the MS. diff --git a/doc/gawk.info b/doc/gawk.info index c4809a63..fa77321b 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -350,12 +350,12 @@ entitled "GNU Free Documentation License". elements. * Controlling Scanning:: Controlling the order in which arrays are scanned. -* Delete:: The `delete' statement removes an - element from an array. * Numeric Array Subscripts:: How to use numbers as subscripts in `awk'. * Uninitialized Subscripts:: Using Uninitialized variables as subscripts. +* Delete:: The `delete' statement removes an + element from an array. * Multidimensional:: Emulating multidimensional arrays in `awk'. * Multiscanning:: Scanning multidimensional arrays. @@ -7361,7 +7361,7 @@ and: are exactly equivalent. 
One rather bizarre consequence of this rule is that the following Boolean expression is valid, but does not do what -the user probably intended: +its author probably intended: # Note that /foo/ is on the left of the ~ if (/foo/ ~ $1) print "found foo" @@ -7387,9 +7387,10 @@ of the `match()' function, and as the third argument of the `split()' and `patsplit()' functions (*note String Functions::). Modern implementations of `awk', including `gawk', allow the third argument of `split()' to be a regexp constant, but some older implementations do -not. (d.c.) This can lead to confusion when attempting to use regexp -constants as arguments to user-defined functions (*note User-defined::). -For example: +not. (d.c.) Because some built-in functions accept regexp constants +as arguments, it can be confusing when attempting to use regexp +constants as arguments to user-defined functions (*note +User-defined::). For example: function mysub(pat, repl, str, global) { @@ -7453,7 +7454,7 @@ variable's current value. Variables are given new values with "assignment operators", "increment operators", and "decrement operators". *Note Assignment Ops::. In addition, the `sub()' and `gsub()' functions can change a variable's value, and the `match()', -`patsplit()' and `split()' functions can change the contents of their +`split()' and `patsplit()' functions can change the contents of their array parameters. *Note String Functions::. A few variables have special built-in meanings, such as `FS' (the @@ -7468,8 +7469,8 @@ uppercase. The kind of value a variable holds can change over the life of a program. By default, variables are initialized to the empty string, which is zero if converted to a number. There is no need to explicitly -"initialize" a variable in `awk', which is what you would do in C and -in most other traditional languages. +initialize a variable in `awk', which is what you would do in C and in +most other traditional languages. 
File: gawk.info, Node: Assignment Options, Prev: Using Variables, Up: Variables @@ -7644,7 +7645,7 @@ difference in behavior, on a GNU/Linux system: The `en_DK.utf-8' locale is for English in Denmark, where the comma acts as the decimal point separator. In the normal `"C"' locale, `gawk' -treats `4,321' as `4', while in the Danish locale, it's treated as the +treats `4,321' as 4, while in the Danish locale, it's treated as the full number, 4.321. Some earlier versions of `gawk' fully complied with this aspect of @@ -8027,8 +8028,7 @@ A workaround is: awk '/[=]=/' /dev/null - `gawk' does not have this problem; BWK `awk' and `mawk' also do not -(*note Other Versions::). + `gawk' does not have this problem; BWK `awk' and `mawk' also do not. File: gawk.info, Node: Increment Ops, Prev: Assignment Ops, Up: All Operators @@ -8205,9 +8205,9 @@ determine how they are compared. Variable typing follows these rules: STRING attribute. * Fields, `getline' input, `FILENAME', `ARGV' elements, `ENVIRON' - elements, and the elements of an array created by `patsplit()', - `split()' and `match()' that are numeric strings have the STRNUM - attribute. Otherwise, they have the STRING attribute. + elements, and the elements of an array created by `match()', + `split()' and `patsplit()' that are numeric strings have the + STRNUM attribute. Otherwise, they have the STRING attribute. Uninitialized variables also have the STRNUM attribute. * Attributes propagate across assignments but are not changed by any @@ -8257,21 +8257,21 @@ In contrast, the eight characters `" +3.14"' appearing in program text comprise a string constant. 
The following examples print `1' when the comparison between the two different constants is true, `0' otherwise: - $ echo ' +3.14' | gawk '{ print $0 == " +3.14" }' True + $ echo ' +3.14' | awk '{ print($0 == " +3.14") }' True -| 1 - $ echo ' +3.14' | gawk '{ print $0 == "+3.14" }' False + $ echo ' +3.14' | awk '{ print($0 == "+3.14") }' False -| 0 - $ echo ' +3.14' | gawk '{ print $0 == "3.14" }' False + $ echo ' +3.14' | awk '{ print($0 == "3.14") }' False -| 0 - $ echo ' +3.14' | gawk '{ print $0 == 3.14 }' True + $ echo ' +3.14' | awk '{ print($0 == 3.14) }' True -| 1 - $ echo ' +3.14' | gawk '{ print $1 == " +3.14" }' False + $ echo ' +3.14' | awk '{ print($1 == " +3.14") }' False -| 0 - $ echo ' +3.14' | gawk '{ print $1 == "+3.14" }' True + $ echo ' +3.14' | awk '{ print($1 == "+3.14") }' True -| 1 - $ echo ' +3.14' | gawk '{ print $1 == "3.14" }' False + $ echo ' +3.14' | awk '{ print($1 == "3.14") }' False -| 0 - $ echo ' +3.14' | gawk '{ print $1 == 3.14 }' True + $ echo ' +3.14' | awk '{ print($1 == 3.14) }' True -| 1 @@ -8324,8 +8324,9 @@ Unless `b' happens to be zero or the null string, the `if' part of the test always succeeds. Because the operators are so similar, this kind of error is very difficult to spot when scanning the source code. - The following list of expressions illustrates the kind of comparison -`gawk' performs, as well as what the result of the comparison is: + The following list of expressions illustrates the kinds of +comparisons `awk' performs, as well as what the result of each +comparison is: `1.5 <= 2.0' numeric comparison (true) @@ -8376,9 +8377,9 @@ regexp constant (`/'...`/') or an ordinary expression. In the latter case, the value of the expression as a string is used as a dynamic regexp (*note Regexp Usage::; also *note Computed Regexps::). - In modern implementations of `awk', a constant regular expression in -slashes by itself is also an expression. 
The regexp `/REGEXP/' is an -abbreviation for the following comparison expression: + A constant regular expression in slashes by itself is also an +expression. The regexp `/REGEXP/' is an abbreviation for the following +comparison expression: $0 ~ /REGEXP/ @@ -8394,9 +8395,9 @@ File: gawk.info, Node: POSIX String Comparison, Prev: Comparison Operators, U The POSIX standard says that string comparison is performed based on the locale's "collating order". This is the order in which characters -sort, as defined by the locale (for more discussion, *note Ranges and -Locales::). This order is usually very different from the results -obtained when doing straight character-by-character comparison.(1) +sort, as defined by the locale (for more discussion, *note Locales::). +This order is usually very different from the results obtained when +doing straight character-by-character comparison.(1) Because this behavior differs considerably from existing practice, `gawk' only implements it when in POSIX mode (*note Options::). Here @@ -8453,13 +8454,15 @@ Boolean operators are: `BOOLEAN1 || BOOLEAN2' True if at least one of BOOLEAN1 or BOOLEAN2 is true. For example, the following statement prints all records in the input - that contain _either_ `edu' or `li' or both: + that contain _either_ `edu' or `li': if ($0 ~ /edu/ || $0 ~ /li/) print The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is false. This can make a difference when BOOLEAN2 contains expressions that - have side effects. + have side effects. (Thus, this test never really distinguishes + records that contain both `edu' and `li'--as soon as `edu' is + matched, the full test succeeds.) `! BOOLEAN' True if BOOLEAN is false. For example, the following program @@ -8467,7 +8470,7 @@ Boolean operators are: variable is not defined: BEGIN { if (! ("HOME" in ENVIRON)) - print "no home!" } + print "no home!" } (The `in' operator is described in *note Reference to Elements::.) 
@@ -8765,8 +8768,8 @@ system about the local character set and language. The ISO C standard defines a default `"C"' locale, which is an environment that is typical of what many C programmers are used to. - Once upon a time, the locale setting used to affect regexp matching -(*note Ranges and Locales::), but this is no longer true. + Once upon a time, the locale setting used to affect regexp matching, +but this is no longer true (*note Ranges and Locales::). Locales can affect record splitting. For the normal case of `RS = "\n"', the locale is largely irrelevant. For other single-character @@ -8818,10 +8821,11 @@ File: gawk.info, Node: Expressions Summary, Prev: Locales, Up: Expressions * `awk' provides the usual arithmetic operators (addition, subtraction, multiplication, division, modulus), and unary plus and minus. It also provides comparison operators, boolean - operators, and regexp matching operators. String concatenation is - accomplished by placing two expressions next to each other; there - is no explicit operator. The three-operand `?:' operator provides - an "if-else" test within expressions. + operators, array membership testing, and regexp matching + operators. String concatenation is accomplished by placing two + expressions next to each other; there is no explicit operator. + The three-operand `?:' operator provides an "if-else" test within + expressions. * Assignment operators provide convenient shorthands for common arithmetic operations. @@ -8829,8 +8833,8 @@ File: gawk.info, Node: Expressions Summary, Prev: Locales, Up: Expressions * In `awk', a value is considered to be true if it is non-zero _or_ non-null. Otherwise, the value is false. - * A value's type is set upon each assignment and may change over its - lifetime. The type determines how it behaves in comparisons + * A variable's type is set upon each assignment and may change over + its lifetime. The type determines how it behaves in comparisons (string or numeric). 
* Function calls return a value which may be used as part of a larger @@ -8901,7 +8905,7 @@ summary of the types of `awk' patterns: number) or non-null (if a string). (*Note Expression Patterns::.) `BEGPAT, ENDPAT' - A pair of patterns separated by a comma, specifying a range of + A pair of patterns separated by a comma, specifying a "range" of records. The range includes both the initial record that matches BEGPAT and the final record that matches ENDPAT. (*Note Ranges::.) @@ -9112,7 +9116,7 @@ input is read. For example: $ awk ' > BEGIN { print "Analysis of \"li\"" } - > /li/ { ++n } + > /li/ { ++n } > END { print "\"li\" appears in", n, "records." }' mail-list -| Analysis of "li" -| "li" appears in 4 records. @@ -9181,9 +9185,10 @@ and `NF' were _undefined_ inside an `END' rule. The POSIX standard specifies that `NF' is available in an `END' rule. It contains the number of fields from the last input record. Most probably due to an oversight, the standard does not say that `$0' is also preserved, -although logically one would think that it should be. In fact, `gawk' -does preserve the value of `$0' for use in `END' rules. Be aware, -however, that BWK `awk', and possibly other implementations, do not. +although logically one would think that it should be. In fact, all of +BWK `awk', `mawk', and `gawk' preserve the value of `$0' for use in +`END' rules. Be aware, however, that some other implementations and +many older versions of Unix `awk' do not. The third point follows from the first two. The meaning of `print' inside a `BEGIN' or `END' rule is the same as always: `print $0'. If @@ -9252,9 +9257,9 @@ makes it possible to catch and process I/O errors at the level of the `awk' program. The `next' statement (*note Next Statement::) is not allowed inside -either a `BEGINFILE' or and `ENDFILE' rule. The `nextfile' statement -(*note Nextfile Statement::) is allowed only inside a `BEGINFILE' rule, -but not inside an `ENDFILE' rule. 
+either a `BEGINFILE' or an `ENDFILE' rule. The `nextfile' statement is +allowed only inside a `BEGINFILE' rule, but not inside an `ENDFILE' +rule. The `getline' statement (*note Getline::) is restricted inside both `BEGINFILE' and `ENDFILE': only redirected forms of `getline' are @@ -9289,9 +9294,9 @@ hold a pattern that the `awk' program searches for. There are two ways to get the value of the shell variable into the body of the `awk' program. - The most common method is to use shell quoting to substitute the -variable's value into the program inside the script. For example, -consider the following program: + A common method is to use shell quoting to substitute the variable's +value into the program inside the script. For example, consider the +following program: printf "Enter search pattern: " read pattern @@ -9482,18 +9487,18 @@ thing the `while' statement does is test the CONDITION. If the CONDITION is true, it executes the statement BODY. (The CONDITION is true when the value is not zero and not a null string.) After BODY has been executed, CONDITION is tested again, and if it is still true, BODY -is executed again. This process repeats until the CONDITION is no -longer true. If the CONDITION is initially false, the body of the loop -is never executed and `awk' continues with the statement following the -loop. This example prints the first three fields of each record, one -per line: - - awk '{ - i = 1 - while (i <= 3) { - print $i - i++ - } +executes again. This process repeats until the CONDITION is no longer +true. If the CONDITION is initially false, the body of the loop never +executes and `awk' continues with the statement following the loop. +This example prints the first three fields of each record, one per line: + + awk ' + { + i = 1 + while (i <= 3) { + print $i + i++ + } }' inventory-shipped The body of this loop is a compound statement enclosed in braces, @@ -9524,22 +9529,22 @@ the CONDITION is true. 
It looks like this: BODY while (CONDITION) - Even if the CONDITION is false at the start, the BODY is executed at + Even if the CONDITION is false at the start, the BODY executes at least once (and only once, unless executing BODY makes CONDITION true). Contrast this with the corresponding `while' statement: while (CONDITION) - BODY + BODY This statement does not execute BODY even once if the CONDITION is false to begin with. The following is an example of a `do' statement: { - i = 1 - do { - print $0 - i++ - } while (i <= 10) + i = 1 + do { + print $0 + i++ + } while (i <= 10) } This program prints each input record 10 times. However, it isn't a @@ -9568,9 +9573,10 @@ INCREMENT. Typically, INITIALIZATION sets a variable to either zero or one, INCREMENT adds one to it, and CONDITION compares it against the desired number of iterations. For example: - awk '{ - for (i = 1; i <= 3; i++) - print $i + awk ' + { + for (i = 1; i <= 3; i++) + print $i }' inventory-shipped This prints the first three fields of each input record, with one field @@ -9594,7 +9600,7 @@ whatsoever. For example, the following statement prints all the powers of two between 1 and 100: for (i = 1; i <= 100; i *= 2) - print i + print i If there is nothing to be done, any of the three expressions in the parentheses following the `for' keyword may be omitted. Thus, @@ -9852,11 +9858,11 @@ rules. *Note BEGINFILE/ENDFILE::. According to the POSIX standard, the behavior is undefined if the `next' statement is used in a `BEGIN' or `END' rule. `gawk' treats it -as a syntax error. Although POSIX permits it, most other `awk' -implementations don't allow the `next' statement inside function bodies -(*note User-defined::). Just as with any other `next' statement, a -`next' statement inside a function body reads the next record and -starts processing it with the first rule in the program. +as a syntax error. 
Although POSIX does not disallow it, most other +`awk' implementations don't allow the `next' statement inside function +bodies (*note User-defined::). Just as with any other `next' +statement, a `next' statement inside a function body reads the next +record and starts processing it with the first rule in the program. File: gawk.info, Node: Nextfile Statement, Next: Exit Statement, Prev: Next Statement, Up: Statements @@ -9900,17 +9906,17 @@ files, pipes, and coprocesses that are opened with redirections. It is not related to the main processing that `awk' does with the files listed in `ARGV'. - NOTE: For many years, `nextfile' was a `gawk' extension. As of + NOTE: For many years, `nextfile' was a common extension. In September, 2012, it was accepted for inclusion into the POSIX standard. See the Austin Group website (http://austingroupbugs.net/view.php?id=607). - The current version of BWK `awk', and `mawk' (*note Other -Versions::) also support `nextfile'. However, they don't allow the -`nextfile' statement inside function bodies (*note User-defined::). -`gawk' does; a `nextfile' inside a function body reads the next record -and starts processing it with the first rule in the program, just as -any other `nextfile' statement. + The current version of BWK `awk', and `mawk' also support +`nextfile'. However, they don't allow the `nextfile' statement inside +function bodies (*note User-defined::). `gawk' does; a `nextfile' +inside a function body reads the next record and starts processing it +with the first rule in the program, just as any other `nextfile' +statement. File: gawk.info, Node: Exit Statement, Prev: Nextfile Statement, Up: Statements @@ -9934,8 +9940,8 @@ stop immediately. An `exit' statement that is not part of a `BEGIN' or `END' rule stops the execution of any further automatic rules for the current record, skips reading any remaining input records, and executes the -`END' rule if there is one. 
Any `ENDFILE' rules are also skipped; they -are not executed. +`END' rule if there is one. `gawk' also skips any `ENDFILE' rules; +they do not execute. In such a case, if you don't want the `END' rule to do its job, set a variable to nonzero before the `exit' statement and check that @@ -10022,7 +10028,7 @@ description of each variable.) use binary I/O. Any other string value is treated the same as `"rw"', but causes `gawk' to generate a warning message. `BINMODE' is described in more detail in *note PC Using::. `mawk' - *note Other Versions::), also supports this variable, but only + (*note Other Versions::), also supports this variable, but only using numeric values. ``CONVFMT'' @@ -10105,9 +10111,8 @@ description of each variable.) printing with the `print' statement. It works by being passed as the first argument to the `sprintf()' function (*note String Functions::). Its default value is `"%.6g"'. Earlier versions of - `awk' also used `OFMT' to specify the format for converting - numbers to strings in general expressions; this is now done by - `CONVFMT'. + `awk' used `OFMT' to specify the format for converting numbers to + strings in general expressions; this is now done by `CONVFMT'. `OFS' This is the output field separator (*note Output Separators::). @@ -10216,8 +10221,8 @@ Options::), they are not special. the command line. While you can change the value of `ARGIND' within your `awk' - program, `gawk' automatically sets it to a new value when the next - file is opened. + program, `gawk' automatically sets it to a new value when it opens + the next file. `ENVIRON' An associative array containing the values of the environment. @@ -10259,9 +10264,9 @@ Options::), they are not special. Getline::) inside a `BEGIN' rule can give `FILENAME' a value. `FNR' - The current record number in the current file. `FNR' is - incremented each time a new record is read (*note Records::). It - is reinitialized to zero each time a new input file is started. 
+ The current record number in the current file. `awk' increments + `FNR' each time it reads a new record (*note Records::). `awk' + resets `FNR' to zero each time it starts a new input file. `NF' The number of fields in the current input record. `NF' is set @@ -10285,8 +10290,8 @@ Options::), they are not special. `NR' The number of input records `awk' has processed since the - beginning of the program's execution (*note Records::). `NR' is - incremented each time a new record is read. + beginning of the program's execution (*note Records::). `awk' + increments `NR' each time it reads a new record. `PROCINFO #' The elements of this array provide access to information about the @@ -10351,7 +10356,7 @@ Options::), they are not special. `PROCINFO["sorted_in"]' If this element exists in `PROCINFO', its value controls the - order in which array indices will be processed by `for (INDEX + order in which array indices will be processed by `for (INDX in ARRAY)' loops. Since this is an advanced feature, we defer the full description until later; see *note Scanning an Array::. @@ -10369,7 +10374,7 @@ Options::), they are not special. The following additional elements in the array are available to provide information about the MPFR and GMP libraries if your - version of `gawk' supports arbitrary precision numbers (*note + version of `gawk' supports arbitrary precision arithmetic (*note Arbitrary Precision Arithmetic::): `PROCINFO["mpfr_version"]' @@ -10402,14 +10407,14 @@ Options::), they are not special. The `PROCINFO' array has the following additional uses: - * It may be used to cause coprocesses to communicate over - pseudo-ttys instead of through two-way pipes; this is - discussed further in *note Two-way I/O::. - * It may be used to provide a timeout when reading from any open input file, pipe, or coprocess. *Note Read Timeout::, for more information. 
+ * It may be used to cause coprocesses to communicate over + pseudo-ttys instead of through two-way pipes; this is + discussed further in *note Two-way I/O::. + `RLENGTH' The length of the substring matched by the `match()' function (*note String Functions::). `RLENGTH' is set by invoking the @@ -10598,6 +10603,12 @@ Because `-q' is not a valid `gawk' option, it and the following `-v' are passed on to the `awk' program. (*Note Getopt Function::, for an `awk' library function that parses command-line options.) + When designing your program, you should choose options that don't +conflict with `gawk''s, since it will process any options that it +accepts before passing the rest of the command line on to your program. +Using `#!' with the `-E' option may help (*note Executable Scripts::, +and *note Options::). + File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: Patterns and Actions @@ -10627,8 +10638,8 @@ File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: * The control statements in `awk' are `if'-`else', `while', `for', and `do'-`while'. `gawk' adds the `switch' statement. There are - two flavors of `for' statement: one for for performing general - looping, and the other iterating through an array. + two flavors of `for' statement: one for performing general + looping, and the other for iterating through an array. * `break' and `continue' let you exit early or start the next iteration of a loop (or get out of a `switch'). @@ -10640,12 +10651,16 @@ File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: * The `exit' statement terminates your program. When executed from an action (or function body) it transfers control to the `END' statements. From an `END' statement body, it exits immediately. - You may pass an optional numeric value to be used at `awk''s exit + You may pass an optional numeric value to be used as `awk''s exit status. 
* Some built-in variables provide control over `awk', mainly for I/O. Other variables convey information from `awk' to your program. + * `ARGC' and `ARGV' make the command-line arguments available to + your program. Manipulating them from a `BEGIN' rule lets you + control how `awk' will process the provided data files. + File: gawk.info, Node: Arrays, Next: Functions, Prev: Patterns and Actions, Up: Top @@ -10665,26 +10680,21 @@ about array usage. The major node moves on to discuss `gawk''s facility for sorting arrays, and ends with a brief description of `gawk''s ability to support true arrays of arrays. - `awk' maintains a single set of names that may be used for naming -variables, arrays, and functions (*note User-defined::). Thus, you -cannot have a variable and an array with the same name in the same -`awk' program. - * Menu: * Array Basics:: The basics of arrays. -* Delete:: The `delete' statement removes an element - from an array. * Numeric Array Subscripts:: How to use numbers as subscripts in `awk'. * Uninitialized Subscripts:: Using Uninitialized variables as subscripts. +* Delete:: The `delete' statement removes an element + from an array. * Multidimensional:: Emulating multidimensional arrays in `awk'. * Arrays of Arrays:: True multidimensional arrays. * Arrays Summary:: Summary of arrays. -File: gawk.info, Node: Array Basics, Next: Delete, Up: Arrays +File: gawk.info, Node: Array Basics, Next: Numeric Array Subscripts, Up: Arrays 8.1 The Basics of Arrays ======================== @@ -10903,14 +10913,14 @@ encountering repeated numbers, gaps, or lines that don't begin with a number: { - if ($1 > max) - max = $1 - arr[$1] = $0 + if ($1 > max) + max = $1 + arr[$1] = $0 } END { - for (x = 1; x <= max; x++) - print arr[x] + for (x = 1; x <= max; x++) + print arr[x] } The first rule keeps track of the largest line number seen so far; @@ -10938,9 +10948,9 @@ overrides the others. 
Gaps in the line numbers can be handled with an easy improvement to the program's `END' rule, as follows: END { - for (x = 1; x <= max; x++) - if (x in arr) - print arr[x] + for (x = 1; x <= max; x++) + if (x in arr) + print arr[x] } @@ -10958,7 +10968,7 @@ lowest index up to the highest. This technique won't do the job in has a special kind of `for' statement for scanning an array: for (VAR in ARRAY) - BODY + BODY This loop executes BODY once for each index in ARRAY that the program has previously used, with the variable VAR set to that index. @@ -11015,7 +11025,7 @@ all `awk' versions do so. Consider this program, named `loopcheck.awk': } } - Here is what happens when run with `gawk': + Here is what happens when run with `gawk' (and `mawk'): $ gawk -f loopcheck.awk -| here @@ -11118,7 +11128,8 @@ available: to run. Changing `PROCINFO["sorted_in"]' in the loop body does not affect the loop. For example: - $ gawk 'BEGIN { + $ gawk ' + > BEGIN { > a[4] = 4 > a[3] = 3 > for (i in a) @@ -11126,7 +11137,8 @@ affect the loop. For example: > }' -| 4 4 -| 3 3 - $ gawk 'BEGIN { + $ gawk ' + > BEGIN { > PROCINFO["sorted_in"] = "@ind_str_asc" > a[4] = 4 > a[3] = 3 @@ -11178,87 +11190,9 @@ ordering when the numeric values are equal ensures that `gawk' behaves consistently across different environments. -File: gawk.info, Node: Delete, Next: Numeric Array Subscripts, Prev: Array Basics, Up: Arrays - -8.2 The `delete' Statement -========================== - -To remove an individual element of an array, use the `delete' statement: - - delete ARRAY[INDEX-EXPRESSION] - - Once an array element has been deleted, any value the element once -had is no longer available. It is as if the element had never been -referred to or been given a value. The following is an example of -deleting elements in an array: - - for (i in frequencies) - delete frequencies[i] - -This example removes all the elements from the array `frequencies'. 
-Once an element is deleted, a subsequent `for' statement to scan the -array does not report that element and the `in' operator to check for -the presence of that element returns zero (i.e., false): - - delete foo[4] - if (4 in foo) - print "This will never be printed" - - It is important to note that deleting an element is _not_ the same -as assigning it a null value (the empty string, `""'). For example: - - foo[4] = "" - if (4 in foo) - print "This is printed, even though foo[4] is empty" - - It is not an error to delete an element that does not exist. -However, if `--lint' is provided on the command line (*note Options::), -`gawk' issues a warning message when an element that is not in the -array is deleted. - - All the elements of an array may be deleted with a single statement -by leaving off the subscript in the `delete' statement, as follows: - - delete ARRAY - - Using this version of the `delete' statement is about three times -more efficient than the equivalent loop that deletes each element one -at a time. - - NOTE: For many years, using `delete' without a subscript was a - `gawk' extension. As of September, 2012, it was accepted for - inclusion into the POSIX standard. See the Austin Group website - (http://austingroupbugs.net/view.php?id=544). This form of the - `delete' statement is also supported by BWK `awk' and `mawk', as - well as by a number of other implementations (*note Other - Versions::). - - The following statement provides a portable but nonobvious way to -clear out an array:(1) - - split("", array) - - The `split()' function (*note String Functions::) clears out the -target array first. This call asks it to split apart the null string. -Because there is no data to split out, the function simply clears the -array and then returns. - - CAUTION: Deleting an array does not change its type; you cannot - delete an array and then use the array's name as a scalar (i.e., a - regular variable). 
For example, the following does not work: - - a[1] = 3 - delete a - a = 3 - - ---------- Footnotes ---------- - - (1) Thanks to Michael Brennan for pointing this out. - - -File: gawk.info, Node: Numeric Array Subscripts, Next: Uninitialized Subscripts, Prev: Delete, Up: Arrays +File: gawk.info, Node: Numeric Array Subscripts, Next: Uninitialized Subscripts, Prev: Array Basics, Up: Arrays -8.3 Using Numbers to Subscript Arrays +8.2 Using Numbers to Subscript Arrays ===================================== An important aspect to remember about arrays is that _array subscripts @@ -11287,9 +11221,9 @@ two significant digits. This test fails, since `"12.15"' is different from `"12.153"'. According to the rules for conversions (*note Conversion::), integer -values are always converted to strings as integers, no matter what the -value of `CONVFMT' may happen to be. So the usual case of the -following works: +values always convert to strings as integers, no matter what the value +of `CONVFMT' may happen to be. So the usual case of the following +works: for (i = 1; i <= maxsub; i++) do something with array[i] @@ -11302,14 +11236,14 @@ example, that `array[17]', `array[021]', and `array[0x11]' all refer to the same element! As with many things in `awk', the majority of the time things work -as one would expect them to. But it is useful to have a precise +as you would expect them to. But it is useful to have a precise knowledge of the actual rules since they can sometimes have a subtle effect on your programs. 
-File: gawk.info, Node: Uninitialized Subscripts, Next: Multidimensional, Prev: Numeric Array Subscripts, Up: Arrays +File: gawk.info, Node: Uninitialized Subscripts, Next: Delete, Prev: Numeric Array Subscripts, Up: Arrays -8.4 Using Uninitialized Variables as Subscripts +8.3 Using Uninitialized Variables as Subscripts =============================================== Suppose it's necessary to write a program to print the input data in @@ -11355,7 +11289,86 @@ string as a subscript if `--lint' is provided on the command line (*note Options::). -File: gawk.info, Node: Multidimensional, Next: Arrays of Arrays, Prev: Uninitialized Subscripts, Up: Arrays +File: gawk.info, Node: Delete, Next: Multidimensional, Prev: Uninitialized Subscripts, Up: Arrays + +8.4 The `delete' Statement +========================== + +To remove an individual element of an array, use the `delete' statement: + + delete ARRAY[INDEX-EXPRESSION] + + Once an array element has been deleted, any value the element once +had is no longer available. It is as if the element had never been +referred to or been given a value. The following is an example of +deleting elements in an array: + + for (i in frequencies) + delete frequencies[i] + +This example removes all the elements from the array `frequencies'. +Once an element is deleted, a subsequent `for' statement to scan the +array does not report that element and the `in' operator to check for +the presence of that element returns zero (i.e., false): + + delete foo[4] + if (4 in foo) + print "This will never be printed" + + It is important to note that deleting an element is _not_ the same +as assigning it a null value (the empty string, `""'). For example: + + foo[4] = "" + if (4 in foo) + print "This is printed, even though foo[4] is empty" + + It is not an error to delete an element that does not exist. 
+However, if `--lint' is provided on the command line (*note Options::), +`gawk' issues a warning message when an element that is not in the +array is deleted. + + All the elements of an array may be deleted with a single statement +by leaving off the subscript in the `delete' statement, as follows: + + delete ARRAY + + Using this version of the `delete' statement is about three times +more efficient than the equivalent loop that deletes each element one +at a time. + + This form of the `delete' statement is also supported by BWK `awk' +and `mawk', as well as by a number of other implementations. + + NOTE: For many years, using `delete' without a subscript was a + common extension. In September, 2012, it was accepted for + inclusion into the POSIX standard. See the Austin Group website + (http://austingroupbugs.net/view.php?id=544). + + The following statement provides a portable but nonobvious way to +clear out an array:(1) + + split("", array) + + The `split()' function (*note String Functions::) clears out the +target array first. This call asks it to split apart the null string. +Because there is no data to split out, the function simply clears the +array and then returns. + + CAUTION: Deleting all the elements from an array does not change + its type; you cannot clear an array and then use the array's name + as a scalar (i.e., a regular variable). For example, the following + does not work: + + a[1] = 3 + delete a + a = 3 + + ---------- Footnotes ---------- + + (1) Thanks to Michael Brennan for pointing this out. + + +File: gawk.info, Node: Multidimensional, Next: Arrays of Arrays, Prev: Delete, Up: Arrays 8.5 Multidimensional Arrays =========================== @@ -11367,7 +11380,7 @@ File: gawk.info, Node: Multidimensional, Next: Arrays of Arrays, Prev: Uninit A multidimensional array is an array in which an element is identified by a sequence of indices instead of a single index. For example, a two-dimensional array requires two indices. 
The usual way -(in most languages, including `awk') to refer to an element of a +(in many languages, including `awk') to refer to an element of a two-dimensional array named `grid' is with `grid[X,Y]'. Multidimensional arrays are supported in `awk' through concatenation @@ -11508,8 +11521,9 @@ multidimensional subscript). So the following is valid in `gawk': Each subarray and the main array can be of different length. In fact, the elements of an array or its subarray do not all have to have the same type. This means that the main array and any of its subarrays -can be non-rectangular, or jagged in structure. One can assign a scalar -value to the index `4' of the main array `a': +can be non-rectangular, or jagged in structure. You can assign a scalar +value to the index `4' of the main array `a', even though `a[1]' is +itself an array and not a scalar: a[4] = "An element in a jagged array" @@ -11570,6 +11584,8 @@ an array element is itself an array: print array[i][j] } } + else + print array[i] } If the structure of a jagged array of arrays is known in advance, @@ -11785,8 +11801,9 @@ brackets ([ ]): user-defined function that can be used to obtain a random non-negative integer less than N: - function randint(n) { - return int(n * rand()) + function randint(n) + { + return int(n * rand()) } The multiplication produces a random number greater than zero and @@ -11803,8 +11820,7 @@ brackets ([ ]): # Roll 3 six-sided dice and # print total number of points. { - printf("%d points\n", - roll(6)+roll(6)+roll(6)) + printf("%d points\n", roll(6) + roll(6) + roll(6)) } CAUTION: In most `awk' implementations, including `gawk', @@ -11891,8 +11907,7 @@ with character indices, and not byte indices. In the following list, optional parameters are enclosed in square brackets ([ ]). Several functions perform string substitution; the full discussion is provided in the description of the `sub()' function, -which comes towards the end since the list is presented in alphabetic -order. 
+which comes towards the end since the list is presented alphabetically. Those functions that are specific to `gawk' are marked with a pound sign (`#'). They are not available in compatibility mode (*note @@ -11925,7 +11940,8 @@ Options::): When comparing strings, `IGNORECASE' affects the sorting (*note Array Sorting Functions::). If the SOURCE array contains subarrays as values (*note Arrays of Arrays::), they will come - last, after all scalar values. + last, after all scalar values. Subarrays are _not_ recursively + sorted. For example, if the contents of `a' are as follows: @@ -12028,7 +12044,10 @@ Options::): If FIND is not found, `index()' returns zero. - It is a fatal error to use a regexp constant for FIND. + With BWK `awk' and `gawk', it is a fatal error to use a regexp + constant for FIND. Other implementations allow it, simply + treating the regexp constant as an expression meaning `$0 ~ + /regexp/'. `length('[STRING]`)' Return the number of characters in STRING. If STRING is a number, @@ -12096,13 +12115,12 @@ Options::): For example: { - if ($1 == "FIND") - regex = $2 - else { - where = match($0, regex) - if (where != 0) - print "Match of", regex, "found at", - where, "in", $0 + if ($1 == "FIND") + regex = $2 + else { + where = match($0, regex) + if (where != 0) + print "Match of", regex, "found at", where, "in", $0 } } @@ -12171,7 +12189,7 @@ Options::): The `patsplit()' function splits strings into pieces in a manner similar to the way input lines are split into fields using `FPAT' - (*note Splitting By Content::. + (*note Splitting By Content::). Before splitting the string, `patsplit()' deletes any previously existing elements in the arrays ARRAY and SEPS. @@ -12182,15 +12200,14 @@ Options::): first piece is stored in `ARRAY[1]', the second piece in `ARRAY[2]', and so forth. 
The string value of the third argument, FIELDSEP, is a regexp describing where to split STRING (much as - `FS' can be a regexp describing where to split input records; - *note Regexp Field Splitting::). If FIELDSEP is omitted, the - value of `FS' is used. `split()' returns the number of elements - created. SEPS is a `gawk' extension with `SEPS[I]' being the - separator string between `ARRAY[I]' and `ARRAY[I+1]'. If FIELDSEP - is a single space then any leading whitespace goes into `SEPS[0]' - and any trailing whitespace goes into `SEPS[N]' where N is the - return value of `split()' (that is, the number of elements in - ARRAY). + `FS' can be a regexp describing where to split input records). If + FIELDSEP is omitted, the value of `FS' is used. `split()' returns + the number of elements created. SEPS is a `gawk' extension with + `SEPS[I]' being the separator string between `ARRAY[I]' and + `ARRAY[I+1]'. If FIELDSEP is a single space then any leading + whitespace goes into `SEPS[0]' and any trailing whitespace goes + into `SEPS[N]' where N is the return value of `split()' (that is, + the number of elements in ARRAY). The `split()' function splits strings into pieces in a manner similar to the way input lines are split into fields. For example: @@ -12396,6 +12413,17 @@ Options::): Nonalphabetic characters are left unchanged. For example, `toupper("MiXeD cAsE 123")' returns `"MIXED CASE 123"'. + Matching the Null String + + In `awk', the `*' operator can match the null string. This is +particularly important for the `sub()', `gsub()', and `gensub()' +functions. For example: + + $ echo abc | awk '{ gsub(/m*/, "X"); print }' + -| XaXbXcX + +Although this makes a certain amount of sense, it can be surprising. 
+ ---------- Footnotes ---------- (1) Unless you use the `--non-decimal-data' option, which isn't @@ -12415,8 +12443,8 @@ File: gawk.info, Node: Gory Details, Up: String Functions 9.1.3.1 More About `\' and `&' with `sub()', `gsub()', and `gensub()' ..................................................................... - CAUTION: This section has been known to cause headaches. You - might want to skip it upon first reading. + CAUTION: This subsubsection has been reported to cause headaches. + You might want to skip it upon first reading. When using `sub()', `gsub()', or `gensub()', and trying to get literal backslashes and ampersands into the replacement text, you need @@ -12550,17 +12578,6 @@ Table 9.4: Escape Sequence Processing For `gensub()' and the special cases for `sub()' and `gsub()', we recommend the use of `gawk' and `gensub()' when you have to do substitutions. - Matching the Null String - - In `awk', the `*' operator can match the null string. This is -particularly important for the `sub()', `gsub()', and `gensub()' -functions. For example: - - $ echo abc | awk '{ gsub(/m*/, "X"); print }' - -| XaXbXcX - -Although this makes a certain amount of sense, it can be surprising. - ---------- Footnotes ---------- (1) This was rather naive of him, despite there being a note in this @@ -12610,11 +12627,10 @@ parameters are enclosed in square brackets ([ ]): function--`gawk' also buffers its output and the `fflush()' function forces `gawk' to flush its buffers. - `fflush()' was added to BWK `awk' in April of 1992. For two - decades, it was not part of the POSIX standard. As of December, - 2012, it was accepted for inclusion into the POSIX standard. See - the Austin Group website - (http://austingroupbugs.net/view.php?id=634). + Brian Kernighan added `fflush()' to his `awk' in April of 1992. + For two decades, it was a common extension. In December, 2012, it + was accepted for inclusion into the POSIX standard. 
See the + Austin Group website (http://austingroupbugs.net/view.php?id=634). POSIX standardizes `fflush()' as follows: If there is no argument, or if the argument is the null string (`""'), then `awk' flushes @@ -12801,7 +12817,7 @@ enclosed in square brackets ([ ]): If DATESPEC does not contain enough elements or if the resulting time is out of range, `mktime()' returns -1. -`strftime(' [FORMAT [`,' TIMESTAMP [`,' UTC-FLAG] ] ]`)' +`strftime('[FORMAT [`,' TIMESTAMP [`,' UTC-FLAG] ] ]`)' Format the time specified by TIMESTAMP based on the contents of the FORMAT string and return the result. It is similar to the function of the same name in ISO C. If UTC-FLAG is present and is @@ -13016,7 +13032,7 @@ to the standard output and interprets the current time according to the format specifiers in the string. For example: $ date '+Today is %A, %B %d, %Y.' - -| Today is Monday, May 05, 2014. + -| Today is Monday, September 22, 2014. Here is the `gawk' version of the `date' utility. It has a shell "wrapper" to handle the `-u' option, which requires that `date' run as @@ -13105,12 +13121,13 @@ a given value. Finally, two other common operations are to shift the bits left or right. For example, if you have a bit string `10111001' and you shift -it right by three bits, you end up with `00010111'.(1) If you start over -again with `10111001' and shift it left by three bits, you end up with -`11001000'. `gawk' provides built-in functions that implement the -bitwise operations just described. They are: +it right by three bits, you end up with `00010111'.(1) If you start +over again with `10111001' and shift it left by three bits, you end up +with `11001000'. The following list describes `gawk''s built-in +functions that implement the bitwise operations. Optional parameters +are enclosed in square brackets ([ ]): -``and(V1, V2' [`,' ...]`)'' +``and('V1`,' V2 [`,' ...]`)'' Return the bitwise AND of the arguments. There must be at least two. 
@@ -13120,13 +13137,13 @@ bitwise operations just described. They are: ``lshift(VAL, COUNT)'' Return the value of VAL, shifted left by COUNT bits. -``or(V1, V2' [`,' ...]`)'' +``or('V1`,' V2 [`,' ...]`)'' Return the bitwise OR of the arguments. There must be at least two. ``rshift(VAL, COUNT)'' Return the value of VAL, shifted right by COUNT bits. -``xor(V1, V2' [`,' ...]`)'' +``xor('V1`,' V2 [`,' ...]`)'' Return the bitwise XOR of the arguments. There must be at least two. @@ -13211,7 +13228,7 @@ File: gawk.info, Node: Type Functions, Next: I18N Functions, Prev: Bitwise Fu `gawk' provides a single function that lets you distinguish an array from a scalar variable. This is necessary for writing code that -traverses every element of an array of arrays. (*note Arrays of +traverses every element of an array of arrays (*note Arrays of Arrays::). `isarray(X)' @@ -13223,12 +13240,12 @@ itself an array or not. The second is inside the body of a user-defined function (not discussed yet; *note User-defined::), to test if a parameter is an array or not. - Note, however, that using `isarray()' at the global level to test -variables makes no sense. Since you are the one writing the program, you -are supposed to know if your variables are arrays or not. And in fact, -due to the way `gawk' works, if you pass the name of a variable that -has not been previously used to `isarray()', `gawk' will end up turning -it into a scalar. + NOTE: Using `isarray()' at the global level to test variables + makes no sense. Since you are the one writing the program, you are + supposed to know if your variables are arrays or not. And in fact, + due to the way `gawk' works, if you pass the name of a variable + that has not been previously used to `isarray()', `gawk' ends up + turning it into a scalar. 
File: gawk.info, Node: I18N Functions, Prev: Type Functions, Up: Built-in @@ -13439,7 +13456,7 @@ extra whitespace signifies the start of the local variable list): function delarray(a, i) { for (i in a) - delete a[i] + delete a[i] } When working with arrays, it is often necessary to delete all the @@ -13447,8 +13464,8 @@ elements in an array and start over with a new list of elements (*note Delete::). Instead of having to repeat this loop everywhere that you need to clear out an array, your program can just call `delarray'. (This guarantees portability. The use of `delete ARRAY' to delete the -contents of an entire array is a recent(1) addition to the POSIX -standard.) +contents of an entire array is a relatively recent(1) addition to the +POSIX standard.) The following is an example of a recursive function. It takes a string as an input parameter and returns the string in backwards order. @@ -13471,7 +13488,7 @@ way: > gawk -e '{ print rev($0) }' -f rev.awk -| !cinaP t'noD - The C `ctime()' function takes a timestamp and returns it in a + The C `ctime()' function takes a timestamp and returns it as a string, formatted in a well-known fashion. The following example uses the built-in `strftime()' function (*note Time Functions::) to create an `awk' version of `ctime()': @@ -13482,12 +13499,18 @@ an `awk' version of `ctime()': function ctime(ts, format) { - format = PROCINFO["strftime"] + format = "%a %b %e %H:%M:%S %Z %Y" + if (ts == 0) ts = systime() # use current time as default return strftime(format, ts) } + You might think that `ctime()' could use `PROCINFO["strftime"]' for +its format string. That would be a mistake, since `ctime()' is supposed +to return the time formatted in a standard fashion, and user-level code +could have changed `PROCINFO["strftime"]'. + ---------- Footnotes ---------- (1) Late in 2012. @@ -14029,7 +14052,7 @@ mechanism allows you to sort arbitrary data in an arbitrary fashion. 
# quicksort_swap --- helper function for quicksort, should really be inline - function quicksort_swap(data, i, j, temp) + function quicksort_swap(data, i, j, temp) { temp = data[i] data[i] = data[j] @@ -14164,11 +14187,12 @@ File: gawk.info, Node: Functions Summary, Prev: Indirect Calls, Up: Functions functions. * POSIX `awk' provides three kinds of built-in functions: numeric, - string, and I/O. `gawk' provides functions that work with values - representing time, do bit manipulation, sort arrays, and - internationalize and localize programs. `gawk' also provides - several extensions to some of standard functions, typically in the - form of additional arguments. + string, and I/O. `gawk' provides functions that sort arrays, work + with values representing time, do bit manipulation, determine + variable type (array vs. scalar), and internationalize and + localize programs. `gawk' also provides several extensions to + some of standard functions, typically in the form of additional + arguments. * Functions accept zero or more arguments and return a value. The expressions that provide the argument values are completely @@ -14353,8 +14377,9 @@ program, leading to bugs that are very difficult to track down: function lib_func(x, y, l1, l2) { ... - USE VARIABLE some_var # some_var should be local - ... # but is not by oversight + # some_var should be local but by oversight is not + USE VARIABLE some_var + ... } A different convention, common in the Tcl community, is to use a @@ -14462,7 +14487,7 @@ versions of `awk': # a[5] = "123.45" # a[6] = "1.e3" # a[7] = "1.32" - # a[7] = "1.32E2" + # a[8] = "1.32E2" # # for (i = 1; i in a; i++) # print a[i], strtonum(a[i]), mystrtonum(a[i]) @@ -14471,9 +14496,11 @@ versions of `awk': The function first looks for C-style octal numbers (base 8). If the input string matches a regular expression describing octal numbers, then `mystrtonum()' loops through each character in the string. 
It -sets `k' to the index in `"01234567"' of the current octal digit. -Since the return value is one-based, the `k--' adjusts `k' so it can be -used in computing the return value. +sets `k' to the index in `"1234567"' of the current octal digit. The +return value will either be the same number as the digit, or zero if +the character is not there, which will be true for a `0'. This is +safe, since the regexp test in the `if' ensures that only octal values +are converted. Similar logic applies to the code that checks for and converts a hexadecimal value, which starts with `0x' or `0X'. The use of @@ -14499,7 +14526,7 @@ condition or set of conditions is true. Before proceeding with a particular computation, you make a statement about what you believe to be the case. Such a statement is known as an "assertion". The C language provides an `<assert.h>' header file and corresponding -`assert()' macro that the programmer can use to make assertions. If an +`assert()' macro that a programmer can use to make assertions. If an assertion fails, the `assert()' macro arranges to print a diagnostic message describing the condition that should have been true but was not, and then it kills the program. In C, using `assert()' looks this: @@ -14839,7 +14866,7 @@ current time formatted in the same way as the `date' utility: now = systime() # return date(1)-style output - ret = strftime(PROCINFO["strftime"], now) + ret = strftime("%a %b %e %H:%M:%S %Z %Y", now) # clear out target array delete time @@ -14935,6 +14962,9 @@ string. Thus calling code may use something like: This tests the result to see if it is empty or not. An equivalent test would be `contents == ""'. + *Note Extension Sample Readfile::, for an extension function that +also reads an entire file into memory. 
+ File: gawk.info, Node: Data File Management, Next: Getopt Function, Prev: General Functions, Up: Library Functions @@ -14984,15 +15014,14 @@ does so _portably_; this works with any implementation of `awk': # that each take the name of the file being started or # finished, respectively. - FILENAME != _oldfilename \ - { + FILENAME != _oldfilename { if (_oldfilename != "") endfile(_oldfilename) _oldfilename = FILENAME beginfile(FILENAME) } - END { endfile(FILENAME) } + END { endfile(FILENAME) } This file must be loaded before the user's "main" program, so that the rule it supplies is executed first. @@ -15030,7 +15059,7 @@ solves the problem: beginfile(FILENAME) } - END { endfile(_filename_) } + END { endfile(_filename_) } *note Wc Program::, shows how this library function can be used and how it simplifies writing the main program. @@ -30997,7 +31026,7 @@ Index * Menu: -* ! (exclamation point), ! operator: Boolean Ops. (line 67) +* ! (exclamation point), ! operator: Boolean Ops. (line 69) * ! (exclamation point), ! operator <1>: Egrep Program. (line 175) * ! (exclamation point), ! operator <2>: Ranges. (line 48) * ! (exclamation point), ! operator: Precedence. (line 52) @@ -31027,7 +31056,7 @@ Index * % (percent sign), %= operator <1>: Precedence. (line 95) * % (percent sign), %= operator: Assignment Ops. (line 130) * & (ampersand), && operator <1>: Precedence. (line 86) -* & (ampersand), && operator: Boolean Ops. (line 57) +* & (ampersand), && operator: Boolean Ops. (line 59) * & (ampersand), gsub()/gensub()/sub() functions and: Gory Details. (line 6) * ' (single quote): One-shot. (line 15) @@ -31041,8 +31070,8 @@ Index (line 55) * * (asterisk), * operator, as regexp operator: Regexp Operators. (line 89) -* * (asterisk), * operator, null strings, matching: Gory Details. - (line 143) +* * (asterisk), * operator, null strings, matching: String Functions. + (line 535) * * (asterisk), ** operator <1>: Precedence. (line 49) * * (asterisk), ** operator: Arithmetic Ops. 
(line 81) * * (asterisk), **= operator <1>: Precedence. (line 95) @@ -31101,7 +31130,7 @@ Index * --re-interval option: Options. (line 279) * --sandbox option: Options. (line 286) * --sandbox option, disabling system() function: I/O Functions. - (line 97) + (line 96) * --sandbox option, input redirection with getline: Getline. (line 19) * --sandbox option, output redirection with print, printf: Redirection. (line 6) @@ -31299,12 +31328,12 @@ Index * ambiguity, syntactic: /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148) * ampersand (&), && operator <1>: Precedence. (line 86) -* ampersand (&), && operator: Boolean Ops. (line 57) +* ampersand (&), && operator: Boolean Ops. (line 59) * ampersand (&), gsub()/gensub()/sub() functions and: Gory Details. (line 6) * anagram.awk program: Anagram Program. (line 22) * anagrams, finding: Anagram Program. (line 6) -* and: Bitwise Functions. (line 39) +* and: Bitwise Functions. (line 40) * AND bitwise operation: Bitwise Functions. (line 6) * and Boolean-logic operator: Boolean Ops. (line 6) * ANSI: Glossary. (line 34) @@ -31338,7 +31367,7 @@ Index (line 6) * array scanning order, controlling: Controlling Scanning. (line 14) -* array, number of elements: String Functions. (line 197) +* array, number of elements: String Functions. (line 200) * arrays: Arrays. (line 6) * arrays of arrays: Arrays of Arrays. (line 6) * arrays, an example of using: Array Example. (line 6) @@ -31346,7 +31375,7 @@ Index * arrays, as parameters to functions: Pass By Value/Reference. (line 47) * arrays, associative: Array Intro. (line 50) -* arrays, associative, library functions and: Library Names. (line 57) +* arrays, associative, library functions and: Library Names. (line 58) * arrays, deleting entire contents: Delete. (line 39) * arrays, elements that don't exist: Reference to Elements. (line 23) @@ -31354,13 +31383,12 @@ Index * arrays, elements, deleting: Delete. 
(line 6) * arrays, elements, order of access by in operator: Scanning an Array. (line 48) -* arrays, elements, retrieving number of: String Functions. (line 42) +* arrays, elements, retrieving number of: String Functions. (line 41) * arrays, for statement and: Scanning an Array. (line 20) * arrays, indexing: Array Intro. (line 50) * arrays, merging into strings: Join Function. (line 6) * arrays, multidimensional: Multidimensional. (line 10) * arrays, multidimensional, scanning: Multiscanning. (line 11) -* arrays, names of, and names of functions/variables: Arrays. (line 18) * arrays, numeric subscripts: Numeric Array Subscripts. (line 6) * arrays, referencing elements: Reference to Elements. @@ -31381,12 +31409,12 @@ Index * ASCII: Ordinal Functions. (line 45) * asort <1>: Array Sorting Functions. (line 6) -* asort: String Functions. (line 42) +* asort: String Functions. (line 41) * asort() function (gawk), arrays, sorting: Array Sorting Functions. (line 6) * asorti <1>: Array Sorting Functions. (line 6) -* asorti: String Functions. (line 42) +* asorti: String Functions. (line 41) * asorti() function (gawk), arrays, sorting: Array Sorting Functions. (line 6) * assert() function (C library): Assert Function. (line 6) @@ -31403,8 +31431,8 @@ Index (line 55) * asterisk (*), * operator, as regexp operator: Regexp Operators. (line 89) -* asterisk (*), * operator, null strings, matching: Gory Details. - (line 143) +* asterisk (*), * operator, null strings, matching: String Functions. + (line 535) * asterisk (*), ** operator <1>: Precedence. (line 49) * asterisk (*), ** operator: Arithmetic Ops. (line 81) * asterisk (*), **= operator <1>: Precedence. (line 95) @@ -31451,7 +31479,7 @@ Index * awk, POSIX and: Preface. (line 21) * awk, POSIX and, See Also POSIX awk: Preface. (line 21) * awk, regexp constants and: Comparison Operators. - (line 102) + (line 103) * awk, See Also gawk: Preface. (line 34) * awk, terms describing: This Manual. (line 6) * awk, uses for <1>: When. 
(line 6) @@ -31537,7 +31565,7 @@ Index * BEGIN pattern, next/nextfile statements and <1>: Next Statement. (line 44) * BEGIN pattern, next/nextfile statements and: I/O And BEGIN/END. - (line 36) + (line 37) * BEGIN pattern, OFS/ORS variables, assigning values to: Output Separators. (line 20) * BEGIN pattern, operators and: Using BEGIN/END. (line 17) @@ -31548,7 +31576,7 @@ Index * BEGINFILE pattern: BEGINFILE/ENDFILE. (line 6) * BEGINFILE pattern, Boolean patterns and: Expression Patterns. (line 70) -* beginfile() user-defined function: Filetrans Function. (line 62) +* beginfile() user-defined function: Filetrans Function. (line 61) * Bentley, Jon: Glossary. (line 143) * Benzinger, Michael: Contributors. (line 97) * Berry, Karl <1>: Ranges and Locales. (line 74) @@ -31562,11 +31590,11 @@ Index * BINMODE variable <1>: PC Using. (line 33) * BINMODE variable: User-modified. (line 15) * bit-manipulation functions: Bitwise Functions. (line 6) -* bits2str() user-defined function: Bitwise Functions. (line 70) -* bitwise AND: Bitwise Functions. (line 39) -* bitwise complement: Bitwise Functions. (line 43) -* bitwise OR: Bitwise Functions. (line 49) -* bitwise XOR: Bitwise Functions. (line 55) +* bits2str() user-defined function: Bitwise Functions. (line 71) +* bitwise AND: Bitwise Functions. (line 40) +* bitwise complement: Bitwise Functions. (line 44) +* bitwise OR: Bitwise Functions. (line 50) +* bitwise XOR: Bitwise Functions. (line 56) * bitwise, complement: Bitwise Functions. (line 25) * bitwise, operations: Bitwise Functions. (line 6) * bitwise, shift: Bitwise Functions. (line 32) @@ -31610,8 +31638,8 @@ Index * Brennan, Michael: Foreword. (line 83) * Brian Kernighan's awk <1>: I/O Functions. (line 43) * Brian Kernighan's awk <2>: Gory Details. (line 19) -* Brian Kernighan's awk <3>: String Functions. (line 490) -* Brian Kernighan's awk <4>: Delete. (line 48) +* Brian Kernighan's awk <3>: String Functions. (line 491) +* Brian Kernighan's awk <4>: Delete. 
(line 51) * Brian Kernighan's awk <5>: Nextfile Statement. (line 47) * Brian Kernighan's awk <6>: Continue Statement. (line 44) * Brian Kernighan's awk <7>: Break Statement. (line 51) @@ -31636,8 +31664,8 @@ Index * Buening, Andreas <2>: Contributors. (line 92) * Buening, Andreas: Acknowledgments. (line 60) * buffering, input/output <1>: Two-way I/O. (line 52) -* buffering, input/output: I/O Functions. (line 140) -* buffering, interactive vs. noninteractive: I/O Functions. (line 109) +* buffering, input/output: I/O Functions. (line 139) +* buffering, interactive vs. noninteractive: I/O Functions. (line 108) * buffers, flushing: I/O Functions. (line 32) * buffers, operators for: GNU Regexp Operators. (line 48) @@ -31667,7 +31695,7 @@ Index * case sensitivity, and regexps: User-modified. (line 76) * case sensitivity, and string comparisons: User-modified. (line 76) * case sensitivity, array indices and: Array Intro. (line 94) -* case sensitivity, converting case: String Functions. (line 520) +* case sensitivity, converting case: String Functions. (line 521) * case sensitivity, example programs: Library Functions. (line 53) * case sensitivity, gawk: Case-sensitivity. (line 26) * case sensitivity, regexps and: Case-sensitivity. (line 6) @@ -31747,7 +31775,7 @@ Index * common extensions, delete to delete entire arrays: Delete. (line 39) * common extensions, func keyword: Definition Syntax. (line 93) * common extensions, length() applied to an array: String Functions. - (line 197) + (line 200) * common extensions, RS as a regexp: gawk split records. (line 6) * common extensions, single character fields: Single Character Fields. (line 6) @@ -31756,7 +31784,7 @@ Index (line 9) * comparison expressions, as patterns: Expression Patterns. (line 14) * comparison expressions, string vs. regexp: Comparison Operators. - (line 78) + (line 79) * compatibility mode (gawk), extensions: POSIX/GNU. (line 6) * compatibility mode (gawk), file names: Special Caveats. 
(line 9) * compatibility mode (gawk), hexadecimal numbers: Nondecimal-numbers. @@ -31770,7 +31798,7 @@ Index * compiling gawk for MS-DOS and MS-Windows: PC Compiling. (line 13) * compiling gawk for VMS: VMS Compilation. (line 6) * compiling gawk with EMX for OS/2: PC Compiling. (line 28) -* compl: Bitwise Functions. (line 43) +* compl: Bitwise Functions. (line 44) * complement, bitwise: Bitwise Functions. (line 25) * compound statements, control statements and: Statements. (line 10) * concatenating: Concatenation. (line 8) @@ -31796,15 +31824,15 @@ Index * control statements: Statements. (line 6) * controlling array scanning order: Controlling Scanning. (line 14) -* convert string to lower case: String Functions. (line 521) -* convert string to number: String Functions. (line 388) -* convert string to upper case: String Functions. (line 527) +* convert string to lower case: String Functions. (line 522) +* convert string to number: String Functions. (line 389) +* convert string to upper case: String Functions. (line 528) * converting integer array subscripts: Numeric Array Subscripts. (line 31) * converting, dates to timestamps: Time Functions. (line 76) -* converting, numbers to strings <1>: Bitwise Functions. (line 109) +* converting, numbers to strings <1>: Bitwise Functions. (line 110) * converting, numbers to strings: Strings And Numbers. (line 6) -* converting, strings to numbers <1>: Bitwise Functions. (line 109) +* converting, strings to numbers <1>: Bitwise Functions. (line 110) * converting, strings to numbers: Strings And Numbers. (line 6) * CONVFMT variable <1>: User-modified. (line 30) * CONVFMT variable: Strings And Numbers. (line 29) @@ -31863,7 +31891,7 @@ Index (line 20) * dark corner, input files: awk split records. (line 111) * dark corner, invoking awk: Command Line. (line 16) -* dark corner, length() function: String Functions. (line 183) +* dark corner, length() function: String Functions. 
(line 186) * dark corner, locale's decimal point character: Locale influences conversions. (line 17) * dark corner, multiline records: Multiple Line. (line 35) @@ -31875,7 +31903,7 @@ Index (line 148) * dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps. (line 43) -* dark corner, split() function: String Functions. (line 359) +* dark corner, split() function: String Functions. (line 360) * dark corner, strings, storing: gawk split records. (line 83) * dark corner, value of ARGV[0]: Auto-set. (line 39) * data, fixed-width: Constant Size. (line 10) @@ -32021,7 +32049,7 @@ Index * deleting entire arrays: Delete. (line 39) * Demaille, Akim: Acknowledgments. (line 60) * describe call stack frame, in debugger: Debugger Info. (line 27) -* differences between gawk and awk: String Functions. (line 197) +* differences between gawk and awk: String Functions. (line 200) * differences in awk and gawk, ARGC/ARGV variables: ARGC and ARGV. (line 90) * differences in awk and gawk, ARGIND variable: Auto-set. (line 44) @@ -32068,7 +32096,7 @@ Index (line 34) * differences in awk and gawk, LINT variable: User-modified. (line 88) * differences in awk and gawk, match() function: String Functions. - (line 260) + (line 262) * differences in awk and gawk, print/printf statements: Format Modifiers. (line 13) * differences in awk and gawk, PROCINFO array: Auto-set. (line 129) @@ -32085,13 +32113,13 @@ Index * differences in awk and gawk, single-character fields: Single Character Fields. (line 6) * differences in awk and gawk, split() function: String Functions. - (line 347) + (line 348) * differences in awk and gawk, strings: Scalar Constants. (line 20) * differences in awk and gawk, strings, storing: gawk split records. (line 77) * differences in awk and gawk, SYMTAB variable: Auto-set. (line 268) * differences in awk and gawk, TEXTDOMAIN variable: User-modified. 
- (line 152) + (line 151) * differences in awk and gawk, trunc-mod operation: Arithmetic Ops. (line 66) * directories, command-line: Command-line directories. @@ -32162,12 +32190,12 @@ Index * END pattern, next/nextfile statements and <1>: Next Statement. (line 44) * END pattern, next/nextfile statements and: I/O And BEGIN/END. - (line 36) + (line 37) * END pattern, operators and: Using BEGIN/END. (line 17) * END pattern, print statement and: I/O And BEGIN/END. (line 16) * ENDFILE pattern: BEGINFILE/ENDFILE. (line 6) * ENDFILE pattern, Boolean patterns and: Expression Patterns. (line 70) -* endfile() user-defined function: Filetrans Function. (line 62) +* endfile() user-defined function: Filetrans Function. (line 61) * endgrent() function (C library): Group Functions. (line 212) * endgrent() user-defined function: Group Functions. (line 215) * endpwent() function (C library): Passwd Functions. (line 210) @@ -32205,7 +32233,7 @@ Index * examining fields: Fields. (line 6) * exclamation point (!), ! operator <1>: Egrep Program. (line 175) * exclamation point (!), ! operator <2>: Precedence. (line 52) -* exclamation point (!), ! operator: Boolean Ops. (line 67) +* exclamation point (!), ! operator: Boolean Ops. (line 69) * exclamation point (!), != operator <1>: Precedence. (line 65) * exclamation point (!), != operator: Comparison Operators. (line 11) @@ -32265,7 +32293,7 @@ Index * extensions, common, fflush() function: I/O Functions. (line 43) * extensions, common, func keyword: Definition Syntax. (line 93) * extensions, common, length() applied to an array: String Functions. - (line 197) + (line 200) * extensions, common, RS as a regexp: gawk split records. (line 6) * extensions, common, single character fields: Single Character Fields. (line 6) @@ -32387,7 +32415,7 @@ Index * Fish, Fred: Contributors. (line 50) * fixed-width data: Constant Size. (line 10) * flag variables <1>: Tee Program. (line 20) -* flag variables: Boolean Ops. 
(line 67) +* flag variables: Boolean Ops. (line 69) * floating-point, numbers, arbitrary precision: Arbitrary Precision Arithmetic. (line 6) * floating-point, VAX/VMS: VMS Running. (line 51) @@ -32410,7 +32438,7 @@ Index * format time string: Time Functions. (line 48) * formats, numeric output: OFMT. (line 6) * formatting output: Printf. (line 6) -* formatting strings: String Functions. (line 381) +* formatting strings: String Functions. (line 382) * forward slash (/) to enclose regular expressions: Regexp. (line 10) * forward slash (/), / operator: Precedence. (line 55) * forward slash (/), /= operator <1>: Precedence. (line 95) @@ -32460,7 +32488,7 @@ Index * functions, defining: Definition Syntax. (line 9) * functions, library: Library Functions. (line 6) * functions, library, assertions: Assert Function. (line 6) -* functions, library, associative arrays and: Library Names. (line 57) +* functions, library, associative arrays and: Library Names. (line 58) * functions, library, C library: Getopt Function. (line 6) * functions, library, character values as numbers: Ordinal Functions. (line 6) @@ -32480,8 +32508,7 @@ Index * functions, library, rounding numbers: Round Function. (line 6) * functions, library, user database, reading: Passwd Functions. (line 6) -* functions, names of <1>: Definition Syntax. (line 23) -* functions, names of: Arrays. (line 18) +* functions, names of: Definition Syntax. (line 23) * functions, recursive: Definition Syntax. (line 83) * functions, string-translation: I18N Functions. (line 6) * functions, undefined: Pass By Value/Reference. @@ -32501,15 +32528,13 @@ Index * gawk, ARGIND variable in: Other Arguments. (line 15) * gawk, awk and <1>: This Manual. (line 14) * gawk, awk and: Preface. (line 21) -* gawk, bitwise operations in: Bitwise Functions. (line 39) +* gawk, bitwise operations in: Bitwise Functions. (line 40) * gawk, break statement in: Break Statement. (line 51) * gawk, built-in variables and: Built-in Variables. 
(line 14) * gawk, character classes and: Bracket Expressions. (line 100) * gawk, coding style in: Adding Code. (line 39) * gawk, command-line options, and regular expressions: GNU Regexp Operators. (line 70) -* gawk, comparison operators and: Comparison Operators. - (line 50) * gawk, configuring: Configuration Philosophy. (line 6) * gawk, configuring, options: Additional Configuration Options. @@ -32540,7 +32565,7 @@ Index * gawk, hexadecimal numbers and: Nondecimal-numbers. (line 42) * gawk, IGNORECASE variable in <1>: Array Sorting Functions. (line 83) -* gawk, IGNORECASE variable in <2>: String Functions. (line 58) +* gawk, IGNORECASE variable in <2>: String Functions. (line 57) * gawk, IGNORECASE variable in <3>: Array Intro. (line 94) * gawk, IGNORECASE variable in <4>: User-modified. (line 76) * gawk, IGNORECASE variable in: Case-sensitivity. (line 26) @@ -32582,7 +32607,7 @@ Index * gawk, splitting fields and: Constant Size. (line 88) * gawk, string-translation functions: I18N Functions. (line 6) * gawk, SYMTAB array in: Auto-set. (line 268) -* gawk, TEXTDOMAIN variable in: User-modified. (line 152) +* gawk, TEXTDOMAIN variable in: User-modified. (line 151) * gawk, timestamps: Time Functions. (line 6) * gawk, uses for: Preface. (line 34) * gawk, versions of, information about, printing: Options. (line 300) @@ -32672,7 +32697,7 @@ Index * gsub <1>: String Functions. (line 139) * gsub: Using Constant Regexps. (line 43) -* gsub() function, arguments of: String Functions. (line 460) +* gsub() function, arguments of: String Functions. (line 461) * gsub() function, escape processing: Gory Details. (line 6) * h debugger command (alias for help): Miscellaneous Debugger Commands. (line 66) @@ -32720,7 +32745,7 @@ Index * implementation issues, gawk, debugging: Compatibility Mode. (line 6) * implementation issues, gawk, limits <1>: Redirection. (line 129) * implementation issues, gawk, limits: Getline Notes. (line 14) -* in operator <1>: For Statement. 
(line 75) +* in operator <1>: For Statement. (line 76) * in operator <2>: Precedence. (line 83) * in operator: Comparison Operators. (line 11) @@ -32751,7 +32776,7 @@ Index * input files, running awk without: Read Terminal. (line 6) * input files, variable assignments and: Other Arguments. (line 26) * input pipeline: Getline/Pipe. (line 9) -* input record, length of: String Functions. (line 174) +* input record, length of: String Functions. (line 177) * input redirection: Getline/File. (line 6) * input, data, nondecimal: Nondecimal Data. (line 6) * input, explicit: Getline. (line 6) @@ -32775,12 +32800,12 @@ Index * integers, arbitrary precision: Arbitrary Precision Integers. (line 6) * integers, unsigned: Computer Arithmetic. (line 41) -* interacting with other programs: I/O Functions. (line 75) +* interacting with other programs: I/O Functions. (line 74) * internationalization <1>: I18N and L10N. (line 6) * internationalization: I18N Functions. (line 6) * internationalization, localization <1>: Internationalization. (line 13) -* internationalization, localization: User-modified. (line 152) +* internationalization, localization: User-modified. (line 151) * internationalization, localization, character classes: Bracket Expressions. (line 100) * internationalization, localization, gawk and: Internationalization. @@ -32796,7 +32821,7 @@ Index * interpreted programs: Basic High Level. (line 15) * interval expressions, regexp operator: Regexp Operators. (line 116) * inventory-shipped file: Sample Data Files. (line 32) -* invoke shell command: I/O Functions. (line 75) +* invoke shell command: I/O Functions. (line 74) * isarray: Type Functions. (line 11) * ISO: Glossary. (line 367) * ISO 8859-1: Glossary. (line 133) @@ -32849,19 +32874,19 @@ Index * left angle bracket (<), <= operator <1>: Precedence. (line 65) * left angle bracket (<), <= operator: Comparison Operators. (line 11) -* left shift: Bitwise Functions. (line 46) +* left shift: Bitwise Functions. 
(line 47) * left shift, bitwise: Bitwise Functions. (line 32) * leftmost longest match: Multiple Line. (line 26) -* length: String Functions. (line 167) -* length of input record: String Functions. (line 174) -* length of string: String Functions. (line 167) +* length: String Functions. (line 170) +* length of input record: String Functions. (line 177) +* length of string: String Functions. (line 170) * Lesser General Public License (LGPL): Glossary. (line 396) * LGPL (Lesser General Public License): Glossary. (line 396) * libmawk: Other Versions. (line 121) * libraries of awk functions: Library Functions. (line 6) * libraries of awk functions, assertions: Assert Function. (line 6) * libraries of awk functions, associative arrays and: Library Names. - (line 57) + (line 58) * libraries of awk functions, character values as numbers: Ordinal Functions. (line 6) * libraries of awk functions, command-line options: Getopt Function. @@ -32881,7 +32906,7 @@ Index * libraries of awk functions, user database, reading: Passwd Functions. (line 6) * line breaks: Statements/Lines. (line 6) -* line continuations: Boolean Ops. (line 62) +* line continuations: Boolean Ops. (line 64) * line continuations, gawk: Conditional Exp. (line 34) * line continuations, in print statement: Print Examples. (line 76) * line continuations, with C shell: More Complex. (line 30) @@ -32927,7 +32952,7 @@ Index * long options: Command Line. (line 13) * loops: While Statement. (line 6) * loops, break statement and: Break Statement. (line 6) -* loops, continue statements and: For Statement. (line 64) +* loops, continue statements and: For Statement. (line 65) * loops, count for header, in a profile: Profiling. (line 131) * loops, do-while: Do Statement. (line 6) * loops, exiting: Break Statement. (line 6) @@ -32936,7 +32961,7 @@ Index * loops, See Also while statement: While Statement. (line 6) * loops, while: While Statement. (line 6) * ls utility: More Complex. (line 15) -* lshift: Bitwise Functions. 
(line 46) +* lshift: Bitwise Functions. (line 47) * lvalues/rvalues: Assignment Ops. (line 32) * mail-list file: Sample Data Files. (line 6) * mailing labels, printing: Labels Program. (line 6) @@ -32948,14 +32973,14 @@ Index (line 6) * marked strings, extracting: String Extraction. (line 6) * Marx, Groucho: Increment Ops. (line 60) -* match: String Functions. (line 207) -* match regexp in string: String Functions. (line 207) +* match: String Functions. (line 210) +* match regexp in string: String Functions. (line 210) * match() function, RSTART/RLENGTH variables: String Functions. - (line 224) + (line 227) * matching, expressions, See comparison expressions: Typing and Comparison. (line 9) * matching, leftmost longest: Multiple Line. (line 26) -* matching, null strings: Gory Details. (line 143) +* matching, null strings: String Functions. (line 535) * mawk utility <1>: Other Versions. (line 44) * mawk utility <2>: Nextfile Statement. (line 47) * mawk utility <3>: Concatenation. (line 36) @@ -32985,17 +33010,15 @@ Index * multiple-line records: Multiple Line. (line 6) * n debugger command (alias for next): Debugger Execution Control. (line 43) -* names, arrays/variables <1>: Library Names. (line 6) -* names, arrays/variables: Arrays. (line 18) +* names, arrays/variables: Library Names. (line 6) * names, functions <1>: Library Names. (line 6) * names, functions: Definition Syntax. (line 23) -* namespace issues <1>: Library Names. (line 6) -* namespace issues: Arrays. (line 18) +* namespace issues: Library Names. (line 6) * namespace issues, functions: Definition Syntax. (line 23) * NetBSD: Glossary. (line 611) * networks, programming: TCP/IP Networking. (line 6) * networks, support for: Special Network. (line 6) -* newlines <1>: Boolean Ops. (line 67) +* newlines <1>: Boolean Ops. (line 69) * newlines <2>: Options. (line 260) * newlines: Statements/Lines. (line 6) * newlines, as field separators: Default Field Splitting. 
@@ -33011,14 +33034,14 @@ Index (line 43) * next file statement: Feature History. (line 169) * next statement <1>: Next Statement. (line 6) -* next statement: Boolean Ops. (line 93) -* next statement, BEGIN/END patterns and: I/O And BEGIN/END. (line 36) +* next statement: Boolean Ops. (line 95) +* next statement, BEGIN/END patterns and: I/O And BEGIN/END. (line 37) * next statement, BEGINFILE/ENDFILE patterns and: BEGINFILE/ENDFILE. (line 49) * next statement, user-defined functions and: Next Statement. (line 44) * nextfile statement: Nextfile Statement. (line 6) * nextfile statement, BEGIN/END patterns and: I/O And BEGIN/END. - (line 36) + (line 37) * nextfile statement, BEGINFILE/ENDFILE patterns and: BEGINFILE/ENDFILE. (line 26) * nextfile statement, user-defined functions and: Nextfile Statement. @@ -33048,9 +33071,9 @@ Index (line 43) * null strings, converting numbers to strings: Strings And Numbers. (line 21) -* null strings, matching: Gory Details. (line 143) -* number as string of bits: Bitwise Functions. (line 109) -* number of array elements: String Functions. (line 197) +* null strings, matching: String Functions. (line 535) +* number as string of bits: Bitwise Functions. (line 110) +* number of array elements: String Functions. (line 200) * number sign (#), #! (executable scripts): Executable Scripts. (line 6) * number sign (#), commenting: Comments. (line 6) @@ -33059,7 +33082,7 @@ Index * numbers, as values of characters: Ordinal Functions. (line 6) * numbers, Cliff random: Cliff Random Function. (line 6) -* numbers, converting <1>: Bitwise Functions. (line 109) +* numbers, converting <1>: Bitwise Functions. (line 110) * numbers, converting: Strings And Numbers. (line 6) * numbers, converting, to strings: User-modified. (line 30) * numbers, hexadecimal: Nondecimal-numbers. (line 6) @@ -33077,7 +33100,7 @@ Index * OFMT variable <2>: Strings And Numbers. (line 57) * OFMT variable: OFMT. (line 15) * OFMT variable, POSIX awk and: OFMT. 
(line 27) -* OFS variable <1>: User-modified. (line 114) +* OFS variable <1>: User-modified. (line 113) * OFS variable <2>: Output Separators. (line 6) * OFS variable: Changing Fields. (line 64) * OpenBSD: Glossary. (line 611) @@ -33107,7 +33130,7 @@ Index * operators, precedence: Increment Ops. (line 60) * operators, relational, See operators, comparison: Typing and Comparison. (line 9) -* operators, short-circuit: Boolean Ops. (line 57) +* operators, short-circuit: Boolean Ops. (line 59) * operators, string: Concatenation. (line 8) * operators, string-matching: Regexp Usage. (line 19) * operators, string-matching, for buffers: GNU Regexp Operators. @@ -33123,14 +33146,14 @@ Index * options, long <1>: Options. (line 6) * options, long: Command Line. (line 13) * options, printing list of: Options. (line 154) -* or: Bitwise Functions. (line 49) +* or: Bitwise Functions. (line 50) * OR bitwise operation: Bitwise Functions. (line 6) * or Boolean-logic operator: Boolean Ops. (line 6) * ord() extension function: Extension Sample Ord. (line 12) * ord() user-defined function: Ordinal Functions. (line 16) * order of evaluation, concatenation: Concatenation. (line 41) -* ORS variable <1>: User-modified. (line 119) +* ORS variable <1>: User-modified. (line 118) * ORS variable: Output Separators. (line 20) * output field separator, See OFS variable: Changing Fields. (line 64) * output record separator, See ORS variable: Output Separators. @@ -33154,7 +33177,7 @@ Index * parentheses (), in a profile: Profiling. (line 146) * parentheses (), regexp operator: Regexp Operators. (line 81) * password file: Passwd Functions. (line 16) -* patsplit: String Functions. (line 294) +* patsplit: String Functions. (line 296) * patterns: Patterns and Actions. (line 6) * patterns, comparison expressions as: Expression Patterns. (line 14) @@ -33210,7 +33233,7 @@ Index * portability, gawk: New Ports. (line 6) * portability, gettext library and: Explaining gettext. 
(line 11) * portability, internationalization and: I18N Portability. (line 6) -* portability, length() function: String Functions. (line 176) +* portability, length() function: String Functions. (line 179) * portability, new awk vs. old awk: Strings And Numbers. (line 57) * portability, next statement in user-defined functions: Pass By Value/Reference. (line 91) @@ -33218,7 +33241,7 @@ Index * portability, operators: Increment Ops. (line 60) * portability, operators, not in POSIX awk: Precedence. (line 98) * portability, POSIXLY_CORRECT environment variable: Options. (line 359) -* portability, substr() function: String Functions. (line 510) +* portability, substr() function: String Functions. (line 511) * portable object files <1>: Translator i18n. (line 6) * portable object files: Explaining gettext. (line 37) * portable object files, converting to message object files: I18N Example. @@ -33254,7 +33277,7 @@ Index * POSIX awk, FS variable and: User-modified. (line 60) * POSIX awk, function keyword in: Definition Syntax. (line 93) * POSIX awk, functions and, gsub()/sub(): Gory Details. (line 90) -* POSIX awk, functions and, length(): String Functions. (line 176) +* POSIX awk, functions and, length(): String Functions. (line 179) * POSIX awk, GNU long options and: Options. (line 15) * POSIX awk, interval expressions in: Regexp Operators. (line 135) * POSIX awk, next/nextfile statements and: Next Statement. (line 44) @@ -33271,7 +33294,7 @@ Index * POSIX, gawk extensions not included in: POSIX/GNU. (line 6) * POSIX, programs, implementing in awk: Clones. (line 6) * POSIXLY_CORRECT environment variable: Options. (line 339) -* PREC variable: User-modified. (line 124) +* PREC variable: User-modified. (line 123) * precedence <1>: Precedence. (line 6) * precedence: Increment Ops. (line 60) * precedence, regexp operators: Regexp Operators. (line 156) @@ -33282,7 +33305,7 @@ Index * print statement, commas, omitting: Print Examples. 
(line 31) * print statement, I/O operators in: Precedence. (line 71) * print statement, line continuations and: Print Examples. (line 76) -* print statement, OFMT variable and: User-modified. (line 114) +* print statement, OFMT variable and: User-modified. (line 113) * print statement, See Also redirection, of output: Redirection. (line 17) * print statement, sprintf() function and: Round Function. (line 6) @@ -33398,7 +33421,7 @@ Index * readfile() user-defined function: Readfile Function. (line 30) * reading input files: Reading Files. (line 6) * recipe for a programming language: History. (line 6) -* record separators <1>: User-modified. (line 133) +* record separators <1>: User-modified. (line 132) * record separators: awk split records. (line 6) * record separators, changing: awk split records. (line 85) * record separators, regular expressions as: awk split records. @@ -33419,7 +33442,7 @@ Index (line 77) * regexp: Regexp. (line 6) * regexp constants <1>: Comparison Operators. - (line 102) + (line 103) * regexp constants <2>: Regexp Constants. (line 6) * regexp constants: Regexp Usage. (line 57) * regexp constants, /=.../, /= operator and: Assignment Ops. (line 148) @@ -33465,7 +33488,7 @@ Index * regular expressions, searching for: Egrep Program. (line 6) * relational operators, See comparison operators: Typing and Comparison. (line 9) -* replace in string: String Functions. (line 406) +* replace in string: String Functions. (line 407) * return debugger command: Debugger Execution Control. (line 54) * return statement, user-defined functions: Return Statement. (line 6) @@ -33486,11 +33509,11 @@ Index (line 11) * right angle bracket (>), >> operator (I/O) <1>: Precedence. (line 65) * right angle bracket (>), >> operator (I/O): Redirection. (line 50) -* right shift: Bitwise Functions. (line 52) +* right shift: Bitwise Functions. (line 53) * right shift, bitwise: Bitwise Functions. (line 32) * Ritchie, Dennis: Basic Data Typing. 
(line 54) * RLENGTH variable: Auto-set. (line 251) -* RLENGTH variable, match() function and: String Functions. (line 224) +* RLENGTH variable, match() function and: String Functions. (line 227) * Robbins, Arnold <1>: Future Extensions. (line 6) * Robbins, Arnold <2>: Bugs. (line 32) * Robbins, Arnold <3>: Contributors. (line 141) @@ -33510,13 +33533,13 @@ Index * round to nearest integer: Numeric Functions. (line 23) * round() user-defined function: Round Function. (line 16) * rounding numbers: Round Function. (line 6) -* ROUNDMODE variable: User-modified. (line 128) -* RS variable <1>: User-modified. (line 133) +* ROUNDMODE variable: User-modified. (line 127) +* RS variable <1>: User-modified. (line 132) * RS variable: awk split records. (line 12) * RS variable, multiline records and: Multiple Line. (line 17) -* rshift: Bitwise Functions. (line 52) +* rshift: Bitwise Functions. (line 53) * RSTART variable: Auto-set. (line 257) -* RSTART variable, match() function and: String Functions. (line 224) +* RSTART variable, match() function and: String Functions. (line 227) * RT variable <1>: Auto-set. (line 264) * RT variable <2>: Multiple Line. (line 129) * RT variable: awk split records. (line 125) @@ -33569,12 +33592,12 @@ Index * separators, field, FIELDWIDTHS variable and: User-modified. (line 37) * separators, field, FPAT variable and: User-modified. (line 43) * separators, field, POSIX and: Fields. (line 6) -* separators, for records <1>: User-modified. (line 133) +* separators, for records <1>: User-modified. (line 132) * separators, for records: awk split records. (line 6) * separators, for records, regular expressions as: awk split records. (line 125) * separators, for statements in actions: Action Overview. (line 19) -* separators, subscript: User-modified. (line 146) +* separators, subscript: User-modified. (line 145) * set breakpoint: Breakpoint Control. (line 11) * set debugger command: Viewing And Changing Data. 
(line 59) @@ -33592,7 +33615,7 @@ Index * shells, variables: Using Shell Variables. (line 6) * shift, bitwise: Bitwise Functions. (line 32) -* short-circuit operators: Boolean Ops. (line 57) +* short-circuit operators: Boolean Ops. (line 59) * show all source files, in debugger: Debugger Info. (line 45) * show breakpoints: Debugger Info. (line 21) * show function arguments, in debugger: Debugger Info. (line 18) @@ -33623,14 +33646,14 @@ Index (line 38) * sidebar, Changing NR and FNR: Auto-set. (line 306) * sidebar, Controlling Output Buffering with system(): I/O Functions. - (line 138) + (line 137) * sidebar, Escape Sequences for Metacharacters: Escape Sequences. (line 134) * sidebar, FS and IGNORECASE: Field Splitting Summary. (line 64) * sidebar, Interactive Versus Noninteractive Buffering: I/O Functions. - (line 107) -* sidebar, Matching the Null String: Gory Details. (line 141) + (line 106) +* sidebar, Matching the Null String: String Functions. (line 533) * sidebar, Operator Evaluation Order: Increment Ops. (line 58) * sidebar, Piping into sh: Redirection. (line 134) * sidebar, Pre-POSIX awk Used OFMT For String Conversion: Strings And Numbers. @@ -33638,7 +33661,7 @@ Index * sidebar, Recipe For A Programming Language: History. (line 6) * sidebar, RS = "\0" Is Not Portable: gawk split records. (line 63) * sidebar, So Why Does gawk have BEGINFILE and ENDFILE?: Filetrans Function. - (line 83) + (line 82) * sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops. (line 146) * sidebar, Understanding #!: Executable Scripts. (line 31) @@ -33674,8 +33697,8 @@ Index * sleep() extension function: Extension Sample Time. (line 22) * Solaris, POSIX-compliant awk: Other Versions. (line 96) -* sort array: String Functions. (line 42) -* sort array indices: String Functions. (line 42) +* sort array: String Functions. (line 41) +* sort array indices: String Functions. (line 41) * sort function, arrays, sorting: Array Sorting Functions. 
(line 6) * sort utility: Word Sorting. (line 50) @@ -33699,14 +33722,14 @@ Index * source files, search path for: Programs Exercises. (line 70) * sparse arrays: Array Intro. (line 72) * Spencer, Henry: Glossary. (line 11) -* split: String Functions. (line 313) -* split string into array: String Functions. (line 294) +* split: String Functions. (line 315) +* split string into array: String Functions. (line 296) * split utility: Split Program. (line 6) * split() function, array elements, deleting: Delete. (line 61) * split.awk program: Split Program. (line 30) -* sprintf <1>: String Functions. (line 381) +* sprintf <1>: String Functions. (line 382) * sprintf: OFMT. (line 15) -* sprintf() function, OFMT variable and: User-modified. (line 114) +* sprintf() function, OFMT variable and: User-modified. (line 113) * sprintf() function, print/printf statements and: Round Function. (line 6) * sqrt: Numeric Functions. (line 79) @@ -33742,16 +33765,16 @@ Index * string constants, vs. regexp constants: Computed Regexps. (line 39) * string extraction (internationalization): String Extraction. (line 6) -* string length: String Functions. (line 167) +* string length: String Functions. (line 170) * string operators: Concatenation. (line 8) -* string, regular expression match: String Functions. (line 207) +* string, regular expression match: String Functions. (line 210) * string-manipulation functions: String Functions. (line 6) * string-matching operators: Regexp Usage. (line 19) * string-translation functions: I18N Functions. (line 6) -* strings splitting, example: String Functions. (line 333) -* strings, converting <1>: Bitwise Functions. (line 109) +* strings splitting, example: String Functions. (line 334) +* strings, converting <1>: Bitwise Functions. (line 110) * strings, converting: Strings And Numbers. (line 6) -* strings, converting letter case: String Functions. (line 520) +* strings, converting letter case: String Functions. 
(line 521) * strings, converting, numbers to: User-modified. (line 30) * strings, empty, See null strings: awk split records. (line 115) * strings, extracting: String Extraction. (line 6) @@ -33761,15 +33784,15 @@ Index * strings, null: Regexp Field Splitting. (line 43) * strings, numeric: Variable Typing. (line 6) -* strtonum: String Functions. (line 388) +* strtonum: String Functions. (line 389) * strtonum() function (gawk), --non-decimal-data option and: Nondecimal Data. (line 36) -* sub <1>: String Functions. (line 406) +* sub <1>: String Functions. (line 407) * sub: Using Constant Regexps. (line 43) -* sub() function, arguments of: String Functions. (line 460) +* sub() function, arguments of: String Functions. (line 461) * sub() function, escape processing: Gory Details. (line 6) -* subscript separators: User-modified. (line 146) +* subscript separators: User-modified. (line 145) * subscripts in arrays, multidimensional: Multidimensional. (line 10) * subscripts in arrays, multidimensional, scanning: Multiscanning. (line 11) @@ -33777,30 +33800,30 @@ Index (line 6) * subscripts in arrays, uninitialized variables as: Uninitialized Subscripts. (line 6) -* SUBSEP variable: User-modified. (line 146) +* SUBSEP variable: User-modified. (line 145) * SUBSEP variable, and multidimensional arrays: Multidimensional. (line 16) * substitute in string: String Functions. (line 89) -* substr: String Functions. (line 479) -* substring: String Functions. (line 479) +* substr: String Functions. (line 480) +* substring: String Functions. (line 480) * Sumner, Andrew: Other Versions. (line 64) * supplementary groups of gawk process: Auto-set. (line 236) * switch statement: Switch Statement. (line 6) * SYMTAB array: Auto-set. (line 268) * syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148) -* system: I/O Functions. (line 75) +* system: I/O Functions. (line 74) * systime: Time Functions. 
(line 66) * t debugger command (alias for tbreak): Breakpoint Control. (line 90) * tbreak debugger command: Breakpoint Control. (line 90) -* Tcl: Library Names. (line 57) +* Tcl: Library Names. (line 58) * TCP/IP: TCP/IP Networking. (line 6) * TCP/IP, support for: Special Network. (line 6) * tee utility: Tee Program. (line 6) * tee.awk program: Tee Program. (line 26) * temporary breakpoint: Breakpoint Control. (line 90) * terminating records: awk split records. (line 125) -* testbits.awk program: Bitwise Functions. (line 70) +* testbits.awk program: Bitwise Functions. (line 71) * testext extension: Extension Sample API Tests. (line 6) * Texinfo <1>: Adding Code. (line 100) @@ -33816,7 +33839,7 @@ Index * text, printing: Print. (line 22) * text, printing, unduplicated lines of: Uniq Program. (line 6) * TEXTDOMAIN variable <1>: Programmer i18n. (line 9) -* TEXTDOMAIN variable: User-modified. (line 152) +* TEXTDOMAIN variable: User-modified. (line 151) * TEXTDOMAIN variable, BEGIN pattern and: Programmer i18n. (line 60) * TEXTDOMAIN variable, portability and: I18N Portability. (line 20) * textdomain() function (C library): Explaining gettext. (line 28) @@ -33839,8 +33862,8 @@ Index * timestamps, converting dates to: Time Functions. (line 76) * timestamps, formatted: Getlocaltime Function. (line 6) -* tolower: String Functions. (line 521) -* toupper: String Functions. (line 527) +* tolower: String Functions. (line 522) +* toupper: String Functions. (line 528) * tr utility: Translate Program. (line 6) * trace debugger command: Miscellaneous Debugger Commands. (line 108) @@ -33859,15 +33882,15 @@ Index (line 23) * troubleshooting, fatal errors, printf format strings: Format Modifiers. (line 158) -* troubleshooting, fflush() function: I/O Functions. (line 63) +* troubleshooting, fflush() function: I/O Functions. (line 62) * troubleshooting, function call syntax: Function Calls. (line 30) * troubleshooting, gawk: Compatibility Mode. 
(line 6) * troubleshooting, gawk, bug reports: Bugs. (line 9) * troubleshooting, gawk, fatal errors, function arguments: Calling Built-in. (line 16) * troubleshooting, getline function: File Checking. (line 25) -* troubleshooting, gsub()/sub() functions: String Functions. (line 470) -* troubleshooting, match() function: String Functions. (line 289) +* troubleshooting, gsub()/sub() functions: String Functions. (line 471) +* troubleshooting, match() function: String Functions. (line 291) * troubleshooting, print statement, omitting commas: Print Examples. (line 31) * troubleshooting, printing: Redirection. (line 112) @@ -33876,8 +33899,8 @@ Index * troubleshooting, regexp constants vs. string constants: Computed Regexps. (line 39) * troubleshooting, string concatenation: Concatenation. (line 26) -* troubleshooting, substr() function: String Functions. (line 497) -* troubleshooting, system() function: I/O Functions. (line 97) +* troubleshooting, substr() function: String Functions. (line 498) +* troubleshooting, system() function: I/O Functions. (line 96) * troubleshooting, typographical errors, global variables: Options. (line 98) * true, logical: Truth Values. (line 6) @@ -33942,7 +33965,7 @@ Index * variables, built-in: Using Variables. (line 23) * variables, built-in, -v option, setting with: Options. (line 40) * variables, built-in, conveying information: Auto-set. (line 6) -* variables, flag: Boolean Ops. (line 67) +* variables, flag: Boolean Ops. (line 69) * variables, getline command into, using <1>: Getline/Variable/Coprocess. (line 6) * variables, getline command into, using <2>: Getline/Variable/Pipe. @@ -33954,7 +33977,6 @@ Index * variables, global, printing list of: Options. (line 93) * variables, initializing: Using Variables. (line 23) * variables, local to a function: Variable Scope. (line 6) -* variables, names of: Arrays. (line 18) * variables, private: Library Names. (line 11) * variables, setting: Options. 
(line 32) * variables, shadowing: Definition Syntax. (line 71) @@ -33975,7 +33997,7 @@ Index * vertical bar (|), |& operator (I/O) <2>: Precedence. (line 65) * vertical bar (|), |& operator (I/O): Getline/Coprocess. (line 6) * vertical bar (|), || operator <1>: Precedence. (line 89) -* vertical bar (|), || operator: Boolean Ops. (line 57) +* vertical bar (|), || operator: Boolean Ops. (line 59) * Vinschen, Corinna: Acknowledgments. (line 60) * w debugger command (alias for watch): Viewing And Changing Data. (line 67) @@ -34018,7 +34040,7 @@ Index * writea() extension function: Extension Sample Read write array. (line 9) * xgettext utility: String Extraction. (line 13) -* xor: Bitwise Functions. (line 55) +* xor: Bitwise Functions. (line 56) * XOR bitwise operation: Bitwise Functions. (line 6) * Yawitz, Efraim: Contributors. (line 131) * Zaretskii, Eli <1>: Bugs. (line 71) @@ -34040,7 +34062,7 @@ Index * | (vertical bar), |& operator (I/O), pipes, closing: Close Files And Pipes. (line 120) * | (vertical bar), || operator <1>: Precedence. (line 89) -* | (vertical bar), || operator: Boolean Ops. (line 57) +* | (vertical bar), || operator: Boolean Ops. (line 59) * ~ (tilde), ~ operator <1>: Expression Patterns. (line 24) * ~ (tilde), ~ operator <2>: Precedence. (line 80) * ~ (tilde), ~ operator <3>: Comparison Operators. 
@@ -34194,413 +34216,413 @@ Ref: Scalar Constants-Footnote-1315734 Node: Nondecimal-numbers315984 Node: Regexp Constants318984 Node: Using Constant Regexps319509 -Node: Variables322581 -Node: Using Variables323236 -Node: Assignment Options325142 -Node: Conversion327017 -Node: Strings And Numbers327541 -Ref: Strings And Numbers-Footnote-1330603 -Node: Locale influences conversions330712 -Ref: table-locale-affects333429 -Node: All Operators334017 -Node: Arithmetic Ops334647 -Node: Concatenation337152 -Ref: Concatenation-Footnote-1339971 -Node: Assignment Ops340077 -Ref: table-assign-ops345060 -Node: Increment Ops346363 -Node: Truth Values and Conditions349801 -Node: Truth Values350884 -Node: Typing and Comparison351933 -Node: Variable Typing352726 -Node: Comparison Operators356378 -Ref: table-relational-ops356788 -Node: POSIX String Comparison360337 -Ref: POSIX String Comparison-Footnote-1361421 -Node: Boolean Ops361559 -Ref: Boolean Ops-Footnote-1365898 -Node: Conditional Exp365989 -Node: Function Calls367716 -Node: Precedence371596 -Node: Locales375264 -Node: Expressions Summary376895 -Node: Patterns and Actions379436 -Node: Pattern Overview380552 -Node: Regexp Patterns382229 -Node: Expression Patterns382772 -Node: Ranges386552 -Node: BEGIN/END389658 -Node: Using BEGIN/END390420 -Ref: Using BEGIN/END-Footnote-1393156 -Node: I/O And BEGIN/END393262 -Node: BEGINFILE/ENDFILE395533 -Node: Empty398464 -Node: Using Shell Variables398781 -Node: Action Overview401064 -Node: Statements403391 -Node: If Statement405239 -Node: While Statement406737 -Node: Do Statement408781 -Node: For Statement409937 -Node: Switch Statement413089 -Node: Break Statement415477 -Node: Continue Statement417518 -Node: Next Statement419343 -Node: Nextfile Statement421713 -Node: Exit Statement424370 -Node: Built-in Variables426774 -Node: User-modified427901 -Ref: User-modified-Footnote-1435590 -Node: Auto-set435652 -Ref: Auto-set-Footnote-1448504 -Ref: Auto-set-Footnote-2448709 -Node: ARGC and 
ARGV448765 -Node: Pattern Action Summary452669 -Node: Arrays454892 -Node: Array Basics456441 -Node: Array Intro457267 -Ref: figure-array-elements459240 -Ref: Array Intro-Footnote-1461764 -Node: Reference to Elements461892 -Node: Assigning Elements464342 -Node: Array Example464833 -Node: Scanning an Array466565 -Node: Controlling Scanning469566 -Ref: Controlling Scanning-Footnote-1474739 -Node: Delete475055 -Ref: Delete-Footnote-1477806 -Node: Numeric Array Subscripts477863 -Node: Uninitialized Subscripts480046 -Node: Multidimensional481673 -Node: Multiscanning484786 -Node: Arrays of Arrays486375 -Node: Arrays Summary491038 -Node: Functions493143 -Node: Built-in494016 -Node: Calling Built-in495094 -Node: Numeric Functions497082 -Ref: Numeric Functions-Footnote-1501116 -Ref: Numeric Functions-Footnote-2501473 -Ref: Numeric Functions-Footnote-3501521 -Node: String Functions501790 -Ref: String Functions-Footnote-1524787 -Ref: String Functions-Footnote-2524916 -Ref: String Functions-Footnote-3525164 -Node: Gory Details525251 -Ref: table-sub-escapes527024 -Ref: table-sub-proposed528544 -Ref: table-posix-sub529908 -Ref: table-gensub-escapes531448 -Ref: Gory Details-Footnote-1532624 -Node: I/O Functions532775 -Ref: I/O Functions-Footnote-1539885 -Node: Time Functions540032 -Ref: Time Functions-Footnote-1550496 -Ref: Time Functions-Footnote-2550564 -Ref: Time Functions-Footnote-3550722 -Ref: Time Functions-Footnote-4550833 -Ref: Time Functions-Footnote-5550945 -Ref: Time Functions-Footnote-6551172 -Node: Bitwise Functions551438 -Ref: table-bitwise-ops552000 -Ref: Bitwise Functions-Footnote-1556245 -Node: Type Functions556414 -Node: I18N Functions557556 -Node: User-defined559201 -Node: Definition Syntax560005 -Ref: Definition Syntax-Footnote-1565409 -Node: Function Example565478 -Ref: Function Example-Footnote-1568118 -Node: Function Caveats568140 -Node: Calling A Function568658 -Node: Variable Scope569613 -Node: Pass By Value/Reference572601 -Node: Return Statement576111 
-Node: Dynamic Typing579095 -Node: Indirect Calls580024 -Ref: Indirect Calls-Footnote-1589740 -Node: Functions Summary589868 -Node: Library Functions592518 -Ref: Library Functions-Footnote-1596136 -Ref: Library Functions-Footnote-2596279 -Node: Library Names596450 -Ref: Library Names-Footnote-1599923 -Ref: Library Names-Footnote-2600143 -Node: General Functions600229 -Node: Strtonum Function601257 -Node: Assert Function604159 -Node: Round Function607485 -Node: Cliff Random Function609026 -Node: Ordinal Functions610042 -Ref: Ordinal Functions-Footnote-1613107 -Ref: Ordinal Functions-Footnote-2613359 -Node: Join Function613570 -Ref: Join Function-Footnote-1615341 -Node: Getlocaltime Function615541 -Node: Readfile Function619277 -Node: Data File Management621116 -Node: Filetrans Function621748 -Node: Rewind Function625817 -Node: File Checking627375 -Ref: File Checking-Footnote-1628507 -Node: Empty Files628708 -Node: Ignoring Assigns630687 -Node: Getopt Function632241 -Ref: Getopt Function-Footnote-1643505 -Node: Passwd Functions643708 -Ref: Passwd Functions-Footnote-1652687 -Node: Group Functions652775 -Ref: Group Functions-Footnote-1660706 -Node: Walking Arrays660919 -Node: Library Functions Summary662522 -Node: Library Exercises663910 -Node: Sample Programs665190 -Node: Running Examples665960 -Node: Clones666688 -Node: Cut Program667912 -Node: Egrep Program677770 -Ref: Egrep Program-Footnote-1685357 -Node: Id Program685467 -Node: Split Program689121 -Ref: Split Program-Footnote-1692659 -Node: Tee Program692787 -Node: Uniq Program695574 -Node: Wc Program702997 -Ref: Wc Program-Footnote-1707262 -Node: Miscellaneous Programs707354 -Node: Dupword Program708567 -Node: Alarm Program710598 -Node: Translate Program715402 -Ref: Translate Program-Footnote-1719975 -Ref: Translate Program-Footnote-2720245 -Node: Labels Program720384 -Ref: Labels Program-Footnote-1723745 -Node: Word Sorting723829 -Node: History Sorting727872 -Node: Extract Program729708 -Node: Simple Sed737244 
-Node: Igawk Program740306 -Ref: Igawk Program-Footnote-1754610 -Ref: Igawk Program-Footnote-2754811 -Node: Anagram Program754933 -Node: Signature Program758001 -Node: Programs Summary759248 -Node: Programs Exercises760463 -Ref: Programs Exercises-Footnote-1764594 -Node: Advanced Features764685 -Node: Nondecimal Data766633 -Node: Array Sorting768210 -Node: Controlling Array Traversal768907 -Node: Array Sorting Functions777187 -Ref: Array Sorting Functions-Footnote-1781079 -Node: Two-way I/O781273 -Ref: Two-way I/O-Footnote-1786217 -Ref: Two-way I/O-Footnote-2786396 -Node: TCP/IP Networking786478 -Node: Profiling789320 -Node: Advanced Features Summary796862 -Node: Internationalization798723 -Node: I18N and L10N800203 -Node: Explaining gettext800889 -Ref: Explaining gettext-Footnote-1805915 -Ref: Explaining gettext-Footnote-2806099 -Node: Programmer i18n806264 -Ref: Programmer i18n-Footnote-1811058 -Node: Translator i18n811107 -Node: String Extraction811901 -Ref: String Extraction-Footnote-1813034 -Node: Printf Ordering813120 -Ref: Printf Ordering-Footnote-1815902 -Node: I18N Portability815966 -Ref: I18N Portability-Footnote-1818415 -Node: I18N Example818478 -Ref: I18N Example-Footnote-1821184 -Node: Gawk I18N821256 -Node: I18N Summary821894 -Node: Debugger823233 -Node: Debugging824255 -Node: Debugging Concepts824696 -Node: Debugging Terms826552 -Node: Awk Debugging829149 -Node: Sample Debugging Session830041 -Node: Debugger Invocation830561 -Node: Finding The Bug831897 -Node: List of Debugger Commands838376 -Node: Breakpoint Control839708 -Node: Debugger Execution Control843372 -Node: Viewing And Changing Data846732 -Node: Execution Stack850090 -Node: Debugger Info851603 -Node: Miscellaneous Debugger Commands855597 -Node: Readline Support860781 -Node: Limitations861673 -Node: Debugging Summary863946 -Node: Arbitrary Precision Arithmetic865114 -Node: Computer Arithmetic866601 -Ref: Computer Arithmetic-Footnote-1870988 -Node: Math Definitions871045 -Ref: 
table-ieee-formats874334 -Ref: Math Definitions-Footnote-1874874 -Node: MPFR features874977 -Node: FP Math Caution876594 -Ref: FP Math Caution-Footnote-1877644 -Node: Inexactness of computations878013 -Node: Inexact representation878961 -Node: Comparing FP Values880316 -Node: Errors accumulate881280 -Node: Getting Accuracy882713 -Node: Try To Round885372 -Node: Setting precision886271 -Ref: table-predefined-precision-strings886953 -Node: Setting the rounding mode888746 -Ref: table-gawk-rounding-modes889110 -Ref: Setting the rounding mode-Footnote-1892564 -Node: Arbitrary Precision Integers892743 -Ref: Arbitrary Precision Integers-Footnote-1895724 -Node: POSIX Floating Point Problems895873 -Ref: POSIX Floating Point Problems-Footnote-1899749 -Node: Floating point summary899787 -Node: Dynamic Extensions901991 -Node: Extension Intro903543 -Node: Plugin License904808 -Node: Extension Mechanism Outline905493 -Ref: figure-load-extension905917 -Ref: figure-load-new-function907402 -Ref: figure-call-new-function908404 -Node: Extension API Description910388 -Node: Extension API Functions Introduction911838 -Node: General Data Types916705 -Ref: General Data Types-Footnote-1922398 -Node: Requesting Values922697 -Ref: table-value-types-returned923434 -Node: Memory Allocation Functions924392 -Ref: Memory Allocation Functions-Footnote-1927139 -Node: Constructor Functions927235 -Node: Registration Functions928993 -Node: Extension Functions929678 -Node: Exit Callback Functions931980 -Node: Extension Version String933228 -Node: Input Parsers933878 -Node: Output Wrappers943692 -Node: Two-way processors948208 -Node: Printing Messages950412 -Ref: Printing Messages-Footnote-1951489 -Node: Updating `ERRNO'951641 -Node: Accessing Parameters952380 -Node: Symbol Table Access953610 -Node: Symbol table by name954124 -Node: Symbol table by cookie956100 -Ref: Symbol table by cookie-Footnote-1960233 -Node: Cached values960296 -Ref: Cached values-Footnote-1963800 -Node: Array Manipulation963891 
-Ref: Array Manipulation-Footnote-1964989 -Node: Array Data Types965028 -Ref: Array Data Types-Footnote-1967731 -Node: Array Functions967823 -Node: Flattening Arrays971697 -Node: Creating Arrays978549 -Node: Extension API Variables983280 -Node: Extension Versioning983916 -Node: Extension API Informational Variables985817 -Node: Extension API Boilerplate986903 -Node: Finding Extensions990707 -Node: Extension Example991267 -Node: Internal File Description991997 -Node: Internal File Ops996088 -Ref: Internal File Ops-Footnote-11007520 -Node: Using Internal File Ops1007660 -Ref: Using Internal File Ops-Footnote-11010007 -Node: Extension Samples1010275 -Node: Extension Sample File Functions1011799 -Node: Extension Sample Fnmatch1019367 -Node: Extension Sample Fork1020849 -Node: Extension Sample Inplace1022062 -Node: Extension Sample Ord1023737 -Node: Extension Sample Readdir1024573 -Ref: table-readdir-file-types1025429 -Node: Extension Sample Revout1026228 -Node: Extension Sample Rev2way1026819 -Node: Extension Sample Read write array1027560 -Node: Extension Sample Readfile1029439 -Node: Extension Sample API Tests1030539 -Node: Extension Sample Time1031064 -Node: gawkextlib1032379 -Node: Extension summary1035192 -Node: Extension Exercises1038885 -Node: Language History1039607 -Node: V7/SVR3.11041250 -Node: SVR41043570 -Node: POSIX1045012 -Node: BTL1046398 -Node: POSIX/GNU1047132 -Node: Feature History1052848 -Node: Common Extensions1065939 -Node: Ranges and Locales1067251 -Ref: Ranges and Locales-Footnote-11071868 -Ref: Ranges and Locales-Footnote-21071895 -Ref: Ranges and Locales-Footnote-31072129 -Node: Contributors1072350 -Node: History summary1077775 -Node: Installation1079144 -Node: Gawk Distribution1080095 -Node: Getting1080579 -Node: Extracting1081403 -Node: Distribution contents1083045 -Node: Unix Installation1088762 -Node: Quick Installation1089379 -Node: Additional Configuration Options1091821 -Node: Configuration Philosophy1093559 -Node: Non-Unix 
Installation1095910 -Node: PC Installation1096368 -Node: PC Binary Installation1097679 -Node: PC Compiling1099527 -Ref: PC Compiling-Footnote-11102526 -Node: PC Testing1102631 -Node: PC Using1103807 -Node: Cygwin1107959 -Node: MSYS1108768 -Node: VMS Installation1109266 -Node: VMS Compilation1110062 -Ref: VMS Compilation-Footnote-11111284 -Node: VMS Dynamic Extensions1111342 -Node: VMS Installation Details1112715 -Node: VMS Running1114967 -Node: VMS GNV1117801 -Node: VMS Old Gawk1118524 -Node: Bugs1118994 -Node: Other Versions1122998 -Node: Installation summary1129222 -Node: Notes1130278 -Node: Compatibility Mode1131143 -Node: Additions1131925 -Node: Accessing The Source1132850 -Node: Adding Code1134286 -Node: New Ports1140464 -Node: Derived Files1144945 -Ref: Derived Files-Footnote-11150420 -Ref: Derived Files-Footnote-21150454 -Ref: Derived Files-Footnote-31151050 -Node: Future Extensions1151164 -Node: Implementation Limitations1151770 -Node: Extension Design1153018 -Node: Old Extension Problems1154172 -Ref: Old Extension Problems-Footnote-11155689 -Node: Extension New Mechanism Goals1155746 -Ref: Extension New Mechanism Goals-Footnote-11159106 -Node: Extension Other Design Decisions1159295 -Node: Extension Future Growth1161401 -Node: Old Extension Mechanism1162237 -Node: Notes summary1163999 -Node: Basic Concepts1165185 -Node: Basic High Level1165866 -Ref: figure-general-flow1166138 -Ref: figure-process-flow1166737 -Ref: Basic High Level-Footnote-11169966 -Node: Basic Data Typing1170151 -Node: Glossary1173479 -Node: Copying1198631 -Node: GNU Free Documentation License1236187 -Node: Index1261323 +Node: Variables322647 +Node: Using Variables323302 +Node: Assignment Options325206 +Node: Conversion327081 +Node: Strings And Numbers327605 +Ref: Strings And Numbers-Footnote-1330667 +Node: Locale influences conversions330776 +Ref: table-locale-affects333491 +Node: All Operators334079 +Node: Arithmetic Ops334709 +Node: Concatenation337214 +Ref: 
Concatenation-Footnote-1340033 +Node: Assignment Ops340139 +Ref: table-assign-ops345122 +Node: Increment Ops346400 +Node: Truth Values and Conditions349838 +Node: Truth Values350921 +Node: Typing and Comparison351970 +Node: Variable Typing352763 +Node: Comparison Operators356415 +Ref: table-relational-ops356825 +Node: POSIX String Comparison360340 +Ref: POSIX String Comparison-Footnote-1361412 +Node: Boolean Ops361550 +Ref: Boolean Ops-Footnote-1366029 +Node: Conditional Exp366120 +Node: Function Calls367847 +Node: Precedence371727 +Node: Locales375395 +Node: Expressions Summary377026 +Node: Patterns and Actions379600 +Node: Pattern Overview380716 +Node: Regexp Patterns382395 +Node: Expression Patterns382938 +Node: Ranges386718 +Node: BEGIN/END389824 +Node: Using BEGIN/END390586 +Ref: Using BEGIN/END-Footnote-1393323 +Node: I/O And BEGIN/END393429 +Node: BEGINFILE/ENDFILE395743 +Node: Empty398644 +Node: Using Shell Variables398961 +Node: Action Overview401237 +Node: Statements403564 +Node: If Statement405412 +Node: While Statement406910 +Node: Do Statement408938 +Node: For Statement410080 +Node: Switch Statement413235 +Node: Break Statement415623 +Node: Continue Statement417664 +Node: Next Statement419489 +Node: Nextfile Statement421869 +Node: Exit Statement424499 +Node: Built-in Variables426902 +Node: User-modified428029 +Ref: User-modified-Footnote-1435709 +Node: Auto-set435771 +Ref: Auto-set-Footnote-1448628 +Ref: Auto-set-Footnote-2448833 +Node: ARGC and ARGV448889 +Node: Pattern Action Summary453093 +Node: Arrays455512 +Node: Array Basics456841 +Node: Array Intro457685 +Ref: figure-array-elements459658 +Ref: Array Intro-Footnote-1462182 +Node: Reference to Elements462310 +Node: Assigning Elements464760 +Node: Array Example465251 +Node: Scanning an Array467009 +Node: Controlling Scanning470025 +Ref: Controlling Scanning-Footnote-1475214 +Node: Numeric Array Subscripts475530 +Node: Uninitialized Subscripts477713 +Node: Delete479330 +Ref: Delete-Footnote-1482074 
+Node: Multidimensional482131 +Node: Multiscanning485226 +Node: Arrays of Arrays486815 +Node: Arrays Summary491576 +Node: Functions493681 +Node: Built-in494554 +Node: Calling Built-in495632 +Node: Numeric Functions497620 +Ref: Numeric Functions-Footnote-1501642 +Ref: Numeric Functions-Footnote-2501999 +Ref: Numeric Functions-Footnote-3502047 +Node: String Functions502316 +Ref: String Functions-Footnote-1525776 +Ref: String Functions-Footnote-2525905 +Ref: String Functions-Footnote-3526153 +Node: Gory Details526240 +Ref: table-sub-escapes528021 +Ref: table-sub-proposed529541 +Ref: table-posix-sub530905 +Ref: table-gensub-escapes532445 +Ref: Gory Details-Footnote-1533277 +Node: I/O Functions533428 +Ref: I/O Functions-Footnote-1540529 +Node: Time Functions540676 +Ref: Time Functions-Footnote-1551145 +Ref: Time Functions-Footnote-2551213 +Ref: Time Functions-Footnote-3551371 +Ref: Time Functions-Footnote-4551482 +Ref: Time Functions-Footnote-5551594 +Ref: Time Functions-Footnote-6551821 +Node: Bitwise Functions552087 +Ref: table-bitwise-ops552649 +Ref: Bitwise Functions-Footnote-1556957 +Node: Type Functions557126 +Node: I18N Functions558275 +Node: User-defined559920 +Node: Definition Syntax560724 +Ref: Definition Syntax-Footnote-1566128 +Node: Function Example566197 +Ref: Function Example-Footnote-1569114 +Node: Function Caveats569136 +Node: Calling A Function569654 +Node: Variable Scope570609 +Node: Pass By Value/Reference573597 +Node: Return Statement577107 +Node: Dynamic Typing580091 +Node: Indirect Calls581020 +Ref: Indirect Calls-Footnote-1590741 +Node: Functions Summary590869 +Node: Library Functions593568 +Ref: Library Functions-Footnote-1597186 +Ref: Library Functions-Footnote-2597329 +Node: Library Names597500 +Ref: Library Names-Footnote-1600958 +Ref: Library Names-Footnote-2601178 +Node: General Functions601264 +Node: Strtonum Function602292 +Node: Assert Function605312 +Node: Round Function608636 +Node: Cliff Random Function610177 +Node: Ordinal 
Functions611193 +Ref: Ordinal Functions-Footnote-1614258 +Ref: Ordinal Functions-Footnote-2614510 +Node: Join Function614721 +Ref: Join Function-Footnote-1616492 +Node: Getlocaltime Function616692 +Node: Readfile Function620433 +Node: Data File Management622381 +Node: Filetrans Function623013 +Node: Rewind Function627072 +Node: File Checking628630 +Ref: File Checking-Footnote-1629762 +Node: Empty Files629963 +Node: Ignoring Assigns631942 +Node: Getopt Function633496 +Ref: Getopt Function-Footnote-1644760 +Node: Passwd Functions644963 +Ref: Passwd Functions-Footnote-1653942 +Node: Group Functions654030 +Ref: Group Functions-Footnote-1661961 +Node: Walking Arrays662174 +Node: Library Functions Summary663777 +Node: Library Exercises665165 +Node: Sample Programs666445 +Node: Running Examples667215 +Node: Clones667943 +Node: Cut Program669167 +Node: Egrep Program679025 +Ref: Egrep Program-Footnote-1686612 +Node: Id Program686722 +Node: Split Program690376 +Ref: Split Program-Footnote-1693914 +Node: Tee Program694042 +Node: Uniq Program696829 +Node: Wc Program704252 +Ref: Wc Program-Footnote-1708517 +Node: Miscellaneous Programs708609 +Node: Dupword Program709822 +Node: Alarm Program711853 +Node: Translate Program716657 +Ref: Translate Program-Footnote-1721230 +Ref: Translate Program-Footnote-2721500 +Node: Labels Program721639 +Ref: Labels Program-Footnote-1725000 +Node: Word Sorting725084 +Node: History Sorting729127 +Node: Extract Program730963 +Node: Simple Sed738499 +Node: Igawk Program741561 +Ref: Igawk Program-Footnote-1755865 +Ref: Igawk Program-Footnote-2756066 +Node: Anagram Program756188 +Node: Signature Program759256 +Node: Programs Summary760503 +Node: Programs Exercises761718 +Ref: Programs Exercises-Footnote-1765849 +Node: Advanced Features765940 +Node: Nondecimal Data767888 +Node: Array Sorting769465 +Node: Controlling Array Traversal770162 +Node: Array Sorting Functions778442 +Ref: Array Sorting Functions-Footnote-1782334 +Node: Two-way I/O782528 +Ref: 
Two-way I/O-Footnote-1787472 +Ref: Two-way I/O-Footnote-2787651 +Node: TCP/IP Networking787733 +Node: Profiling790575 +Node: Advanced Features Summary798117 +Node: Internationalization799978 +Node: I18N and L10N801458 +Node: Explaining gettext802144 +Ref: Explaining gettext-Footnote-1807170 +Ref: Explaining gettext-Footnote-2807354 +Node: Programmer i18n807519 +Ref: Programmer i18n-Footnote-1812313 +Node: Translator i18n812362 +Node: String Extraction813156 +Ref: String Extraction-Footnote-1814289 +Node: Printf Ordering814375 +Ref: Printf Ordering-Footnote-1817157 +Node: I18N Portability817221 +Ref: I18N Portability-Footnote-1819670 +Node: I18N Example819733 +Ref: I18N Example-Footnote-1822439 +Node: Gawk I18N822511 +Node: I18N Summary823149 +Node: Debugger824488 +Node: Debugging825510 +Node: Debugging Concepts825951 +Node: Debugging Terms827807 +Node: Awk Debugging830404 +Node: Sample Debugging Session831296 +Node: Debugger Invocation831816 +Node: Finding The Bug833152 +Node: List of Debugger Commands839631 +Node: Breakpoint Control840963 +Node: Debugger Execution Control844627 +Node: Viewing And Changing Data847987 +Node: Execution Stack851345 +Node: Debugger Info852858 +Node: Miscellaneous Debugger Commands856852 +Node: Readline Support862036 +Node: Limitations862928 +Node: Debugging Summary865201 +Node: Arbitrary Precision Arithmetic866369 +Node: Computer Arithmetic867856 +Ref: Computer Arithmetic-Footnote-1872243 +Node: Math Definitions872300 +Ref: table-ieee-formats875589 +Ref: Math Definitions-Footnote-1876129 +Node: MPFR features876232 +Node: FP Math Caution877849 +Ref: FP Math Caution-Footnote-1878899 +Node: Inexactness of computations879268 +Node: Inexact representation880216 +Node: Comparing FP Values881571 +Node: Errors accumulate882535 +Node: Getting Accuracy883968 +Node: Try To Round886627 +Node: Setting precision887526 +Ref: table-predefined-precision-strings888208 +Node: Setting the rounding mode890001 +Ref: table-gawk-rounding-modes890365 +Ref: 
Setting the rounding mode-Footnote-1893819 +Node: Arbitrary Precision Integers893998 +Ref: Arbitrary Precision Integers-Footnote-1896979 +Node: POSIX Floating Point Problems897128 +Ref: POSIX Floating Point Problems-Footnote-1901004 +Node: Floating point summary901042 +Node: Dynamic Extensions903246 +Node: Extension Intro904798 +Node: Plugin License906063 +Node: Extension Mechanism Outline906748 +Ref: figure-load-extension907172 +Ref: figure-load-new-function908657 +Ref: figure-call-new-function909659 +Node: Extension API Description911643 +Node: Extension API Functions Introduction913093 +Node: General Data Types917960 +Ref: General Data Types-Footnote-1923653 +Node: Requesting Values923952 +Ref: table-value-types-returned924689 +Node: Memory Allocation Functions925647 +Ref: Memory Allocation Functions-Footnote-1928394 +Node: Constructor Functions928490 +Node: Registration Functions930248 +Node: Extension Functions930933 +Node: Exit Callback Functions933235 +Node: Extension Version String934483 +Node: Input Parsers935133 +Node: Output Wrappers944947 +Node: Two-way processors949463 +Node: Printing Messages951667 +Ref: Printing Messages-Footnote-1952744 +Node: Updating `ERRNO'952896 +Node: Accessing Parameters953635 +Node: Symbol Table Access954865 +Node: Symbol table by name955379 +Node: Symbol table by cookie957355 +Ref: Symbol table by cookie-Footnote-1961488 +Node: Cached values961551 +Ref: Cached values-Footnote-1965055 +Node: Array Manipulation965146 +Ref: Array Manipulation-Footnote-1966244 +Node: Array Data Types966283 +Ref: Array Data Types-Footnote-1968986 +Node: Array Functions969078 +Node: Flattening Arrays972952 +Node: Creating Arrays979804 +Node: Extension API Variables984535 +Node: Extension Versioning985171 +Node: Extension API Informational Variables987072 +Node: Extension API Boilerplate988158 +Node: Finding Extensions991962 +Node: Extension Example992522 +Node: Internal File Description993252 +Node: Internal File Ops997343 +Ref: Internal File 
Ops-Footnote-11008775 +Node: Using Internal File Ops1008915 +Ref: Using Internal File Ops-Footnote-11011262 +Node: Extension Samples1011530 +Node: Extension Sample File Functions1013054 +Node: Extension Sample Fnmatch1020622 +Node: Extension Sample Fork1022104 +Node: Extension Sample Inplace1023317 +Node: Extension Sample Ord1024992 +Node: Extension Sample Readdir1025828 +Ref: table-readdir-file-types1026684 +Node: Extension Sample Revout1027483 +Node: Extension Sample Rev2way1028074 +Node: Extension Sample Read write array1028815 +Node: Extension Sample Readfile1030694 +Node: Extension Sample API Tests1031794 +Node: Extension Sample Time1032319 +Node: gawkextlib1033634 +Node: Extension summary1036447 +Node: Extension Exercises1040140 +Node: Language History1040862 +Node: V7/SVR3.11042505 +Node: SVR41044825 +Node: POSIX1046267 +Node: BTL1047653 +Node: POSIX/GNU1048387 +Node: Feature History1054103 +Node: Common Extensions1067194 +Node: Ranges and Locales1068506 +Ref: Ranges and Locales-Footnote-11073123 +Ref: Ranges and Locales-Footnote-21073150 +Ref: Ranges and Locales-Footnote-31073384 +Node: Contributors1073605 +Node: History summary1079030 +Node: Installation1080399 +Node: Gawk Distribution1081350 +Node: Getting1081834 +Node: Extracting1082658 +Node: Distribution contents1084300 +Node: Unix Installation1090017 +Node: Quick Installation1090634 +Node: Additional Configuration Options1093076 +Node: Configuration Philosophy1094814 +Node: Non-Unix Installation1097165 +Node: PC Installation1097623 +Node: PC Binary Installation1098934 +Node: PC Compiling1100782 +Ref: PC Compiling-Footnote-11103781 +Node: PC Testing1103886 +Node: PC Using1105062 +Node: Cygwin1109214 +Node: MSYS1110023 +Node: VMS Installation1110521 +Node: VMS Compilation1111317 +Ref: VMS Compilation-Footnote-11112539 +Node: VMS Dynamic Extensions1112597 +Node: VMS Installation Details1113970 +Node: VMS Running1116222 +Node: VMS GNV1119056 +Node: VMS Old Gawk1119779 +Node: Bugs1120249 +Node: Other 
Versions1124253 +Node: Installation summary1130477 +Node: Notes1131533 +Node: Compatibility Mode1132398 +Node: Additions1133180 +Node: Accessing The Source1134105 +Node: Adding Code1135541 +Node: New Ports1141719 +Node: Derived Files1146200 +Ref: Derived Files-Footnote-11151675 +Ref: Derived Files-Footnote-21151709 +Ref: Derived Files-Footnote-31152305 +Node: Future Extensions1152419 +Node: Implementation Limitations1153025 +Node: Extension Design1154273 +Node: Old Extension Problems1155427 +Ref: Old Extension Problems-Footnote-11156944 +Node: Extension New Mechanism Goals1157001 +Ref: Extension New Mechanism Goals-Footnote-11160361 +Node: Extension Other Design Decisions1160550 +Node: Extension Future Growth1162656 +Node: Old Extension Mechanism1163492 +Node: Notes summary1165254 +Node: Basic Concepts1166440 +Node: Basic High Level1167121 +Ref: figure-general-flow1167393 +Ref: figure-process-flow1167992 +Ref: Basic High Level-Footnote-11171221 +Node: Basic Data Typing1171406 +Node: Glossary1174734 +Node: Copying1199886 +Node: GNU Free Documentation License1237442 +Node: Index1262578 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 43da36ad..492d5cd6 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -726,12 +726,12 @@ particular records in a file and perform operations upon them. elements. * Controlling Scanning:: Controlling the order in which arrays are scanned. -* Delete:: The @code{delete} statement removes an - element from an array. * Numeric Array Subscripts:: How to use numbers as subscripts in @command{awk}. * Uninitialized Subscripts:: Using Uninitialized variables as subscripts. +* Delete:: The @code{delete} statement removes an + element from an array. * Multidimensional:: Emulating multidimensional arrays in @command{awk}. * Multiscanning:: Scanning multidimensional arrays. @@ -10638,7 +10638,7 @@ if (/barfly/ || /camelot/) @noindent are exactly equivalent. 
 One rather bizarre consequence of this rule is that the following
-Boolean expression is valid, but does not do what the user probably
+Boolean expression is valid, but does not do what its author probably
 intended:
 
 @example
@@ -10684,10 +10684,9 @@ Modern implementations of @command{awk}, including @command{gawk}, allow
 the third argument of @code{split()} to be a regexp constant, but some older
 implementations do not.
 @value{DARKCORNER}
-This can lead to confusion when attempting to use regexp constants
-as arguments to user-defined functions
-(@pxref{User-defined}).
-For example:
+Because some built-in functions accept regexp constants as arguments,
+it can be confusing when attempting to use regexp constants as arguments
+to user-defined functions (@pxref{User-defined}). For example:
 
 @example
 function mysub(pat, repl, str, global)
@@ -10755,8 +10754,8 @@ variable's current value. Variables are given new values with
 @dfn{decrement operators}.
 @xref{Assignment Ops}.
 In addition, the @code{sub()} and @code{gsub()} functions can
-change a variable's value, and the @code{match()}, @code{patsplit()}
-and @code{split()} functions can change the contents of their
+change a variable's value, and the @code{match()}, @code{split()}
+and @code{patsplit()} functions can change the contents of their
 array parameters. @xref{String Functions}.
 
 @cindex variables, built-in
@@ -10772,7 +10771,7 @@ Variables in @command{awk} can be assigned either numeric or string values.
 The kind of value a variable holds can change over the life of a program.
 By default, variables are initialized to the empty string, which
 is zero if converted to a number. There is no need to explicitly
-``initialize'' a variable in @command{awk},
+initialize a variable in @command{awk},
 which is what you would do in C and in most other traditional languages.
 
@node Assignment Options @@ -11009,7 +11008,7 @@ $ @kbd{echo 4,321 | LC_ALL=en_DK.utf-8 gawk '@{ print $1 + 1 @}'} @noindent The @code{en_DK.utf-8} locale is for English in Denmark, where the comma acts as the decimal point separator. In the normal @code{"C"} locale, @command{gawk} -treats @samp{4,321} as @samp{4}, while in the Danish locale, it's treated +treats @samp{4,321} as 4, while in the Danish locale, it's treated as the full number, 4.321. Some earlier versions of @command{gawk} fully complied with this aspect @@ -11566,7 +11565,7 @@ awk '/[=]=/' /dev/null @end example @command{gawk} does not have this problem; BWK @command{awk} -and @command{mawk} also do not (@pxref{Other Versions}). +and @command{mawk} also do not. @docbook </sidebar> @@ -11612,7 +11611,7 @@ awk '/[=]=/' /dev/null @end example @command{gawk} does not have this problem; BWK @command{awk} -and @command{mawk} also do not (@pxref{Other Versions}). +and @command{mawk} also do not. @end cartouche @end ifnotdocbook @c ENDOFRANGE exas @@ -11924,7 +11923,7 @@ attribute. @item Fields, @code{getline} input, @code{FILENAME}, @code{ARGV} elements, @code{ENVIRON} elements, and the elements of an array created by -@code{patsplit()}, @code{split()} and @code{match()} that are numeric +@code{match()}, @code{split()} and @code{patsplit()} that are numeric strings have the @var{strnum} attribute. Otherwise, they have the @var{string} attribute. Uninitialized variables also have the @var{strnum} attribute. @@ -12079,22 +12078,23 @@ Thus, the six-character input string @w{@samp{ +3.14}} receives the The following examples print @samp{1} when the comparison between the two different constants is true, @samp{0} otherwise: +@c 22.9.2014: Tested with mawk and BWK awk, got same results. 
@example -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == " +3.14" @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == " +3.14") @}'} @ii{True} @print{} 1 -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == "+3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == "+3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == "3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == "3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == 3.14 @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == 3.14) @}'} @ii{True} @print{} 1 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == " +3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == " +3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == "+3.14" @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == "+3.14") @}'} @ii{True} @print{} 1 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == "3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == "3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == 3.14 @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == 3.14) @}'} @ii{True} @print{} 1 @end example @@ -12168,9 +12168,8 @@ part of the test always succeeds. Because the operators are so similar, this kind of error is very difficult to spot when scanning the source code. 
-@cindex @command{gawk}, comparison operators and -The following list of expressions illustrates the kind of comparison -@command{gawk} performs, as well as what the result of the comparison is: +The following list of expressions illustrates the kinds of comparisons +@command{awk} performs, as well as what the result of each comparison is: @table @code @item 1.5 <= 2.0 @@ -12243,7 +12242,7 @@ dynamic regexp (@pxref{Regexp Usage}; also @cindex @command{awk}, regexp constants and @cindex regexp constants -In modern implementations of @command{awk}, a constant regular +A constant regular expression in slashes by itself is also an expression. The regexp @code{/@var{regexp}/} is an abbreviation for the following comparison expression: @@ -12263,7 +12262,7 @@ where this is discussed in more detail. The POSIX standard says that string comparison is performed based on the locale's @dfn{collating order}. This is the order in which characters sort, as defined by the locale (for more discussion, -@pxref{Ranges and Locales}). This order is usually very different +@pxref{Locales}). This order is usually very different from the results obtained when doing straight character-by-character comparison.@footnote{Technically, string comparison is supposed to behave the same way as if the strings are compared with the C @@ -12343,7 +12342,7 @@ no substring @samp{foo} in the record. True if at least one of @var{boolean1} or @var{boolean2} is true. For example, the following statement prints all records in the input that contain @emph{either} @samp{edu} or -@samp{li} or both: +@samp{li}: @example if ($0 ~ /edu/ || $0 ~ /li/) print @@ -12352,6 +12351,9 @@ if ($0 ~ /edu/ || $0 ~ /li/) print The subexpression @var{boolean2} is evaluated only if @var{boolean1} is false. This can make a difference when @var{boolean2} contains expressions that have side effects. 
+(Thus, this test never really distinguishes records that contain both +@samp{edu} and @samp{li}---as soon as @samp{edu} is matched, +the full test succeeds.) @item ! @var{boolean} True if @var{boolean} is false. For example, @@ -12361,7 +12363,7 @@ variable is not defined: @example BEGIN @{ if (! ("HOME" in ENVIRON)) - print "no home!" @} + print "no home!" @} @end example (The @code{in} operator is described in @@ -12817,8 +12819,8 @@ system about the local character set and language. The ISO C standard defines a default @code{"C"} locale, which is an environment that is typical of what many C programmers are used to. -Once upon a time, the locale setting used to affect regexp matching -(@pxref{Ranges and Locales}), but this is no longer true. +Once upon a time, the locale setting used to affect regexp matching, +but this is no longer true (@pxref{Ranges and Locales}). Locales can affect record splitting. For the normal case of @samp{RS = "\n"}, the locale is largely irrelevant. For other single-character @@ -12872,7 +12874,8 @@ Locales can influence the conversions. @item @command{awk} provides the usual arithmetic operators (addition, subtraction, multiplication, division, modulus), and unary plus and minus. -It also provides comparison operators, boolean operators, and regexp +It also provides comparison operators, boolean operators, array membership +testing, and regexp matching operators. String concatenation is accomplished by placing two expressions next to each other; there is no explicit operator. The three-operand @samp{?:} operator provides an ``if-else'' test within @@ -12887,7 +12890,7 @@ In @command{awk}, a value is considered to be true if it is non-zero @emph{or} non-null. Otherwise, the value is false. @item -A value's type is set upon each assignment and may change over its +A variable's type is set upon each assignment and may change over its lifetime. The type determines how it behaves in comparisons (string or numeric). 
@@ -12967,7 +12970,7 @@ is nonzero (if a number) or non-null (if a string). (@xref{Expression Patterns}.) @item @var{begpat}, @var{endpat} -A pair of patterns separated by a comma, specifying a range of records. +A pair of patterns separated by a comma, specifying a @dfn{range} of records. The range includes both the initial record that matches @var{begpat} and the final record that matches @var{endpat}. (@xref{Ranges}.) @@ -13057,8 +13060,8 @@ $ @kbd{awk '$1 ~ /li/ @{ print $2 @}' mail-list} @cindex regexp constants, as patterns @cindex patterns, regexp constants as A regexp constant as a pattern is also a special case of an expression -pattern. The expression @code{/li/} has the value one if @samp{li} -appears in the current input record. Thus, as a pattern, @code{/li/} +pattern. The expression @samp{/li/} has the value one if @samp{li} +appears in the current input record. Thus, as a pattern, @samp{/li/} matches any record containing @samp{li}. @cindex Boolean expressions, as patterns @@ -13240,7 +13243,7 @@ input is read. For example: @example $ @kbd{awk '} > @kbd{BEGIN @{ print "Analysis of \"li\"" @}} -> @kbd{/li/ @{ ++n @}} +> @kbd{/li/ @{ ++n @}} > @kbd{END @{ print "\"li\" appears in", n, "records." @}' mail-list} @print{} Analysis of "li" @print{} "li" appears in 4 records. @@ -13320,9 +13323,10 @@ The POSIX standard specifies that @code{NF} is available in an @code{END} rule. It contains the number of fields from the last input record. Most probably due to an oversight, the standard does not say that @code{$0} is also preserved, although logically one would think that it should be. -In fact, @command{gawk} does preserve the value of @code{$0} for use in -@code{END} rules. Be aware, however, that BWK @command{awk}, and possibly -other implementations, do not. +In fact, all of BWK @command{awk}, @command{mawk}, and @command{gawk} +preserve the value of @code{$0} for use in @code{END} rules. 
Be aware, +however, that some other implementations and many older versions +of Unix @command{awk} do not. The third point follows from the first two. The meaning of @samp{print} inside a @code{BEGIN} or @code{END} rule is the same as always: @@ -13417,8 +13421,8 @@ level of the @command{awk} program. @cindex @code{next} statement, @code{BEGINFILE}/@code{ENDFILE} patterns and The @code{next} statement (@pxref{Next Statement}) is not allowed inside -either a @code{BEGINFILE} or and @code{ENDFILE} rule. The @code{nextfile} -statement (@pxref{Nextfile Statement}) is allowed only inside a +either a @code{BEGINFILE} or an @code{ENDFILE} rule. The @code{nextfile} +statement is allowed only inside a @code{BEGINFILE} rule, but not inside an @code{ENDFILE} rule. @cindex @code{getline} statement, @code{BEGINFILE}/@code{ENDFILE} patterns and @@ -13482,7 +13486,7 @@ There are two ways to get the value of the shell variable into the body of the @command{awk} program. @cindex shells, quoting -The most common method is to use shell quoting to substitute +A common method is to use shell quoting to substitute the variable's value into the program inside the script. For example, consider the following program: @@ -13739,20 +13743,21 @@ If the @var{condition} is true, it executes the statement @var{body}. is not zero and not a null string.) @end ifinfo After @var{body} has been executed, -@var{condition} is tested again, and if it is still true, @var{body} is -executed again. This process repeats until the @var{condition} is no longer -true. If the @var{condition} is initially false, the body of the loop is -never executed and @command{awk} continues with the statement following +@var{condition} is tested again, and if it is still true, @var{body} +executes again. This process repeats until the @var{condition} is no longer +true. If the @var{condition} is initially false, the body of the loop +never executes and @command{awk} continues with the statement following the loop. 
This example prints the first three fields of each record, one per line: @example -awk '@{ - i = 1 - while (i <= 3) @{ - print $i - i++ - @} +awk ' +@{ + i = 1 + while (i <= 3) @{ + print $i + i++ + @} @}' inventory-shipped @end example @@ -13786,14 +13791,14 @@ do while (@var{condition}) @end example -Even if the @var{condition} is false at the start, the @var{body} is -executed at least once (and only once, unless executing @var{body} +Even if the @var{condition} is false at the start, the @var{body} +executes at least once (and only once, unless executing @var{body} makes @var{condition} true). Contrast this with the corresponding @code{while} statement: @example while (@var{condition}) - @var{body} + @var{body} @end example @noindent @@ -13803,11 +13808,11 @@ The following is an example of a @code{do} statement: @example @{ - i = 1 - do @{ - print $0 - i++ - @} while (i <= 10) + i = 1 + do @{ + print $0 + i++ + @} while (i <= 10) @} @end example @@ -13844,9 +13849,10 @@ compares it against the desired number of iterations. For example: @example -awk '@{ - for (i = 1; i <= 3; i++) - print $i +awk ' +@{ + for (i = 1; i <= 3; i++) + print $i @}' inventory-shipped @end example @@ -13874,7 +13880,7 @@ between 1 and 100: @example for (i = 1; i <= 100; i *= 2) - print i + print i @end example If there is nothing to be done, any of the three expressions in the @@ -14194,7 +14200,7 @@ The @code{next} statement is not allowed inside @code{BEGINFILE} and @cindex functions, user-defined, @code{next}/@code{nextfile} statements and According to the POSIX standard, the behavior is undefined if the @code{next} statement is used in a @code{BEGIN} or @code{END} rule. -@command{gawk} treats it as a syntax error. Although POSIX permits it, +@command{gawk} treats it as a syntax error. Although POSIX does not disallow it, most other @command{awk} implementations don't allow the @code{next} statement inside function bodies (@pxref{User-defined}). 
Just as with any other @code{next} statement, a @code{next} statement inside a function @@ -14249,7 +14255,7 @@ opened with redirections. It is not related to the main processing that @quotation NOTE For many years, @code{nextfile} was a -@command{gawk} extension. As of September, 2012, it was accepted for +common extension. In September, 2012, it was accepted for inclusion into the POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=607, the Austin Group website}. @end quotation @@ -14258,8 +14264,8 @@ See @uref{http://austingroupbugs.net/view.php?id=607, the Austin Group website}. @cindex @code{nextfile} statement, user-defined functions and @cindex Brian Kernighan's @command{awk} @cindex @command{mawk} utility -The current version of BWK @command{awk}, and @command{mawk} (@pxref{Other -Versions}) also support @code{nextfile}. However, they don't allow the +The current version of BWK @command{awk}, and @command{mawk} +also support @code{nextfile}. However, they don't allow the @code{nextfile} statement inside function bodies (@pxref{User-defined}). @command{gawk} does; a @code{nextfile} inside a function body reads the next record and starts processing it with the first rule in the program, @@ -14291,8 +14297,8 @@ the program to stop immediately. An @code{exit} statement that is not part of a @code{BEGIN} or @code{END} rule stops the execution of any further automatic rules for the current record, skips reading any remaining input records, and executes the -@code{END} rule if there is one. -Any @code{ENDFILE} rules are also skipped; they are not executed. +@code{END} rule if there is one. @command{gawk} also skips +any @code{ENDFILE} rules; they do not execute. In such a case, if you don't want the @code{END} rule to do its job, set a variable @@ -14400,7 +14406,7 @@ respectively, should use binary I/O. A string value of @code{"rw"} or @code{"wr"} indicates that all files should use binary I/O. 
Any other string value is treated the same as @code{"rw"}, but causes @command{gawk} to generate a warning message. @code{BINMODE} is described in more -detail in @ref{PC Using}. @command{mawk} @pxref{Other Versions}), +detail in @ref{PC Using}. @command{mawk} (@pxref{Other Versions}), also supports this variable, but only using numeric values. @cindex @code{CONVFMT} variable @@ -14527,7 +14533,7 @@ printing with the @code{print} statement. It works by being passed as the first argument to the @code{sprintf()} function (@pxref{String Functions}). Its default value is @code{"%.6g"}. Earlier versions of @command{awk} -also used @code{OFMT} to specify the format for converting numbers to +used @code{OFMT} to specify the format for converting numbers to strings in general expressions; this is now done by @code{CONVFMT}. @cindex @code{sprintf()} function, @code{OFMT} variable and @@ -14679,8 +14685,8 @@ successive instances of the same @value{FN} on the command line. @cindex file names, distinguishing While you can change the value of @code{ARGIND} within your @command{awk} -program, @command{gawk} automatically sets it to a new value when the -next file is opened. +program, @command{gawk} automatically sets it to a new value when it +opens the next file. @cindex @code{ENVIRON} array @cindex environment variables, in @code{ENVIRON} array @@ -14737,10 +14743,10 @@ can give @code{FILENAME} a value. @cindex @code{FNR} variable @item @code{FNR} -The current record number in the current file. @code{FNR} is -incremented each time a new record is read -(@pxref{Records}). It is reinitialized -to zero each time a new input file is started. +The current record number in the current file. @command{awk} increments +@code{FNR} each time it reads a new record (@pxref{Records}). +@command{awk} resets @code{FNR} to zero each time it starts a new +input file. @cindex @code{NF} variable @item @code{NF} @@ -14772,7 +14778,7 @@ array causes a fatal error. 
Any attempt to assign to an element of The number of input records @command{awk} has processed since the beginning of the program's execution (@pxref{Records}). -@code{NR} is incremented each time a new record is read. +@command{awk} increments @code{NR} each time it reads a new record. @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array @@ -14852,7 +14858,7 @@ The parent process ID of the current process. @item PROCINFO["sorted_in"] If this element exists in @code{PROCINFO}, its value controls the order in which array indices will be processed by -@samp{for (@var{index} in @var{array})} loops. +@samp{for (@var{indx} in @var{array})} loops. Since this is an advanced feature, we defer the full description until later; see @ref{Scanning an Array}. @@ -14873,7 +14879,7 @@ The version of @command{gawk}. The following additional elements in the array are available to provide information about the MPFR and GMP libraries -if your version of @command{gawk} supports arbitrary precision numbers +if your version of @command{gawk} supports arbitrary precision arithmetic (@pxref{Arbitrary Precision Arithmetic}): @table @code @@ -14922,14 +14928,14 @@ The @code{PROCINFO} array has the following additional uses: @itemize @value{BULLET} @item -It may be used to cause coprocesses to communicate over pseudo-ttys -instead of through two-way pipes; this is discussed further in -@ref{Two-way I/O}. - -@item It may be used to provide a timeout when reading from any open input file, pipe, or coprocess. @xref{Read Timeout}, for more information. + +@item +It may be used to cause coprocesses to communicate over pseudo-ttys +instead of through two-way pipes; this is discussed further in +@ref{Two-way I/O}. @end itemize @cindex @code{RLENGTH} variable @@ -15217,6 +15223,12 @@ following @option{-v} are passed on to the @command{awk} program. (@xref{Getopt Function}, for an @command{awk} library function that parses command-line options.) 
+When designing your program, you should choose options that don't +conflict with @command{gawk}'s, since it will process any options +that it accepts before passing the rest of the command line on to +your program. Using @samp{#!} with the @option{-E} option may help +(@pxref{Executable Scripts}, and @pxref{Options}). + @node Pattern Action Summary @section Summary @@ -15251,7 +15263,7 @@ input and output statements, and deletion statements. The control statements in @command{awk} are @code{if}-@code{else}, @code{while}, @code{for}, and @code{do}-@code{while}. @command{gawk} adds the @code{switch} statement. There are two flavors of @code{for} -statement: one for for performing general looping, and the other iterating +statement: one for performing general looping, and the other for iterating through an array. @item @@ -15268,12 +15280,17 @@ The @code{exit} statement terminates your program. When executed from an action (or function body) it transfers control to the @code{END} statements. From an @code{END} statement body, it exits immediately. You may pass an optional numeric value to be used -at @command{awk}'s exit status. +as @command{awk}'s exit status. @item Some built-in variables provide control over @command{awk}, mainly for I/O. Other variables convey information from @command{awk} to your program. +@item +@code{ARGC} and @code{ARGV} make the command-line arguments available +to your program. Manipulating them from a @code{BEGIN} rule lets you +control how @command{awk} will process the provided @value{DF}s. + @end itemize @node Arrays @@ -15294,24 +15311,13 @@ The @value{CHAPTER} moves on to discuss @command{gawk}'s facility for sorting arrays, and ends with a brief description of @command{gawk}'s ability to support true arrays of arrays. 
-@cindex variables, names of -@cindex functions, names of -@cindex arrays, names of, and names of functions/variables -@cindex names, arrays/variables -@cindex namespace issues -@command{awk} maintains a single set -of names that may be used for naming variables, arrays, and functions -(@pxref{User-defined}). -Thus, you cannot have a variable and an array with the same name in the -same @command{awk} program. - @menu * Array Basics:: The basics of arrays. -* Delete:: The @code{delete} statement removes an element - from an array. * Numeric Array Subscripts:: How to use numbers as subscripts in @command{awk}. * Uninitialized Subscripts:: Using Uninitialized variables as subscripts. +* Delete:: The @code{delete} statement removes an element + from an array. * Multidimensional:: Emulating multidimensional arrays in @command{awk}. * Arrays of Arrays:: True multidimensional arrays. @@ -15739,14 +15745,14 @@ begin with a number: @example @c file eg/misc/arraymax.awk @{ - if ($1 > max) - max = $1 - arr[$1] = $0 + if ($1 > max) + max = $1 + arr[$1] = $0 @} END @{ - for (x = 1; x <= max; x++) - print arr[x] + for (x = 1; x <= max; x++) + print arr[x] @} @c endfile @end example @@ -15786,9 +15792,9 @@ program's @code{END} rule, as follows: @example END @{ - for (x = 1; x <= max; x++) - if (x in arr) - print arr[x] + for (x = 1; x <= max; x++) + if (x in arr) + print arr[x] @} @end example @@ -15810,7 +15816,7 @@ an array: @example for (@var{var} in @var{array}) - @var{body} + @var{body} @end example @noindent @@ -15883,7 +15889,7 @@ BEGIN @{ @} @end example -Here is what happens when run with @command{gawk}: +Here is what happens when run with @command{gawk} (and @command{mawk}): @example $ @kbd{gawk -f loopcheck.awk} @@ -16001,7 +16007,8 @@ does not affect the loop. 
For example: @example -$ @kbd{gawk 'BEGIN @{} +$ @kbd{gawk '} +> @kbd{BEGIN @{} > @kbd{ a[4] = 4} > @kbd{ a[3] = 3} > @kbd{ for (i in a)} @@ -16009,7 +16016,8 @@ $ @kbd{gawk 'BEGIN @{} > @kbd{@}'} @print{} 4 4 @print{} 3 3 -$ @kbd{gawk 'BEGIN @{} +$ @kbd{gawk '} +> @kbd{BEGIN @{} > @kbd{ PROCINFO["sorted_in"] = "@@ind_str_asc"} > @kbd{ a[4] = 4} > @kbd{ a[3] = 3} @@ -16058,118 +16066,6 @@ the @code{delete} statement. In addition, @command{gawk} provides built-in functions for sorting arrays; see @ref{Array Sorting Functions}. -@node Delete -@section The @code{delete} Statement -@cindex @code{delete} statement -@cindex deleting elements in arrays -@cindex arrays, elements, deleting -@cindex elements in arrays, deleting - -To remove an individual element of an array, use the @code{delete} -statement: - -@example -delete @var{array}[@var{index-expression}] -@end example - -Once an array element has been deleted, any value the element once -had is no longer available. It is as if the element had never -been referred to or been given a value. -The following is an example of deleting elements in an array: - -@example -for (i in frequencies) - delete frequencies[i] -@end example - -@noindent -This example removes all the elements from the array @code{frequencies}. -Once an element is deleted, a subsequent @code{for} statement to scan the array -does not report that element and the @code{in} operator to check for -the presence of that element returns zero (i.e., false): - -@example -delete foo[4] -if (4 in foo) - print "This will never be printed" -@end example - -@cindex null strings, and deleting array elements -It is important to note that deleting an element is @emph{not} the -same as assigning it a null value (the empty string, @code{""}). -For example: - -@example -foo[4] = "" -if (4 in foo) - print "This is printed, even though foo[4] is empty" -@end example - -@cindex lint checking, array elements -It is not an error to delete an element that does not exist. 
-However, if @option{--lint} is provided on the command line -(@pxref{Options}), -@command{gawk} issues a warning message when an element that -is not in the array is deleted. - -@cindex common extensions, @code{delete} to delete entire arrays -@cindex extensions, common@comma{} @code{delete} to delete entire arrays -@cindex arrays, deleting entire contents -@cindex deleting entire arrays -@cindex @code{delete} @var{array} -@cindex differences in @command{awk} and @command{gawk}, array elements, deleting -All the elements of an array may be deleted with a single statement -by leaving off the subscript in the @code{delete} statement, -as follows: - - -@example -delete @var{array} -@end example - -Using this version of the @code{delete} statement is about three times -more efficient than the equivalent loop that deletes each element one -at a time. - -@cindex Brian Kernighan's @command{awk} -@quotation NOTE -For many years, -using @code{delete} without a subscript was a @command{gawk} extension. -As of September, 2012, it was accepted for -inclusion into the POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=544, -the Austin Group website}. This form of the @code{delete} statement is also supported -by BWK @command{awk} and @command{mawk}, as well as -by a number of other implementations (@pxref{Other Versions}). -@end quotation - -@cindex portability, deleting array elements -@cindex Brennan, Michael -The following statement provides a portable but nonobvious way to clear -out an array:@footnote{Thanks to Michael Brennan for pointing this out.} - -@example -split("", array) -@end example - -@cindex @code{split()} function, array elements@comma{} deleting -The @code{split()} function -(@pxref{String Functions}) -clears out the target array first. This call asks it to split -apart the null string. Because there is no data to split out, the -function simply clears the array and then returns. 
- -@quotation CAUTION -Deleting an array does not change its type; you cannot -delete an array and then use the array's name as a scalar -(i.e., a regular variable). For example, the following does not work: - -@example -a[1] = 3 -delete a -a = 3 -@end example -@end quotation - @node Numeric Array Subscripts @section Using Numbers to Subscript Arrays @@ -16210,7 +16106,7 @@ since @code{"12.15"} is different from @code{"12.153"}. @cindex integer array indices According to the rules for conversions (@pxref{Conversion}), integer -values are always converted to strings as integers, no matter what the +values always convert to strings as integers, no matter what the value of @code{CONVFMT} may happen to be. So the usual case of the following works: @@ -16233,7 +16129,7 @@ and all refer to the same element! As with many things in @command{awk}, the majority of the time -things work as one would expect them to. But it is useful to have a precise +things work as you would expect them to. But it is useful to have a precise knowledge of the actual rules since they can sometimes have a subtle effect on your programs. @@ -16297,6 +16193,119 @@ Even though it is somewhat unusual, the null string if @option{--lint} is provided on the command line (@pxref{Options}). +@node Delete +@section The @code{delete} Statement +@cindex @code{delete} statement +@cindex deleting elements in arrays +@cindex arrays, elements, deleting +@cindex elements in arrays, deleting + +To remove an individual element of an array, use the @code{delete} +statement: + +@example +delete @var{array}[@var{index-expression}] +@end example + +Once an array element has been deleted, any value the element once +had is no longer available. It is as if the element had never +been referred to or been given a value. 
+The following is an example of deleting elements in an array: + +@example +for (i in frequencies) + delete frequencies[i] +@end example + +@noindent +This example removes all the elements from the array @code{frequencies}. +Once an element is deleted, a subsequent @code{for} statement to scan the array +does not report that element and the @code{in} operator to check for +the presence of that element returns zero (i.e., false): + +@example +delete foo[4] +if (4 in foo) + print "This will never be printed" +@end example + +@cindex null strings, and deleting array elements +It is important to note that deleting an element is @emph{not} the +same as assigning it a null value (the empty string, @code{""}). +For example: + +@example +foo[4] = "" +if (4 in foo) + print "This is printed, even though foo[4] is empty" +@end example + +@cindex lint checking, array elements +It is not an error to delete an element that does not exist. +However, if @option{--lint} is provided on the command line +(@pxref{Options}), +@command{gawk} issues a warning message when an element that +is not in the array is deleted. + +@cindex common extensions, @code{delete} to delete entire arrays +@cindex extensions, common@comma{} @code{delete} to delete entire arrays +@cindex arrays, deleting entire contents +@cindex deleting entire arrays +@cindex @code{delete} @var{array} +@cindex differences in @command{awk} and @command{gawk}, array elements, deleting +All the elements of an array may be deleted with a single statement +by leaving off the subscript in the @code{delete} statement, +as follows: + + +@example +delete @var{array} +@end example + +Using this version of the @code{delete} statement is about three times +more efficient than the equivalent loop that deletes each element one +at a time. + +This form of the @code{delete} statement is also supported +by BWK @command{awk} and @command{mawk}, as well as +by a number of other implementations. 
+ +@cindex Brian Kernighan's @command{awk} +@quotation NOTE +For many years, using @code{delete} without a subscript was a common +extension. In September, 2012, it was accepted for inclusion into the +POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=544, +the Austin Group website}. +@end quotation + +@cindex portability, deleting array elements +@cindex Brennan, Michael +The following statement provides a portable but nonobvious way to clear +out an array:@footnote{Thanks to Michael Brennan for pointing this out.} + +@example +split("", array) +@end example + +@cindex @code{split()} function, array elements@comma{} deleting +The @code{split()} function +(@pxref{String Functions}) +clears out the target array first. This call asks it to split +apart the null string. Because there is no data to split out, the +function simply clears the array and then returns. + +@quotation CAUTION +Deleting all the elements from an array does not change its type; you cannot +clear an array and then use the array's name as a scalar +(i.e., a regular variable). For example, the following does not work: + +@example +a[1] = 3 +delete a +a = 3 +@end example +@end quotation + @node Multidimensional @section Multidimensional Arrays @@ -16308,7 +16317,7 @@ on the command line (@pxref{Options}). @cindex arrays, multidimensional A multidimensional array is an array in which an element is identified by a sequence of indices instead of a single index. For example, a -two-dimensional array requires two indices. The usual way (in most +two-dimensional array requires two indices. The usual way (in many languages, including @command{awk}) to refer to an element of a two-dimensional array named @code{grid} is with @code{grid[@var{x},@var{y}]}. @@ -16483,8 +16492,9 @@ a[1][3][1, "name"] = "barney" Each subarray and the main array can be of different length. In fact, the elements of an array or its subarray do not all have to have the same type. 
This means that the main array and any of its subarrays can be -non-rectangular, or jagged in structure. One can assign a scalar value to -the index @code{4} of the main array @code{a}: +non-rectangular, or jagged in structure. You can assign a scalar value to +the index @code{4} of the main array @code{a}, even though @code{a[1]} +is itself an array and not a scalar: @example a[4] = "An element in a jagged array" @@ -16566,6 +16576,8 @@ for (i in array) @{ print array[i][j] @} @} + else + print array[i] @} @end example @@ -16833,8 +16845,9 @@ Often random integers are needed instead. Following is a user-defined function that can be used to obtain a random non-negative integer less than @var{n}: @example -function randint(n) @{ - return int(n * rand()) +function randint(n) +@{ + return int(n * rand()) @} @end example @@ -16854,8 +16867,7 @@ function roll(n) @{ return 1 + int(rand() * n) @} # Roll 3 six-sided dice and # print total number of points. @{ - printf("%d points\n", - roll(6)+roll(6)+roll(6)) + printf("%d points\n", roll(6) + roll(6) + roll(6)) @} @end example @@ -16944,7 +16956,7 @@ doing index calculations, particularly if you are used to C. In the following list, optional parameters are enclosed in square brackets@w{ ([ ]).} Several functions perform string substitution; the full discussion is provided in the description of the @code{sub()} function, which comes -towards the end since the list is presented in alphabetic order. +towards the end since the list is presented alphabetically. Those functions that are specific to @command{gawk} are marked with a pound sign (@samp{#}). They are not available in compatibility mode @@ -16988,6 +17000,7 @@ When comparing strings, @code{IGNORECASE} affects the sorting (@pxref{Array Sorting Functions}). If the @var{source} array contains subarrays as values (@pxref{Arrays of Arrays}), they will come last, after all scalar values. +Subarrays are @emph{not} recursively sorted. 
For example, if the contents of @code{a} are as follows: @@ -17124,7 +17137,10 @@ $ @kbd{awk 'BEGIN @{ print index("peanut", "an") @}'} @noindent If @var{find} is not found, @code{index()} returns zero. -It is a fatal error to use a regexp constant for @var{find}. +With BWK @command{awk} and @command{gawk}, +it is a fatal error to use a regexp constant for @var{find}. +Other implementations allow it, simply treating the regexp +constant as an expression meaning @samp{$0 ~ /regexp/}. @item @code{length(}[@var{string}]@code{)} @cindexawkfunc{length} @@ -17238,13 +17254,12 @@ For example: @example @c file eg/misc/findpat.awk @{ - if ($1 == "FIND") - regex = $2 - else @{ - where = match($0, regex) - if (where != 0) - print "Match of", regex, "found at", - where, "in", $0 + if ($1 == "FIND") + regex = $2 + else @{ + where = match($0, regex) + if (where != 0) + print "Match of", regex, "found at", where, "in", $0 @} @} @c endfile @@ -17340,7 +17355,7 @@ Any leading separator will be in @code{@var{seps}[0]}. The @code{patsplit()} function splits strings into pieces in a manner similar to the way input lines are split into fields using @code{FPAT} -(@pxref{Splitting By Content}. +(@pxref{Splitting By Content}). Before splitting the string, @code{patsplit()} deletes any previously existing elements in the arrays @var{array} and @var{seps}. @@ -17353,8 +17368,7 @@ and store the pieces in @var{array} and the separator strings in the @code{@var{array}[1]}, the second piece in @code{@var{array}[2]}, and so forth. The string value of the third argument, @var{fieldsep}, is a regexp describing where to split @var{string} (much as @code{FS} can -be a regexp describing where to split input records; -@pxref{Regexp Field Splitting}). +be a regexp describing where to split input records). If @var{fieldsep} is omitted, the value of @code{FS} is used. @code{split()} returns the number of elements created. 
@var{seps} is a @command{gawk} extension with @code{@var{seps}[@var{i}]} @@ -17649,6 +17663,59 @@ Nonalphabetic characters are left unchanged. For example, @code{toupper("MiXeD cAsE 123")} returns @code{"MIXED CASE 123"}. @end table +@cindex sidebar, Matching the Null String +@ifdocbook +@docbook +<sidebar><title>Matching the Null String</title> +@end docbook + +@cindex matching, null strings +@cindex null strings, matching +@cindex @code{*} (asterisk), @code{*} operator, null strings@comma{} matching +@cindex asterisk (@code{*}), @code{*} operator, null strings@comma{} matching + +In @command{awk}, the @samp{*} operator can match the null string. +This is particularly important for the @code{sub()}, @code{gsub()}, +and @code{gensub()} functions. For example: + +@example +$ @kbd{echo abc | awk '@{ gsub(/m*/, "X"); print @}'} +@print{} XaXbXcX +@end example + +@noindent +Although this makes a certain amount of sense, it can be surprising. + +@docbook +</sidebar> +@end docbook +@end ifdocbook + +@ifnotdocbook +@cartouche +@center @b{Matching the Null String} + + +@cindex matching, null strings +@cindex null strings, matching +@cindex @code{*} (asterisk), @code{*} operator, null strings@comma{} matching +@cindex asterisk (@code{*}), @code{*} operator, null strings@comma{} matching + +In @command{awk}, the @samp{*} operator can match the null string. +This is particularly important for the @code{sub()}, @code{gsub()}, +and @code{gensub()} functions. For example: + +@example +$ @kbd{echo abc | awk '@{ gsub(/m*/, "X"); print @}'} +@print{} XaXbXcX +@end example + +@noindent +Although this makes a certain amount of sense, it can be surprising. +@end cartouche +@end ifnotdocbook + + @node Gory Details @subsubsection More About @samp{\} and @samp{&} with @code{sub()}, @code{gsub()}, and @code{gensub()} @@ -17662,7 +17729,7 @@ Nonalphabetic characters are left unchanged. 
For example, @cindex ampersand (@code{&}), @code{gsub()}/@code{gensub()}/@code{sub()} functions and @quotation CAUTION -This section has been known to cause headaches. +This subsubsection has been reported to cause headaches. You might want to skip it upon first reading. @end quotation @@ -17953,58 +18020,6 @@ and the special cases for @code{sub()} and @code{gsub()}, we recommend the use of @command{gawk} and @code{gensub()} when you have to do substitutions. -@cindex sidebar, Matching the Null String -@ifdocbook -@docbook -<sidebar><title>Matching the Null String</title> -@end docbook - -@cindex matching, null strings -@cindex null strings, matching -@cindex @code{*} (asterisk), @code{*} operator, null strings@comma{} matching -@cindex asterisk (@code{*}), @code{*} operator, null strings@comma{} matching - -In @command{awk}, the @samp{*} operator can match the null string. -This is particularly important for the @code{sub()}, @code{gsub()}, -and @code{gensub()} functions. For example: - -@example -$ @kbd{echo abc | awk '@{ gsub(/m*/, "X"); print @}'} -@print{} XaXbXcX -@end example - -@noindent -Although this makes a certain amount of sense, it can be surprising. - -@docbook -</sidebar> -@end docbook -@end ifdocbook - -@ifnotdocbook -@cartouche -@center @b{Matching the Null String} - - -@cindex matching, null strings -@cindex null strings, matching -@cindex @code{*} (asterisk), @code{*} operator, null strings@comma{} matching -@cindex asterisk (@code{*}), @code{*} operator, null strings@comma{} matching - -In @command{awk}, the @samp{*} operator can match the null string. -This is particularly important for the @code{sub()}, @code{gsub()}, -and @code{gensub()} functions. For example: - -@example -$ @kbd{echo abc | awk '@{ gsub(/m*/, "X"); print @}'} -@print{} XaXbXcX -@end example - -@noindent -Although this makes a certain amount of sense, it can be surprising. 
-@end cartouche -@end ifnotdocbook - @node I/O Functions @subsection Input/Output Functions @cindex input/output functions @@ -18057,10 +18072,9 @@ buffers its output and the @code{fflush()} function forces @cindex extensions, common@comma{} @code{fflush()} function @cindex Brian Kernighan's @command{awk} -@code{fflush()} was added to BWK @command{awk} in -April of 1992. For two decades, it was not part of the POSIX standard. -As of December, 2012, it was accepted for inclusion into the POSIX -standard. +Brian Kernighan added @code{fflush()} to his @command{awk} in April +of 1992. For two decades, it was a common extension. In December, +2012, it was accepted for inclusion into the POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=634, the Austin Group website}. POSIX standardizes @code{fflush()} as follows: If there @@ -18457,7 +18471,7 @@ is out of range, @code{mktime()} returns @minus{}1. @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array -@item @code{strftime(} [@var{format} [@code{,} @var{timestamp} [@code{,} @var{utc-flag}] ] ]@code{)} +@item @code{strftime(}[@var{format} [@code{,} @var{timestamp} [@code{,} @var{utc-flag}] ] ]@code{)} @c STARTOFRANGE strf @cindexgawkfunc{strftime} @cindex format time string @@ -18724,7 +18738,7 @@ the string. For example: @example $ date '+Today is %A, %B %d, %Y.' -@print{} Today is Monday, May 05, 2014. +@print{} Today is Monday, September 22, 2014. @end example Here is the @command{gawk} version of the @command{date} utility. @@ -18918,17 +18932,16 @@ shows that 0's come in on the left side. For @command{gawk}, this is always true, but in some languages, it's possible to have the left side fill with 1's.} @c Purposely decided to use 0's and 1's here. 2/2001. -If you start over -again with @samp{10111001} and shift it left by three bits, you end up -with @samp{11001000}. -@command{gawk} provides built-in functions that implement the -bitwise operations just described. 
They are: +If you start over again with @samp{10111001} and shift it left by three +bits, you end up with @samp{11001000}. The following list describes +@command{gawk}'s built-in functions that implement the bitwise operations. +Optional parameters are enclosed in square brackets ([ ]): @cindex @command{gawk}, bitwise operations in @table @code @cindexgawkfunc{and} @cindex bitwise AND -@item @code{and(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} +@item @code{and(}@var{v1}@code{,} @var{v2} [@code{,} @dots{}]@code{)} Return the bitwise AND of the arguments. There must be at least two. @cindexgawkfunc{compl} @@ -18943,7 +18956,7 @@ Return the value of @var{val}, shifted left by @var{count} bits. @cindexgawkfunc{or} @cindex bitwise OR -@item @code{or(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} +@item @code{or(}@var{v1}@code{,} @var{v2} [@code{,} @dots{}]@code{)} Return the bitwise OR of the arguments. There must be at least two. @cindexgawkfunc{rshift} @@ -18953,7 +18966,7 @@ Return the value of @var{val}, shifted right by @var{count} bits. @cindexgawkfunc{xor} @cindex bitwise XOR -@item @code{xor(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} +@item @code{xor(}@var{v1}@code{,} @var{v2} [@code{,} @dots{}]@code{)} Return the bitwise XOR of the arguments. There must be at least two. @end table @@ -19076,7 +19089,7 @@ results of the @code{compl()}, @code{lshift()}, and @code{rshift()} functions. @command{gawk} provides a single function that lets you distinguish an array from a scalar variable. This is necessary for writing code -that traverses every element of an array of arrays. +that traverses every element of an array of arrays (@pxref{Arrays of Arrays}). @table @code @@ -19092,12 +19105,14 @@ an array or not. The second is inside the body of a user-defined function (not discussed yet; @pxref{User-defined}), to test if a parameter is an array or not. 
-Note, however, that using @code{isarray()} at the global level to test +@quotation NOTE +Using @code{isarray()} at the global level to test variables makes no sense. Since you are the one writing the program, you are supposed to know if your variables are arrays or not. And in fact, due to the way @command{gawk} works, if you pass the name of a variable that has not been previously used to @code{isarray()}, @command{gawk} -will end up turning it into a scalar. +ends up turning it into a scalar. +@end quotation @node I18N Functions @subsection String-Translation Functions @@ -19358,7 +19373,7 @@ extra whitespace signifies the start of the local variable list): function delarray(a, i) @{ for (i in a) - delete a[i] + delete a[i] @} @end example @@ -19369,7 +19384,7 @@ Instead of having to repeat this loop everywhere that you need to clear out an array, your program can just call @code{delarray}. (This guarantees portability. The use of @samp{delete @var{array}} to delete -the contents of an entire array is a recent@footnote{Late in 2012.} +the contents of an entire array is a relatively recent@footnote{Late in 2012.} addition to the POSIX standard.) The following is an example of a recursive function. It takes a string @@ -19399,7 +19414,7 @@ $ @kbd{echo "Don't Panic!" |} @print{} !cinaP t'noD @end example -The C @code{ctime()} function takes a timestamp and returns it in a string, +The C @code{ctime()} function takes a timestamp and returns it as a string, formatted in a well-known fashion. 
The following example uses the built-in @code{strftime()} function (@pxref{Time Functions}) @@ -19414,13 +19429,19 @@ to create an @command{awk} version of @code{ctime()}: function ctime(ts, format) @{ - format = PROCINFO["strftime"] + format = "%a %b %e %H:%M:%S %Z %Y" + if (ts == 0) ts = systime() # use current time as default return strftime(format, ts) @} @c endfile @end example + +You might think that @code{ctime()} could use @code{PROCINFO["strftime"]} +for its format string. That would be a mistake, since @code{ctime()} is +supposed to return the time formatted in a standard fashion, and user-level +code could have changed @code{PROCINFO["strftime"]}. @c ENDOFRANGE fdef @node Function Caveats @@ -20069,7 +20090,7 @@ function quicksort(data, left, right, less_than, i, last) # quicksort_swap --- helper function for quicksort, should really be inline -function quicksort_swap(data, i, j, temp) +function quicksort_swap(data, i, j, temp) @{ temp = data[i] data[i] = data[j] @@ -20220,10 +20241,11 @@ functions. @item POSIX @command{awk} provides three kinds of built-in functions: numeric, -string, and I/O. @command{gawk} provides functions that work with values -representing time, do bit manipulation, sort arrays, and internationalize -and localize programs. @command{gawk} also provides several extensions to -some of standard functions, typically in the form of additional arguments. +string, and I/O. @command{gawk} provides functions that sort arrays, work +with values representing time, do bit manipulation, determine variable +type (array vs.@: scalar), and internationalize and localize programs. +@command{gawk} also provides several extensions to some of standard +functions, typically in the form of additional arguments. @item Functions accept zero or more arguments and return a value. 
The @@ -20474,8 +20496,9 @@ are very difficult to track down: function lib_func(x, y, l1, l2) @{ @dots{} - @var{use variable} some_var # some_var should be local - @dots{} # but is not by oversight + # some_var should be local but by oversight is not + @var{use variable} some_var + @dots{} @} @end example @@ -20586,7 +20609,7 @@ function mystrtonum(str, ret, n, i, k, c) # a[5] = "123.45" # a[6] = "1.e3" # a[7] = "1.32" -# a[7] = "1.32E2" +# a[8] = "1.32E2" # # for (i = 1; i in a; i++) # print a[i], strtonum(a[i]), mystrtonum(a[i]) @@ -20597,9 +20620,12 @@ function mystrtonum(str, ret, n, i, k, c) The function first looks for C-style octal numbers (base 8). If the input string matches a regular expression describing octal numbers, then @code{mystrtonum()} loops through each character in the -string. It sets @code{k} to the index in @code{"01234567"} of the current -octal digit. Since the return value is one-based, the @samp{k--} -adjusts @code{k} so it can be used in computing the return value. +string. It sets @code{k} to the index in @code{"1234567"} of the current +octal digit. +The return value will either be the same number as the digit, or zero +if the character is not there, which will be true for a @samp{0}. +This is safe, since the regexp test in the @code{if} ensures that +only octal values are converted. Similar logic applies to the code that checks for and converts a hexadecimal value, which starts with @samp{0x} or @samp{0X}. @@ -20632,7 +20658,7 @@ that a condition or set of conditions is true. Before proceeding with a particular computation, you make a statement about what you believe to be the case. Such a statement is known as an @dfn{assertion}. The C language provides an @code{<assert.h>} header file -and corresponding @code{assert()} macro that the programmer can use to make +and corresponding @code{assert()} macro that a programmer can use to make assertions. 
If an assertion fails, the @code{assert()} macro arranges to print a diagnostic message describing the condition that should have been true but was not, and then it kills the program. In C, using @@ -21102,7 +21128,7 @@ function getlocaltime(time, ret, now, i) now = systime() # return date(1)-style output - ret = strftime(PROCINFO["strftime"], now) + ret = strftime("%a %b %e %H:%M:%S %Z %Y", now) # clear out target array delete time @@ -21217,6 +21243,9 @@ if (length(contents) == 0) This tests the result to see if it is empty or not. An equivalent test would be @samp{contents == ""}. +@xref{Extension Sample Readfile}, for an extension function that +also reads an entire file into memory. + @node Data File Management @section @value{DDF} Management @@ -21274,15 +21303,14 @@ Besides solving the problem in only nine(!) lines of code, it does so @c # Arnold Robbins, arnold@@skeeve.com, Public Domain @c # January 1992 -FILENAME != _oldfilename \ -@{ +FILENAME != _oldfilename @{ if (_oldfilename != "") endfile(_oldfilename) _oldfilename = FILENAME beginfile(FILENAME) @} -END @{ endfile(FILENAME) @} +END @{ endfile(FILENAME) @} @end example This file must be loaded before the user's ``main'' program, so that the @@ -21335,7 +21363,7 @@ FNR == 1 @{ beginfile(FILENAME) @} -END @{ endfile(_filename_) @} +END @{ endfile(_filename_) @} @c endfile @end example diff --git a/doc/gawktexi.in b/doc/gawktexi.in index dfc710b5..ecd6a972 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -721,12 +721,12 @@ particular records in a file and perform operations upon them. elements. * Controlling Scanning:: Controlling the order in which arrays are scanned. -* Delete:: The @code{delete} statement removes an - element from an array. * Numeric Array Subscripts:: How to use numbers as subscripts in @command{awk}. * Uninitialized Subscripts:: Using Uninitialized variables as subscripts. +* Delete:: The @code{delete} statement removes an + element from an array. 
* Multidimensional:: Emulating multidimensional arrays in @command{awk}. * Multiscanning:: Scanning multidimensional arrays. @@ -10110,7 +10110,7 @@ if (/barfly/ || /camelot/) @noindent are exactly equivalent. One rather bizarre consequence of this rule is that the following -Boolean expression is valid, but does not do what the user probably +Boolean expression is valid, but does not do what its author probably intended: @example @@ -10156,10 +10156,9 @@ Modern implementations of @command{awk}, including @command{gawk}, allow the third argument of @code{split()} to be a regexp constant, but some older implementations do not. @value{DARKCORNER} -This can lead to confusion when attempting to use regexp constants -as arguments to user-defined functions -(@pxref{User-defined}). -For example: +Because some built-in functions accept regexp constants as arguments, +it can be confusing when attempting to use regexp constants as arguments +to user-defined functions (@pxref{User-defined}). For example: @example function mysub(pat, repl, str, global) @@ -10227,8 +10226,8 @@ variable's current value. Variables are given new values with @dfn{decrement operators}. @xref{Assignment Ops}. In addition, the @code{sub()} and @code{gsub()} functions can -change a variable's value, and the @code{match()}, @code{patsplit()} -and @code{split()} functions can change the contents of their +change a variable's value, and the @code{match()}, @code{split()} +and @code{patsplit()} functions can change the contents of their array parameters. @xref{String Functions}. @cindex variables, built-in @@ -10244,7 +10243,7 @@ Variables in @command{awk} can be assigned either numeric or string values. The kind of value a variable holds can change over the life of a program. By default, variables are initialized to the empty string, which is zero if converted to a number. 
There is no need to explicitly -``initialize'' a variable in @command{awk}, +initialize a variable in @command{awk}, which is what you would do in C and in most other traditional languages. @node Assignment Options @@ -10452,7 +10451,7 @@ $ @kbd{echo 4,321 | LC_ALL=en_DK.utf-8 gawk '@{ print $1 + 1 @}'} @noindent The @code{en_DK.utf-8} locale is for English in Denmark, where the comma acts as the decimal point separator. In the normal @code{"C"} locale, @command{gawk} -treats @samp{4,321} as @samp{4}, while in the Danish locale, it's treated +treats @samp{4,321} as 4, while in the Danish locale, it's treated as the full number, 4.321. Some earlier versions of @command{gawk} fully complied with this aspect @@ -11004,7 +11003,7 @@ awk '/[=]=/' /dev/null @end example @command{gawk} does not have this problem; BWK @command{awk} -and @command{mawk} also do not (@pxref{Other Versions}). +and @command{mawk} also do not. @end sidebar @c ENDOFRANGE exas @c ENDOFRANGE opas @@ -11257,7 +11256,7 @@ attribute. @item Fields, @code{getline} input, @code{FILENAME}, @code{ARGV} elements, @code{ENVIRON} elements, and the elements of an array created by -@code{patsplit()}, @code{split()} and @code{match()} that are numeric +@code{match()}, @code{split()} and @code{patsplit()} that are numeric strings have the @var{strnum} attribute. Otherwise, they have the @var{string} attribute. Uninitialized variables also have the @var{strnum} attribute. @@ -11412,22 +11411,23 @@ Thus, the six-character input string @w{@samp{ +3.14}} receives the The following examples print @samp{1} when the comparison between the two different constants is true, @samp{0} otherwise: +@c 22.9.2014: Tested with mawk and BWK awk, got same results. 
@example -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == " +3.14" @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == " +3.14") @}'} @ii{True} @print{} 1 -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == "+3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == "+3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == "3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == "3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $0 == 3.14 @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($0 == 3.14) @}'} @ii{True} @print{} 1 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == " +3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == " +3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == "+3.14" @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == "+3.14") @}'} @ii{True} @print{} 1 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == "3.14" @}'} @ii{False} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == "3.14") @}'} @ii{False} @print{} 0 -$ @kbd{echo ' +3.14' | gawk '@{ print $1 == 3.14 @}'} @ii{True} +$ @kbd{echo ' +3.14' | awk '@{ print($1 == 3.14) @}'} @ii{True} @print{} 1 @end example @@ -11501,9 +11501,8 @@ part of the test always succeeds. Because the operators are so similar, this kind of error is very difficult to spot when scanning the source code. 
-@cindex @command{gawk}, comparison operators and -The following list of expressions illustrates the kind of comparison -@command{gawk} performs, as well as what the result of the comparison is: +The following list of expressions illustrates the kinds of comparisons +@command{awk} performs, as well as what the result of each comparison is: @table @code @item 1.5 <= 2.0 @@ -11576,7 +11575,7 @@ dynamic regexp (@pxref{Regexp Usage}; also @cindex @command{awk}, regexp constants and @cindex regexp constants -In modern implementations of @command{awk}, a constant regular +A constant regular expression in slashes by itself is also an expression. The regexp @code{/@var{regexp}/} is an abbreviation for the following comparison expression: @@ -11596,7 +11595,7 @@ where this is discussed in more detail. The POSIX standard says that string comparison is performed based on the locale's @dfn{collating order}. This is the order in which characters sort, as defined by the locale (for more discussion, -@pxref{Ranges and Locales}). This order is usually very different +@pxref{Locales}). This order is usually very different from the results obtained when doing straight character-by-character comparison.@footnote{Technically, string comparison is supposed to behave the same way as if the strings are compared with the C @@ -11676,7 +11675,7 @@ no substring @samp{foo} in the record. True if at least one of @var{boolean1} or @var{boolean2} is true. For example, the following statement prints all records in the input that contain @emph{either} @samp{edu} or -@samp{li} or both: +@samp{li}: @example if ($0 ~ /edu/ || $0 ~ /li/) print @@ -11685,6 +11684,9 @@ if ($0 ~ /edu/ || $0 ~ /li/) print The subexpression @var{boolean2} is evaluated only if @var{boolean1} is false. This can make a difference when @var{boolean2} contains expressions that have side effects. 
+(Thus, this test never really distinguishes records that contain both +@samp{edu} and @samp{li}---as soon as @samp{edu} is matched, +the full test succeeds.) @item ! @var{boolean} True if @var{boolean} is false. For example, @@ -11694,7 +11696,7 @@ variable is not defined: @example BEGIN @{ if (! ("HOME" in ENVIRON)) - print "no home!" @} + print "no home!" @} @end example (The @code{in} operator is described in @@ -12150,8 +12152,8 @@ system about the local character set and language. The ISO C standard defines a default @code{"C"} locale, which is an environment that is typical of what many C programmers are used to. -Once upon a time, the locale setting used to affect regexp matching -(@pxref{Ranges and Locales}), but this is no longer true. +Once upon a time, the locale setting used to affect regexp matching, +but this is no longer true (@pxref{Ranges and Locales}). Locales can affect record splitting. For the normal case of @samp{RS = "\n"}, the locale is largely irrelevant. For other single-character @@ -12205,7 +12207,8 @@ Locales can influence the conversions. @item @command{awk} provides the usual arithmetic operators (addition, subtraction, multiplication, division, modulus), and unary plus and minus. -It also provides comparison operators, boolean operators, and regexp +It also provides comparison operators, boolean operators, array membership +testing, and regexp matching operators. String concatenation is accomplished by placing two expressions next to each other; there is no explicit operator. The three-operand @samp{?:} operator provides an ``if-else'' test within @@ -12220,7 +12223,7 @@ In @command{awk}, a value is considered to be true if it is non-zero @emph{or} non-null. Otherwise, the value is false. @item -A value's type is set upon each assignment and may change over its +A variable's type is set upon each assignment and may change over its lifetime. The type determines how it behaves in comparisons (string or numeric). 
@@ -12300,7 +12303,7 @@ is nonzero (if a number) or non-null (if a string). (@xref{Expression Patterns}.) @item @var{begpat}, @var{endpat} -A pair of patterns separated by a comma, specifying a range of records. +A pair of patterns separated by a comma, specifying a @dfn{range} of records. The range includes both the initial record that matches @var{begpat} and the final record that matches @var{endpat}. (@xref{Ranges}.) @@ -12390,8 +12393,8 @@ $ @kbd{awk '$1 ~ /li/ @{ print $2 @}' mail-list} @cindex regexp constants, as patterns @cindex patterns, regexp constants as A regexp constant as a pattern is also a special case of an expression -pattern. The expression @code{/li/} has the value one if @samp{li} -appears in the current input record. Thus, as a pattern, @code{/li/} +pattern. The expression @samp{/li/} has the value one if @samp{li} +appears in the current input record. Thus, as a pattern, @samp{/li/} matches any record containing @samp{li}. @cindex Boolean expressions, as patterns @@ -12573,7 +12576,7 @@ input is read. For example: @example $ @kbd{awk '} > @kbd{BEGIN @{ print "Analysis of \"li\"" @}} -> @kbd{/li/ @{ ++n @}} +> @kbd{/li/ @{ ++n @}} > @kbd{END @{ print "\"li\" appears in", n, "records." @}' mail-list} @print{} Analysis of "li" @print{} "li" appears in 4 records. @@ -12653,9 +12656,10 @@ The POSIX standard specifies that @code{NF} is available in an @code{END} rule. It contains the number of fields from the last input record. Most probably due to an oversight, the standard does not say that @code{$0} is also preserved, although logically one would think that it should be. -In fact, @command{gawk} does preserve the value of @code{$0} for use in -@code{END} rules. Be aware, however, that BWK @command{awk}, and possibly -other implementations, do not. +In fact, all of BWK @command{awk}, @command{mawk}, and @command{gawk} +preserve the value of @code{$0} for use in @code{END} rules. 
Be aware, +however, that some other implementations and many older versions +of Unix @command{awk} do not. The third point follows from the first two. The meaning of @samp{print} inside a @code{BEGIN} or @code{END} rule is the same as always: @@ -12750,8 +12754,8 @@ level of the @command{awk} program. @cindex @code{next} statement, @code{BEGINFILE}/@code{ENDFILE} patterns and The @code{next} statement (@pxref{Next Statement}) is not allowed inside -either a @code{BEGINFILE} or and @code{ENDFILE} rule. The @code{nextfile} -statement (@pxref{Nextfile Statement}) is allowed only inside a +either a @code{BEGINFILE} or an @code{ENDFILE} rule. The @code{nextfile} +statement is allowed only inside a @code{BEGINFILE} rule, but not inside an @code{ENDFILE} rule. @cindex @code{getline} statement, @code{BEGINFILE}/@code{ENDFILE} patterns and @@ -12815,7 +12819,7 @@ There are two ways to get the value of the shell variable into the body of the @command{awk} program. @cindex shells, quoting -The most common method is to use shell quoting to substitute +A common method is to use shell quoting to substitute the variable's value into the program inside the script. For example, consider the following program: @@ -13072,20 +13076,21 @@ If the @var{condition} is true, it executes the statement @var{body}. is not zero and not a null string.) @end ifinfo After @var{body} has been executed, -@var{condition} is tested again, and if it is still true, @var{body} is -executed again. This process repeats until the @var{condition} is no longer -true. If the @var{condition} is initially false, the body of the loop is -never executed and @command{awk} continues with the statement following +@var{condition} is tested again, and if it is still true, @var{body} +executes again. This process repeats until the @var{condition} is no longer +true. If the @var{condition} is initially false, the body of the loop +never executes and @command{awk} continues with the statement following the loop. 
This example prints the first three fields of each record, one per line: @example -awk '@{ - i = 1 - while (i <= 3) @{ - print $i - i++ - @} +awk ' +@{ + i = 1 + while (i <= 3) @{ + print $i + i++ + @} @}' inventory-shipped @end example @@ -13119,14 +13124,14 @@ do while (@var{condition}) @end example -Even if the @var{condition} is false at the start, the @var{body} is -executed at least once (and only once, unless executing @var{body} +Even if the @var{condition} is false at the start, the @var{body} +executes at least once (and only once, unless executing @var{body} makes @var{condition} true). Contrast this with the corresponding @code{while} statement: @example while (@var{condition}) - @var{body} + @var{body} @end example @noindent @@ -13136,11 +13141,11 @@ The following is an example of a @code{do} statement: @example @{ - i = 1 - do @{ - print $0 - i++ - @} while (i <= 10) + i = 1 + do @{ + print $0 + i++ + @} while (i <= 10) @} @end example @@ -13177,9 +13182,10 @@ compares it against the desired number of iterations. For example: @example -awk '@{ - for (i = 1; i <= 3; i++) - print $i +awk ' +@{ + for (i = 1; i <= 3; i++) + print $i @}' inventory-shipped @end example @@ -13207,7 +13213,7 @@ between 1 and 100: @example for (i = 1; i <= 100; i *= 2) - print i + print i @end example If there is nothing to be done, any of the three expressions in the @@ -13527,7 +13533,7 @@ The @code{next} statement is not allowed inside @code{BEGINFILE} and @cindex functions, user-defined, @code{next}/@code{nextfile} statements and According to the POSIX standard, the behavior is undefined if the @code{next} statement is used in a @code{BEGIN} or @code{END} rule. -@command{gawk} treats it as a syntax error. Although POSIX permits it, +@command{gawk} treats it as a syntax error. Although POSIX does not disallow it, most other @command{awk} implementations don't allow the @code{next} statement inside function bodies (@pxref{User-defined}). 
Just as with any other @code{next} statement, a @code{next} statement inside a function @@ -13582,7 +13588,7 @@ opened with redirections. It is not related to the main processing that @quotation NOTE For many years, @code{nextfile} was a -@command{gawk} extension. As of September, 2012, it was accepted for +common extension. In September, 2012, it was accepted for inclusion into the POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=607, the Austin Group website}. @end quotation @@ -13591,8 +13597,8 @@ See @uref{http://austingroupbugs.net/view.php?id=607, the Austin Group website}. @cindex @code{nextfile} statement, user-defined functions and @cindex Brian Kernighan's @command{awk} @cindex @command{mawk} utility -The current version of BWK @command{awk}, and @command{mawk} (@pxref{Other -Versions}) also support @code{nextfile}. However, they don't allow the +The current version of BWK @command{awk}, and @command{mawk} +also support @code{nextfile}. However, they don't allow the @code{nextfile} statement inside function bodies (@pxref{User-defined}). @command{gawk} does; a @code{nextfile} inside a function body reads the next record and starts processing it with the first rule in the program, @@ -13624,8 +13630,8 @@ the program to stop immediately. An @code{exit} statement that is not part of a @code{BEGIN} or @code{END} rule stops the execution of any further automatic rules for the current record, skips reading any remaining input records, and executes the -@code{END} rule if there is one. -Any @code{ENDFILE} rules are also skipped; they are not executed. +@code{END} rule if there is one. @command{gawk} also skips +any @code{ENDFILE} rules; they do not execute. In such a case, if you don't want the @code{END} rule to do its job, set a variable @@ -13733,7 +13739,7 @@ respectively, should use binary I/O. A string value of @code{"rw"} or @code{"wr"} indicates that all files should use binary I/O. 
Any other string value is treated the same as @code{"rw"}, but causes @command{gawk} to generate a warning message. @code{BINMODE} is described in more -detail in @ref{PC Using}. @command{mawk} @pxref{Other Versions}), +detail in @ref{PC Using}. @command{mawk} (@pxref{Other Versions}), also supports this variable, but only using numeric values. @cindex @code{CONVFMT} variable @@ -13860,7 +13866,7 @@ printing with the @code{print} statement. It works by being passed as the first argument to the @code{sprintf()} function (@pxref{String Functions}). Its default value is @code{"%.6g"}. Earlier versions of @command{awk} -also used @code{OFMT} to specify the format for converting numbers to +used @code{OFMT} to specify the format for converting numbers to strings in general expressions; this is now done by @code{CONVFMT}. @cindex @code{sprintf()} function, @code{OFMT} variable and @@ -14012,8 +14018,8 @@ successive instances of the same @value{FN} on the command line. @cindex file names, distinguishing While you can change the value of @code{ARGIND} within your @command{awk} -program, @command{gawk} automatically sets it to a new value when the -next file is opened. +program, @command{gawk} automatically sets it to a new value when it +opens the next file. @cindex @code{ENVIRON} array @cindex environment variables, in @code{ENVIRON} array @@ -14070,10 +14076,10 @@ can give @code{FILENAME} a value. @cindex @code{FNR} variable @item @code{FNR} -The current record number in the current file. @code{FNR} is -incremented each time a new record is read -(@pxref{Records}). It is reinitialized -to zero each time a new input file is started. +The current record number in the current file. @command{awk} increments +@code{FNR} each time it reads a new record (@pxref{Records}). +@command{awk} resets @code{FNR} to zero each time it starts a new +input file. @cindex @code{NF} variable @item @code{NF} @@ -14105,7 +14111,7 @@ array causes a fatal error. 
Any attempt to assign to an element of The number of input records @command{awk} has processed since the beginning of the program's execution (@pxref{Records}). -@code{NR} is incremented each time a new record is read. +@command{awk} increments @code{NR} each time it reads a new record. @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array @@ -14185,7 +14191,7 @@ The parent process ID of the current process. @item PROCINFO["sorted_in"] If this element exists in @code{PROCINFO}, its value controls the order in which array indices will be processed by -@samp{for (@var{index} in @var{array})} loops. +@samp{for (@var{indx} in @var{array})} loops. Since this is an advanced feature, we defer the full description until later; see @ref{Scanning an Array}. @@ -14206,7 +14212,7 @@ The version of @command{gawk}. The following additional elements in the array are available to provide information about the MPFR and GMP libraries -if your version of @command{gawk} supports arbitrary precision numbers +if your version of @command{gawk} supports arbitrary precision arithmetic (@pxref{Arbitrary Precision Arithmetic}): @table @code @@ -14255,14 +14261,14 @@ The @code{PROCINFO} array has the following additional uses: @itemize @value{BULLET} @item -It may be used to cause coprocesses to communicate over pseudo-ttys -instead of through two-way pipes; this is discussed further in -@ref{Two-way I/O}. - -@item It may be used to provide a timeout when reading from any open input file, pipe, or coprocess. @xref{Read Timeout}, for more information. + +@item +It may be used to cause coprocesses to communicate over pseudo-ttys +instead of through two-way pipes; this is discussed further in +@ref{Two-way I/O}. @end itemize @cindex @code{RLENGTH} variable @@ -14504,6 +14510,12 @@ following @option{-v} are passed on to the @command{awk} program. (@xref{Getopt Function}, for an @command{awk} library function that parses command-line options.) 
+When designing your program, you should choose options that don't +conflict with @command{gawk}'s, since it will process any options +that it accepts before passing the rest of the command line on to +your program. Using @samp{#!} with the @option{-E} option may help +(@pxref{Executable Scripts}, and @pxref{Options}). + @node Pattern Action Summary @section Summary @@ -14538,7 +14550,7 @@ input and output statements, and deletion statements. The control statements in @command{awk} are @code{if}-@code{else}, @code{while}, @code{for}, and @code{do}-@code{while}. @command{gawk} adds the @code{switch} statement. There are two flavors of @code{for} -statement: one for for performing general looping, and the other iterating +statement: one for performing general looping, and the other for iterating through an array. @item @@ -14555,12 +14567,17 @@ The @code{exit} statement terminates your program. When executed from an action (or function body) it transfers control to the @code{END} statements. From an @code{END} statement body, it exits immediately. You may pass an optional numeric value to be used -at @command{awk}'s exit status. +as @command{awk}'s exit status. @item Some built-in variables provide control over @command{awk}, mainly for I/O. Other variables convey information from @command{awk} to your program. +@item +@code{ARGC} and @code{ARGV} make the command-line arguments available +to your program. Manipulating them from a @code{BEGIN} rule lets you +control how @command{awk} will process the provided @value{DF}s. + @end itemize @node Arrays @@ -14581,24 +14598,13 @@ The @value{CHAPTER} moves on to discuss @command{gawk}'s facility for sorting arrays, and ends with a brief description of @command{gawk}'s ability to support true arrays of arrays. 
-@cindex variables, names of -@cindex functions, names of -@cindex arrays, names of, and names of functions/variables -@cindex names, arrays/variables -@cindex namespace issues -@command{awk} maintains a single set -of names that may be used for naming variables, arrays, and functions -(@pxref{User-defined}). -Thus, you cannot have a variable and an array with the same name in the -same @command{awk} program. - @menu * Array Basics:: The basics of arrays. -* Delete:: The @code{delete} statement removes an element - from an array. * Numeric Array Subscripts:: How to use numbers as subscripts in @command{awk}. * Uninitialized Subscripts:: Using Uninitialized variables as subscripts. +* Delete:: The @code{delete} statement removes an element + from an array. * Multidimensional:: Emulating multidimensional arrays in @command{awk}. * Arrays of Arrays:: True multidimensional arrays. @@ -15026,14 +15032,14 @@ begin with a number: @example @c file eg/misc/arraymax.awk @{ - if ($1 > max) - max = $1 - arr[$1] = $0 + if ($1 > max) + max = $1 + arr[$1] = $0 @} END @{ - for (x = 1; x <= max; x++) - print arr[x] + for (x = 1; x <= max; x++) + print arr[x] @} @c endfile @end example @@ -15073,9 +15079,9 @@ program's @code{END} rule, as follows: @example END @{ - for (x = 1; x <= max; x++) - if (x in arr) - print arr[x] + for (x = 1; x <= max; x++) + if (x in arr) + print arr[x] @} @end example @@ -15097,7 +15103,7 @@ an array: @example for (@var{var} in @var{array}) - @var{body} + @var{body} @end example @noindent @@ -15170,7 +15176,7 @@ BEGIN @{ @} @end example -Here is what happens when run with @command{gawk}: +Here is what happens when run with @command{gawk} (and @command{mawk}): @example $ @kbd{gawk -f loopcheck.awk} @@ -15288,7 +15294,8 @@ does not affect the loop. 
For example: @example -$ @kbd{gawk 'BEGIN @{} +$ @kbd{gawk '} +> @kbd{BEGIN @{} > @kbd{ a[4] = 4} > @kbd{ a[3] = 3} > @kbd{ for (i in a)} @@ -15296,7 +15303,8 @@ $ @kbd{gawk 'BEGIN @{} > @kbd{@}'} @print{} 4 4 @print{} 3 3 -$ @kbd{gawk 'BEGIN @{} +$ @kbd{gawk '} +> @kbd{BEGIN @{} > @kbd{ PROCINFO["sorted_in"] = "@@ind_str_asc"} > @kbd{ a[4] = 4} > @kbd{ a[3] = 3} @@ -15345,118 +15353,6 @@ the @code{delete} statement. In addition, @command{gawk} provides built-in functions for sorting arrays; see @ref{Array Sorting Functions}. -@node Delete -@section The @code{delete} Statement -@cindex @code{delete} statement -@cindex deleting elements in arrays -@cindex arrays, elements, deleting -@cindex elements in arrays, deleting - -To remove an individual element of an array, use the @code{delete} -statement: - -@example -delete @var{array}[@var{index-expression}] -@end example - -Once an array element has been deleted, any value the element once -had is no longer available. It is as if the element had never -been referred to or been given a value. -The following is an example of deleting elements in an array: - -@example -for (i in frequencies) - delete frequencies[i] -@end example - -@noindent -This example removes all the elements from the array @code{frequencies}. -Once an element is deleted, a subsequent @code{for} statement to scan the array -does not report that element and the @code{in} operator to check for -the presence of that element returns zero (i.e., false): - -@example -delete foo[4] -if (4 in foo) - print "This will never be printed" -@end example - -@cindex null strings, and deleting array elements -It is important to note that deleting an element is @emph{not} the -same as assigning it a null value (the empty string, @code{""}). -For example: - -@example -foo[4] = "" -if (4 in foo) - print "This is printed, even though foo[4] is empty" -@end example - -@cindex lint checking, array elements -It is not an error to delete an element that does not exist. 
-However, if @option{--lint} is provided on the command line -(@pxref{Options}), -@command{gawk} issues a warning message when an element that -is not in the array is deleted. - -@cindex common extensions, @code{delete} to delete entire arrays -@cindex extensions, common@comma{} @code{delete} to delete entire arrays -@cindex arrays, deleting entire contents -@cindex deleting entire arrays -@cindex @code{delete} @var{array} -@cindex differences in @command{awk} and @command{gawk}, array elements, deleting -All the elements of an array may be deleted with a single statement -by leaving off the subscript in the @code{delete} statement, -as follows: - - -@example -delete @var{array} -@end example - -Using this version of the @code{delete} statement is about three times -more efficient than the equivalent loop that deletes each element one -at a time. - -@cindex Brian Kernighan's @command{awk} -@quotation NOTE -For many years, -using @code{delete} without a subscript was a @command{gawk} extension. -As of September, 2012, it was accepted for -inclusion into the POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=544, -the Austin Group website}. This form of the @code{delete} statement is also supported -by BWK @command{awk} and @command{mawk}, as well as -by a number of other implementations (@pxref{Other Versions}). -@end quotation - -@cindex portability, deleting array elements -@cindex Brennan, Michael -The following statement provides a portable but nonobvious way to clear -out an array:@footnote{Thanks to Michael Brennan for pointing this out.} - -@example -split("", array) -@end example - -@cindex @code{split()} function, array elements@comma{} deleting -The @code{split()} function -(@pxref{String Functions}) -clears out the target array first. This call asks it to split -apart the null string. Because there is no data to split out, the -function simply clears the array and then returns. 
- -@quotation CAUTION -Deleting an array does not change its type; you cannot -delete an array and then use the array's name as a scalar -(i.e., a regular variable). For example, the following does not work: - -@example -a[1] = 3 -delete a -a = 3 -@end example -@end quotation - @node Numeric Array Subscripts @section Using Numbers to Subscript Arrays @@ -15497,7 +15393,7 @@ since @code{"12.15"} is different from @code{"12.153"}. @cindex integer array indices According to the rules for conversions (@pxref{Conversion}), integer -values are always converted to strings as integers, no matter what the +values always convert to strings as integers, no matter what the value of @code{CONVFMT} may happen to be. So the usual case of the following works: @@ -15520,7 +15416,7 @@ and all refer to the same element! As with many things in @command{awk}, the majority of the time -things work as one would expect them to. But it is useful to have a precise +things work as you would expect them to. But it is useful to have a precise knowledge of the actual rules since they can sometimes have a subtle effect on your programs. @@ -15584,6 +15480,119 @@ Even though it is somewhat unusual, the null string if @option{--lint} is provided on the command line (@pxref{Options}). +@node Delete +@section The @code{delete} Statement +@cindex @code{delete} statement +@cindex deleting elements in arrays +@cindex arrays, elements, deleting +@cindex elements in arrays, deleting + +To remove an individual element of an array, use the @code{delete} +statement: + +@example +delete @var{array}[@var{index-expression}] +@end example + +Once an array element has been deleted, any value the element once +had is no longer available. It is as if the element had never +been referred to or been given a value. 
+The following is an example of deleting elements in an array: + +@example +for (i in frequencies) + delete frequencies[i] +@end example + +@noindent +This example removes all the elements from the array @code{frequencies}. +Once an element is deleted, a subsequent @code{for} statement to scan the array +does not report that element and the @code{in} operator to check for +the presence of that element returns zero (i.e., false): + +@example +delete foo[4] +if (4 in foo) + print "This will never be printed" +@end example + +@cindex null strings, and deleting array elements +It is important to note that deleting an element is @emph{not} the +same as assigning it a null value (the empty string, @code{""}). +For example: + +@example +foo[4] = "" +if (4 in foo) + print "This is printed, even though foo[4] is empty" +@end example + +@cindex lint checking, array elements +It is not an error to delete an element that does not exist. +However, if @option{--lint} is provided on the command line +(@pxref{Options}), +@command{gawk} issues a warning message when an element that +is not in the array is deleted. + +@cindex common extensions, @code{delete} to delete entire arrays +@cindex extensions, common@comma{} @code{delete} to delete entire arrays +@cindex arrays, deleting entire contents +@cindex deleting entire arrays +@cindex @code{delete} @var{array} +@cindex differences in @command{awk} and @command{gawk}, array elements, deleting +All the elements of an array may be deleted with a single statement +by leaving off the subscript in the @code{delete} statement, +as follows: + + +@example +delete @var{array} +@end example + +Using this version of the @code{delete} statement is about three times +more efficient than the equivalent loop that deletes each element one +at a time. + +This form of the @code{delete} statement is also supported +by BWK @command{awk} and @command{mawk}, as well as +by a number of other implementations. 
+ +@cindex Brian Kernighan's @command{awk} +@quotation NOTE +For many years, using @code{delete} without a subscript was a common +extension. In September, 2012, it was accepted for inclusion into the +POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=544, +the Austin Group website}. +@end quotation + +@cindex portability, deleting array elements +@cindex Brennan, Michael +The following statement provides a portable but nonobvious way to clear +out an array:@footnote{Thanks to Michael Brennan for pointing this out.} + +@example +split("", array) +@end example + +@cindex @code{split()} function, array elements@comma{} deleting +The @code{split()} function +(@pxref{String Functions}) +clears out the target array first. This call asks it to split +apart the null string. Because there is no data to split out, the +function simply clears the array and then returns. + +@quotation CAUTION +Deleting all the elements from an array does not change its type; you cannot +clear an array and then use the array's name as a scalar +(i.e., a regular variable). For example, the following does not work: + +@example +a[1] = 3 +delete a +a = 3 +@end example +@end quotation + @node Multidimensional @section Multidimensional Arrays @@ -15595,7 +15604,7 @@ on the command line (@pxref{Options}). @cindex arrays, multidimensional A multidimensional array is an array in which an element is identified by a sequence of indices instead of a single index. For example, a -two-dimensional array requires two indices. The usual way (in most +two-dimensional array requires two indices. The usual way (in many languages, including @command{awk}) to refer to an element of a two-dimensional array named @code{grid} is with @code{grid[@var{x},@var{y}]}. @@ -15770,8 +15779,9 @@ a[1][3][1, "name"] = "barney" Each subarray and the main array can be of different length. In fact, the elements of an array or its subarray do not all have to have the same type. 
This means that the main array and any of its subarrays can be -non-rectangular, or jagged in structure. One can assign a scalar value to -the index @code{4} of the main array @code{a}: +non-rectangular, or jagged in structure. You can assign a scalar value to +the index @code{4} of the main array @code{a}, even though @code{a[1]} +is itself an array and not a scalar: @example a[4] = "An element in a jagged array" @@ -15853,6 +15863,8 @@ for (i in array) @{ print array[i][j] @} @} + else + print array[i] @} @end example @@ -16120,8 +16132,9 @@ Often random integers are needed instead. Following is a user-defined function that can be used to obtain a random non-negative integer less than @var{n}: @example -function randint(n) @{ - return int(n * rand()) +function randint(n) +@{ + return int(n * rand()) @} @end example @@ -16141,8 +16154,7 @@ function roll(n) @{ return 1 + int(rand() * n) @} # Roll 3 six-sided dice and # print total number of points. @{ - printf("%d points\n", - roll(6)+roll(6)+roll(6)) + printf("%d points\n", roll(6) + roll(6) + roll(6)) @} @end example @@ -16231,7 +16243,7 @@ doing index calculations, particularly if you are used to C. In the following list, optional parameters are enclosed in square brackets@w{ ([ ]).} Several functions perform string substitution; the full discussion is provided in the description of the @code{sub()} function, which comes -towards the end since the list is presented in alphabetic order. +towards the end since the list is presented alphabetically. Those functions that are specific to @command{gawk} are marked with a pound sign (@samp{#}). They are not available in compatibility mode @@ -16275,6 +16287,7 @@ When comparing strings, @code{IGNORECASE} affects the sorting (@pxref{Array Sorting Functions}). If the @var{source} array contains subarrays as values (@pxref{Arrays of Arrays}), they will come last, after all scalar values. +Subarrays are @emph{not} recursively sorted. 
For example, if the contents of @code{a} are as follows: @@ -16411,7 +16424,10 @@ $ @kbd{awk 'BEGIN @{ print index("peanut", "an") @}'} @noindent If @var{find} is not found, @code{index()} returns zero. -It is a fatal error to use a regexp constant for @var{find}. +With BWK @command{awk} and @command{gawk}, +it is a fatal error to use a regexp constant for @var{find}. +Other implementations allow it, simply treating the regexp +constant as an expression meaning @samp{$0 ~ /regexp/}. @item @code{length(}[@var{string}]@code{)} @cindexawkfunc{length} @@ -16525,13 +16541,12 @@ For example: @example @c file eg/misc/findpat.awk @{ - if ($1 == "FIND") - regex = $2 - else @{ - where = match($0, regex) - if (where != 0) - print "Match of", regex, "found at", - where, "in", $0 + if ($1 == "FIND") + regex = $2 + else @{ + where = match($0, regex) + if (where != 0) + print "Match of", regex, "found at", where, "in", $0 @} @} @c endfile @@ -16627,7 +16642,7 @@ Any leading separator will be in @code{@var{seps}[0]}. The @code{patsplit()} function splits strings into pieces in a manner similar to the way input lines are split into fields using @code{FPAT} -(@pxref{Splitting By Content}. +(@pxref{Splitting By Content}). Before splitting the string, @code{patsplit()} deletes any previously existing elements in the arrays @var{array} and @var{seps}. @@ -16640,8 +16655,7 @@ and store the pieces in @var{array} and the separator strings in the @code{@var{array}[1]}, the second piece in @code{@var{array}[2]}, and so forth. The string value of the third argument, @var{fieldsep}, is a regexp describing where to split @var{string} (much as @code{FS} can -be a regexp describing where to split input records; -@pxref{Regexp Field Splitting}). +be a regexp describing where to split input records). If @var{fieldsep} is omitted, the value of @code{FS} is used. @code{split()} returns the number of elements created. 
@var{seps} is a @command{gawk} extension with @code{@var{seps}[@var{i}]} @@ -16936,6 +16950,26 @@ Nonalphabetic characters are left unchanged. For example, @code{toupper("MiXeD cAsE 123")} returns @code{"MIXED CASE 123"}. @end table +@sidebar Matching the Null String +@cindex matching, null strings +@cindex null strings, matching +@cindex @code{*} (asterisk), @code{*} operator, null strings@comma{} matching +@cindex asterisk (@code{*}), @code{*} operator, null strings@comma{} matching + +In @command{awk}, the @samp{*} operator can match the null string. +This is particularly important for the @code{sub()}, @code{gsub()}, +and @code{gensub()} functions. For example: + +@example +$ @kbd{echo abc | awk '@{ gsub(/m*/, "X"); print @}'} +@print{} XaXbXcX +@end example + +@noindent +Although this makes a certain amount of sense, it can be surprising. +@end sidebar + + @node Gory Details @subsubsection More About @samp{\} and @samp{&} with @code{sub()}, @code{gsub()}, and @code{gensub()} @@ -16949,7 +16983,7 @@ Nonalphabetic characters are left unchanged. For example, @cindex ampersand (@code{&}), @code{gsub()}/@code{gensub()}/@code{sub()} functions and @quotation CAUTION -This section has been known to cause headaches. +This subsubsection has been reported to cause headaches. You might want to skip it upon first reading. @end quotation @@ -17240,25 +17274,6 @@ and the special cases for @code{sub()} and @code{gsub()}, we recommend the use of @command{gawk} and @code{gensub()} when you have to do substitutions. -@sidebar Matching the Null String -@cindex matching, null strings -@cindex null strings, matching -@cindex @code{*} (asterisk), @code{*} operator, null strings@comma{} matching -@cindex asterisk (@code{*}), @code{*} operator, null strings@comma{} matching - -In @command{awk}, the @samp{*} operator can match the null string. -This is particularly important for the @code{sub()}, @code{gsub()}, -and @code{gensub()} functions. 
For example: - -@example -$ @kbd{echo abc | awk '@{ gsub(/m*/, "X"); print @}'} -@print{} XaXbXcX -@end example - -@noindent -Although this makes a certain amount of sense, it can be surprising. -@end sidebar - @node I/O Functions @subsection Input/Output Functions @cindex input/output functions @@ -17311,10 +17326,9 @@ buffers its output and the @code{fflush()} function forces @cindex extensions, common@comma{} @code{fflush()} function @cindex Brian Kernighan's @command{awk} -@code{fflush()} was added to BWK @command{awk} in -April of 1992. For two decades, it was not part of the POSIX standard. -As of December, 2012, it was accepted for inclusion into the POSIX -standard. +Brian Kernighan added @code{fflush()} to his @command{awk} in April +of 1992. For two decades, it was a common extension. In December, +2012, it was accepted for inclusion into the POSIX standard. See @uref{http://austingroupbugs.net/view.php?id=634, the Austin Group website}. POSIX standardizes @code{fflush()} as follows: If there @@ -17583,7 +17597,7 @@ is out of range, @code{mktime()} returns @minus{}1. @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array -@item @code{strftime(} [@var{format} [@code{,} @var{timestamp} [@code{,} @var{utc-flag}] ] ]@code{)} +@item @code{strftime(}[@var{format} [@code{,} @var{timestamp} [@code{,} @var{utc-flag}] ] ]@code{)} @c STARTOFRANGE strf @cindexgawkfunc{strftime} @cindex format time string @@ -17850,7 +17864,7 @@ the string. For example: @example $ date '+Today is %A, %B %d, %Y.' -@print{} Today is Monday, May 05, 2014. +@print{} Today is Monday, September 22, 2014. @end example Here is the @command{gawk} version of the @command{date} utility. @@ -18044,17 +18058,16 @@ shows that 0's come in on the left side. For @command{gawk}, this is always true, but in some languages, it's possible to have the left side fill with 1's.} @c Purposely decided to use 0's and 1's here. 2/2001. 
-If you start over -again with @samp{10111001} and shift it left by three bits, you end up -with @samp{11001000}. -@command{gawk} provides built-in functions that implement the -bitwise operations just described. They are: +If you start over again with @samp{10111001} and shift it left by three +bits, you end up with @samp{11001000}. The following list describes +@command{gawk}'s built-in functions that implement the bitwise operations. +Optional parameters are enclosed in square brackets ([ ]): @cindex @command{gawk}, bitwise operations in @table @code @cindexgawkfunc{and} @cindex bitwise AND -@item @code{and(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} +@item @code{and(}@var{v1}@code{,} @var{v2} [@code{,} @dots{}]@code{)} Return the bitwise AND of the arguments. There must be at least two. @cindexgawkfunc{compl} @@ -18069,7 +18082,7 @@ Return the value of @var{val}, shifted left by @var{count} bits. @cindexgawkfunc{or} @cindex bitwise OR -@item @code{or(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} +@item @code{or(}@var{v1}@code{,} @var{v2} [@code{,} @dots{}]@code{)} Return the bitwise OR of the arguments. There must be at least two. @cindexgawkfunc{rshift} @@ -18079,7 +18092,7 @@ Return the value of @var{val}, shifted right by @var{count} bits. @cindexgawkfunc{xor} @cindex bitwise XOR -@item @code{xor(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} +@item @code{xor(}@var{v1}@code{,} @var{v2} [@code{,} @dots{}]@code{)} Return the bitwise XOR of the arguments. There must be at least two. @end table @@ -18202,7 +18215,7 @@ results of the @code{compl()}, @code{lshift()}, and @code{rshift()} functions. @command{gawk} provides a single function that lets you distinguish an array from a scalar variable. This is necessary for writing code -that traverses every element of an array of arrays. +that traverses every element of an array of arrays (@pxref{Arrays of Arrays}). @table @code @@ -18218,12 +18231,14 @@ an array or not. 
The second is inside the body of a user-defined function (not discussed yet; @pxref{User-defined}), to test if a parameter is an array or not. -Note, however, that using @code{isarray()} at the global level to test +@quotation NOTE +Using @code{isarray()} at the global level to test variables makes no sense. Since you are the one writing the program, you are supposed to know if your variables are arrays or not. And in fact, due to the way @command{gawk} works, if you pass the name of a variable that has not been previously used to @code{isarray()}, @command{gawk} -will end up turning it into a scalar. +ends up turning it into a scalar. +@end quotation @node I18N Functions @subsection String-Translation Functions @@ -18484,7 +18499,7 @@ extra whitespace signifies the start of the local variable list): function delarray(a, i) @{ for (i in a) - delete a[i] + delete a[i] @} @end example @@ -18495,7 +18510,7 @@ Instead of having to repeat this loop everywhere that you need to clear out an array, your program can just call @code{delarray}. (This guarantees portability. The use of @samp{delete @var{array}} to delete -the contents of an entire array is a recent@footnote{Late in 2012.} +the contents of an entire array is a relatively recent@footnote{Late in 2012.} addition to the POSIX standard.) The following is an example of a recursive function. It takes a string @@ -18525,7 +18540,7 @@ $ @kbd{echo "Don't Panic!" |} @print{} !cinaP t'noD @end example -The C @code{ctime()} function takes a timestamp and returns it in a string, +The C @code{ctime()} function takes a timestamp and returns it as a string, formatted in a well-known fashion. 
The following example uses the built-in @code{strftime()} function (@pxref{Time Functions}) @@ -18540,13 +18555,19 @@ to create an @command{awk} version of @code{ctime()}: function ctime(ts, format) @{ - format = PROCINFO["strftime"] + format = "%a %b %e %H:%M:%S %Z %Y" + if (ts == 0) ts = systime() # use current time as default return strftime(format, ts) @} @c endfile @end example + +You might think that @code{ctime()} could use @code{PROCINFO["strftime"]} +for its format string. That would be a mistake, since @code{ctime()} is +supposed to return the time formatted in a standard fashion, and user-level +code could have changed @code{PROCINFO["strftime"]}. @c ENDOFRANGE fdef @node Function Caveats @@ -19195,7 +19216,7 @@ function quicksort(data, left, right, less_than, i, last) # quicksort_swap --- helper function for quicksort, should really be inline -function quicksort_swap(data, i, j, temp) +function quicksort_swap(data, i, j, temp) @{ temp = data[i] data[i] = data[j] @@ -19346,10 +19367,11 @@ functions. @item POSIX @command{awk} provides three kinds of built-in functions: numeric, -string, and I/O. @command{gawk} provides functions that work with values -representing time, do bit manipulation, sort arrays, and internationalize -and localize programs. @command{gawk} also provides several extensions to -some of standard functions, typically in the form of additional arguments. +string, and I/O. @command{gawk} provides functions that sort arrays, work +with values representing time, do bit manipulation, determine variable +type (array vs.@: scalar), and internationalize and localize programs. +@command{gawk} also provides several extensions to some of standard +functions, typically in the form of additional arguments. @item Functions accept zero or more arguments and return a value. 
The @@ -19600,8 +19622,9 @@ are very difficult to track down: function lib_func(x, y, l1, l2) @{ @dots{} - @var{use variable} some_var # some_var should be local - @dots{} # but is not by oversight + # some_var should be local but by oversight is not + @var{use variable} some_var + @dots{} @} @end example @@ -19712,7 +19735,7 @@ function mystrtonum(str, ret, n, i, k, c) # a[5] = "123.45" # a[6] = "1.e3" # a[7] = "1.32" -# a[7] = "1.32E2" +# a[8] = "1.32E2" # # for (i = 1; i in a; i++) # print a[i], strtonum(a[i]), mystrtonum(a[i]) @@ -19723,9 +19746,12 @@ function mystrtonum(str, ret, n, i, k, c) The function first looks for C-style octal numbers (base 8). If the input string matches a regular expression describing octal numbers, then @code{mystrtonum()} loops through each character in the -string. It sets @code{k} to the index in @code{"01234567"} of the current -octal digit. Since the return value is one-based, the @samp{k--} -adjusts @code{k} so it can be used in computing the return value. +string. It sets @code{k} to the index in @code{"1234567"} of the current +octal digit. +The return value will either be the same number as the digit, or zero +if the character is not there, which will be true for a @samp{0}. +This is safe, since the regexp test in the @code{if} ensures that +only octal values are converted. Similar logic applies to the code that checks for and converts a hexadecimal value, which starts with @samp{0x} or @samp{0X}. @@ -19758,7 +19784,7 @@ that a condition or set of conditions is true. Before proceeding with a particular computation, you make a statement about what you believe to be the case. Such a statement is known as an @dfn{assertion}. The C language provides an @code{<assert.h>} header file -and corresponding @code{assert()} macro that the programmer can use to make +and corresponding @code{assert()} macro that a programmer can use to make assertions. 
If an assertion fails, the @code{assert()} macro arranges to print a diagnostic message describing the condition that should have been true but was not, and then it kills the program. In C, using @@ -20228,7 +20254,7 @@ function getlocaltime(time, ret, now, i) now = systime() # return date(1)-style output - ret = strftime(PROCINFO["strftime"], now) + ret = strftime("%a %b %e %H:%M:%S %Z %Y", now) # clear out target array delete time @@ -20343,6 +20369,9 @@ if (length(contents) == 0) This tests the result to see if it is empty or not. An equivalent test would be @samp{contents == ""}. +@xref{Extension Sample Readfile}, for an extension function that +also reads an entire file into memory. + @node Data File Management @section @value{DDF} Management @@ -20400,15 +20429,14 @@ Besides solving the problem in only nine(!) lines of code, it does so @c # Arnold Robbins, arnold@@skeeve.com, Public Domain @c # January 1992 -FILENAME != _oldfilename \ -@{ +FILENAME != _oldfilename @{ if (_oldfilename != "") endfile(_oldfilename) _oldfilename = FILENAME beginfile(FILENAME) @} -END @{ endfile(FILENAME) @} +END @{ endfile(FILENAME) @} @end example This file must be loaded before the user's ``main'' program, so that the @@ -20461,7 +20489,7 @@ FNR == 1 @{ beginfile(FILENAME) @} -END @{ endfile(_filename_) @} +END @{ endfile(_filename_) @} @c endfile @end example |
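The reworded explanation of `mystrtonum()` in the hunk above says that looking up each character in `"1234567"` with `index()` yields the digit's own value, or zero for a `0`, because `index()` returns zero when the character is not found. A minimal sketch of that idea in isolation, assuming only POSIX `awk`, might look like the following. The shell function name `octal_to_decimal` is hypothetical and is not part of this commit; the regexp guard plays the same role as the `if` test the text mentions.

```shell
# Hypothetical helper (not part of the commit): convert a C-style
# octal string such as 017 to decimal, using the index() trick.
octal_to_decimal() {
    awk -v str="$1" '
    BEGIN {
        # Only convert well-formed octal strings; this guard is what
        # makes the "0 means not found" shortcut safe.
        if (str !~ /^0[0-7]*$/) {
            print "not octal"
            exit
        }
        ret = 0
        n = length(str)
        for (i = 1; i <= n; i++) {
            c = substr(str, i, 1)
            # index() is one-based: "1".."7" map to 1..7,
            # and "0" is absent from the string, so it maps to 0.
            k = index("1234567", c)
            ret = ret * 8 + k
        }
        print ret
    }'
}
```

For example, `octal_to_decimal 017` prints `15` and `octal_to_decimal 0777` prints `511`. Because the lookup string omits `0`, no `k--` adjustment is needed, which is the simplification the revised paragraph describes.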