author | Arnold D. Robbins <arnold@skeeve.com> | 2014-04-30 21:29:18 +0300 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2014-04-30 21:29:18 +0300 |
commit | 0ceab11e44cac45f8653fa79510726cc121719f4 (patch) | |
tree | 1bdf8859d0ae6263aab27bd289a3d1c3598fc8e5 | |
parent | 2535d8a18e8c0d328fe6d1d8ae015320eeec6b5d (diff) | |
Edits, into chapter 7.
-rw-r--r-- | doc/ChangeLog | 2
-rw-r--r-- | doc/gawk.info | 1082
-rw-r--r-- | doc/gawk.texi | 318
-rw-r--r-- | doc/gawktexi.in | 312
4 files changed, 919 insertions, 795 deletions
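The "Using Shell Variables" hunk in this diff rewords the advice to pass a shell variable's value to `awk` through variable assignment and match it as a dynamic regexp, instead of splicing it into the program text with shell quoting. A minimal sketch of that technique, for readers skimming the patch: the function name `count_matches` and the sample input are hypothetical, chosen only for illustration, and are not taken from the manual.

```shell
# Hypothetical helper (not from the manual): count the input lines that
# match the regexp given as the first argument.
count_matches() {
    # -v assigns the shell value to the awk variable "pat" before the
    # program runs; "$0 ~ pat" then uses it as a dynamic regexp.
    # "n + 0" forces a numeric 0 when nothing matched.
    awk -v pat="$1" '$0 ~ pat { n++ } END { print n + 0, "found" }'
}

# Sample input is made up for this sketch.
printf 'foo\nbar\nfood\n' | count_matches 'fo+'
```

Because the pattern reaches `awk` as data rather than as program text, quote characters in it cannot break the program. Escape sequences in a value assigned with `-v` are still processed, so backslashes in the pattern may need doubling.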
diff --git a/doc/ChangeLog b/doc/ChangeLog index 59d31520..dde9c9af 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,6 +1,8 @@ 2014-04-30 Arnold D. Robbins <arnold@skeeve.com> * gawktexi.in: Editing progress. Through Chapter 5. + * gawktexi.in: Editing progress. Through Chapter 6 and into + Chapter 7. 2014-04-29 Arnold D. Robbins <arnold@skeeve.com> diff --git a/doc/gawk.info b/doc/gawk.info index 57fdc3d4..8c13d181 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -6921,7 +6921,8 @@ codes. (1) The internal representation of all numbers, including integers, uses double precision floating-point numbers. On most modern systems, -these are in IEEE 754 standard format. +these are in IEEE 754 standard format. *Note Arbitrary Precision +Arithmetic::, for much more information. File: gawk.info, Node: Nondecimal-numbers, Next: Regexp Constants, Prev: Scalar Constants, Up: Constants @@ -7054,8 +7055,8 @@ the contents of the current input record. Constant regular expressions are also used as the first argument for the `gensub()', `sub()', and `gsub()' functions, as the second argument -of the `match()' function, and as the third argument of the -`patsplit()' function (*note String Functions::). Modern +of the `match()' function, and as the third argument of the `split()' +and `patsplit()' functions (*note String Functions::). Modern implementations of `awk', including `gawk', allow the third argument of `split()' to be a regexp constant, but some older implementations do not. (d.c.) This can lead to confusion when attempting to use regexp @@ -7248,29 +7249,28 @@ use when printing numbers with `print'. `CONVFMT' was introduced in order to separate the semantics of conversion from the semantics of printing. Both `CONVFMT' and `OFMT' have the same default value: `"%.6g"'. In the vast majority of cases, old `awk' programs do not -change their behavior. 
However, these semantics for `OFMT' are -something to keep in mind if you must port your new-style program to -older implementations of `awk'. We recommend that instead of changing -your programs, just port `gawk' itself. *Note Print::, for more -information on the `print' statement. - - And, once again, where you are can matter when it comes to converting -between numbers and strings. In *note Locales::, we mentioned that the -local character set and language (the locale) can affect how `gawk' -matches characters. The locale also affects numeric formats. In -particular, for `awk' programs, it affects the decimal point character. -The `"C"' locale, and most English-language locales, use the period -character (`.') as the decimal point. However, many (if not most) -European and non-English locales use the comma (`,') as the decimal -point character. +change their behavior. *Note Print::, for more information on the +`print' statement. + + Where you are can matter when it comes to converting between numbers +and strings. The local character set and language--the "locale"--can +affect numeric formats. In particular, for `awk' programs, it affects +the decimal point character and the thousands-separator character. The +`"C"' locale, and most English-language locales, use the period +character (`.') as the decimal point and don't have a thousands +separator. However, many (if not most) European and non-English +locales use the comma (`,') as the decimal point character. European +locales often use either a space or a period as the thousands +separator, if they have one. The POSIX standard says that `awk' always uses the period as the decimal point when reading the `awk' program source code, and for command-line variable assignments (*note Other Arguments::). However, when interpreting input data, for `print' and `printf' output, and for number to string conversion, the local decimal point character is used. -(d.c.) 
Here are some examples indicating the difference in behavior, -on a GNU/Linux system: +(d.c.) In all cases, numbers in source code and in input data cannot +have a thousands separator. Here are some examples indicating the +difference in behavior, on a GNU/Linux system: $ export POSIXLY_CORRECT=1 Force POSIX behavior $ gawk 'BEGIN { printf "%g\n", 3.1415927 }' @@ -7475,9 +7475,9 @@ example: print (a " " (a = "panic")) } -It is not defined whether the assignment to `a' happens before or after -the value of `a' is retrieved for producing the concatenated value. -The result could be either `don't panic', or `panic panic'. +It is not defined whether the second assignment to `a' happens before +or after the value of `a' is retrieved for producing the concatenated +value. The result could be either `don't panic', or `panic panic'. The precedence of concatenation, when mixed with other operators, is often counter-intuitive. Consider this example: @@ -7550,9 +7550,9 @@ that the assignment stores in the specified variable, field, or array element. (Such values are called "rvalues".) It is important to note that variables do _not_ have permanent types. -A variable's type is simply the type of whatever value it happens to -hold at the moment. In the following program fragment, the variable -`foo' has a numeric value at first, and a string value later on: +A variable's type is simply the type of whatever value was last assigned +to it. In the following program fragment, the variable `foo' has a +numeric value at first, and a string value later on: foo = 1 print foo @@ -7625,9 +7625,10 @@ The indices of `bar' are practically guaranteed to be different, because the `rand()' function haven't been covered yet. *Note Arrays::, and see *note Numeric Functions::, for more information). This example illustrates an important fact about assignment operators: the lefthand -expression is only evaluated _once_. 
It is up to the implementation as -to which expression is evaluated first, the lefthand or the righthand. -Consider this example: +expression is only evaluated _once_. + + It is up to the implementation as to which expression is evaluated +first, the lefthand or the righthand. Consider this example: i = 1 a[i += 2] = i + 1 @@ -7640,14 +7641,14 @@ converted to a number. Operator Effect -------------------------------------------------------------------------- -LVALUE `+=' INCREMENT Adds INCREMENT to the value of LVALUE. -LVALUE `-=' DECREMENT Subtracts DECREMENT from the value of LVALUE. -LVALUE `*=' Multiplies the value of LVALUE by COEFFICIENT. +LVALUE `+=' INCREMENT Add INCREMENT to the value of LVALUE. +LVALUE `-=' DECREMENT Subtract DECREMENT from the value of LVALUE. +LVALUE `*=' Multiply the value of LVALUE by COEFFICIENT. COEFFICIENT -LVALUE `/=' DIVISOR Divides the value of LVALUE by DIVISOR. -LVALUE `%=' MODULUS Sets LVALUE to its remainder by MODULUS. +LVALUE `/=' DIVISOR Divide the value of LVALUE by DIVISOR. +LVALUE `%=' MODULUS Set LVALUE to its remainder by MODULUS. LVALUE `^=' POWER -LVALUE `**=' POWER Raises LVALUE to the power POWER. (c.e.) +LVALUE `**=' POWER Raise LVALUE to the power POWER. (c.e.) Table 6.2: Arithmetic Assignment Operators @@ -7670,8 +7671,8 @@ A workaround is: awk '/[=]=/' /dev/null - `gawk' does not have this problem, nor do the other freely available -versions described in *note Other Versions::. + `gawk' does not have this problem; Brian Kernighan's `awk' and +`mawk' also do not (*note Other Versions::). File: gawk.info, Node: Increment Ops, Prev: Assignment Ops, Up: All Operators @@ -7686,13 +7687,13 @@ they are convenient abbreviations for very common operations. The operator used for adding one is written `++'. It can be used to increment a variable either before or after taking its value. To -pre-increment a variable `v', write `++v'. This adds one to the value -of `v'--that new value is also the value of the expression. 
(The +"pre-increment" a variable `v', write `++v'. This adds one to the +value of `v'--that new value is also the value of the expression. (The assignment expression `v += 1' is completely equivalent.) Writing the -`++' after the variable specifies post-increment. This increments the -variable value just the same; the difference is that the value of the -increment expression itself is the variable's _old_ value. Thus, if -`foo' has the value four, then the expression `foo++' has the value +`++' after the variable specifies "post-increment". This increments +the variable value just the same; the difference is that the value of +the increment expression itself is the variable's _old_ value. Thus, +if `foo' has the value four, then the expression `foo++' has the value four, but it changes the value of `foo' to five. In other words, the operator returns the old value of the variable, but with the side effect of incrementing it. @@ -7839,10 +7840,12 @@ The 1992 POSIX standard introduced the concept of a "numeric string", which is simply a string that looks like a number--for example, `" +2"'. This concept is used for determining the type of a variable. The type of the variable is important because the types of two variables -determine how they are compared. The various versions of the POSIX -standard did not get the rules quite right for several editions. -Fortunately, as of at least the 2008 standard (and possibly earlier), -the standard has been fixed, and variable typing follows these rules:(1) +determine how they are compared. + + The various versions of the POSIX standard did not get the rules +quite right for several editions. Fortunately, as of at least the 2008 +standard (and possibly earlier), the standard has been fixed, and +variable typing follows these rules:(1) * A numeric constant or the result of a numeric operation has the NUMERIC attribute. @@ -7899,10 +7902,9 @@ comparison is performed. 
characters, and so is first and foremost of STRING type; input strings that look numeric are additionally given the STRNUM attribute. Thus, the six-character input string ` +3.14' receives the STRNUM attribute. -In contrast, the eight-character literal `" +3.14"' appearing in -program text is a string constant. The following examples print `1' -when the comparison between the two different constants is true, `0' -otherwise: +In contrast, the eight characters `" +3.14"' appearing in program text +comprise a string constant. The following examples print `1' when the +comparison between the two different constants is true, `0' otherwise: $ echo ' +3.14' | gawk '{ print $0 == " +3.14" }' True -| 1 @@ -8045,9 +8047,10 @@ File: gawk.info, Node: POSIX String Comparison, Prev: Comparison Operators, U .......................................... The POSIX standard says that string comparison is performed based on -the locale's collating order. This is usually very different from the -results obtained when doing straight character-by-character -comparison.(1) +the locale's "collating order". This is the order in which characters +sort, as defined by the locale (for more discussion, *note Ranges and +Locales::). This order is usually very different from the results +obtained when doing straight character-by-character comparison.(1) Because this behavior differs considerably from existing practice, `gawk' only implements it when in POSIX mode (*note Options::). Here @@ -8199,7 +8202,7 @@ not. *Note Arrays::, for more information about arrays. continued simply by putting a newline after either character. However, putting a newline in front of either character does not work without using backslash continuation (*note Statements/Lines::). If `--posix' -is specified (*note Options::), then this extension is disabled. +is specified (*note Options::), this extension is disabled. 
File: gawk.info, Node: Function Calls, Next: Precedence, Prev: Truth Values and Conditions, Up: Expressions @@ -8216,6 +8219,8 @@ available in every `awk' program. The `sqrt()' function is one of these. *Note Built-in::, for a list of built-in functions and their descriptions. In addition, you can define functions for use in your program. *Note User-defined::, for instructions on how to do this. +Finally, `gawk' lets you write functions in C or C++ that may be called +from your program: see *note Dynamic Extensions::. The way to use a function is with a "function call" expression, which consists of the function name followed immediately by a list of @@ -8255,11 +8260,12 @@ which is a way to choose the function to call at runtime, instead of when you write the source code to your program. We defer discussion of this feature until later; see *note Indirect Calls::. - Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of `sqrt(ARGUMENT)' is the square root of ARGUMENT. -The following program reads numbers, one number per line, and prints the -square root of each one: + Like every other expression, the function call has a value, often +called the "return value", which is computed by the function based on +the arguments you give it. In this example, the return value of +`sqrt(ARGUMENT)' is the square root of ARGUMENT. The following program +reads numbers, one number per line, and prints the square root of each +one: $ awk '{ print "The square root of", $1, "is", sqrt($1) }' 1 @@ -8472,10 +8478,10 @@ summary of the types of `awk' patterns: A single expression. It matches when its value is nonzero (if a number) or non-null (if a string). (*Note Expression Patterns::.) -`PAT1, PAT2' +`BEGPAT, ENDPAT' A pair of patterns separated by a comma, specifying a range of records. 
The range includes both the initial record that matches - PAT1 and the final record that matches PAT2. (*Note Ranges::.) + BEGPAT and the final record that matches ENDPAT. (*Note Ranges::.) `BEGIN' `END' @@ -8485,7 +8491,7 @@ summary of the types of `awk' patterns: `BEGINFILE' `ENDFILE' Special patterns for you to supply startup or cleanup actions to be - done on a per file basis. (*Note BEGINFILE/ENDFILE::.) + done on a per-file basis. (*Note BEGINFILE/ENDFILE::.) `EMPTY' The empty pattern matches every input record. (*Note Empty::.) @@ -8605,7 +8611,7 @@ record. When a record matches BEGPAT, the range pattern is "turned on" and the range pattern matches this record as well. As long as the range pattern stays turned on, it automatically matches every input record read. The range pattern also matches ENDPAT against every input -record; when this succeeds, the range pattern is turned off again for +record; when this succeeds, the range pattern is "turned off" again for the following record. Then the range pattern goes back to checking BEGPAT against each record. @@ -8737,10 +8743,10 @@ File: gawk.info, Node: I/O And BEGIN/END, Prev: Using BEGIN/END, Up: BEGIN/EN 7.1.4.2 Input/Output from `BEGIN' and `END' Rules ................................................. -There are several (sometimes subtle) points to remember when doing I/O -from a `BEGIN' or `END' rule. The first has to do with the value of -`$0' in a `BEGIN' rule. Because `BEGIN' rules are executed before any -input is read, there simply is no input record, and therefore no +There are several (sometimes subtle) points to be aware of when doing +I/O from a `BEGIN' or `END' rule. The first has to do with the value +of `$0' in a `BEGIN' rule. Because `BEGIN' rules are executed before +any input is read, there simply is no input record, and therefore no fields, when executing `BEGIN' rules. References to `$0' and the fields yield a null string or zero, depending upon the context. 
One way to give `$0' a real value is to execute a `getline' command without a @@ -8808,10 +8814,10 @@ tasks that would otherwise be difficult or impossible to perform: entirely. Otherwise, `gawk' exits with the usual fatal error. * If you have written extensions that modify the record handling (by - inserting an "input parser"), you can invoke them at this point, - before `gawk' has started processing the file. (This is a _very_ - advanced feature, currently used only by the `gawkextlib' project - (http://gawkextlib.sourceforge.net).) + inserting an "input parser," *note Input Parsers::), you can invoke + them at this point, before `gawk' has started processing the file. + (This is a _very_ advanced feature, currently used only by the + `gawkextlib' project (http://gawkextlib.sourceforge.net).) The `ENDFILE' rule is called when `gawk' has finished processing the last record in an input file. For the last input file, it will be @@ -8863,15 +8869,15 @@ to get the value of the shell variable into the body of the `awk' program. The most common method is to use shell quoting to substitute the -variable's value into the program inside the script. For example, in -the following program: +variable's value into the program inside the script. For example, +consider the following program: printf "Enter search pattern: " read pattern awk "/$pattern/ "'{ nmatches++ } END { print nmatches, "found" }' /path/to/data -the `awk' program consists of two pieces of quoted text that are +The `awk' program consists of two pieces of quoted text that are concatenated together to form the program. The first part is double-quoted, which allows substitution of the `pattern' shell variable inside the quotes. The second part is single-quoted. @@ -8883,7 +8889,7 @@ quotes when reading the program. A better method is to use `awk''s variable assignment feature (*note Assignment Options::) to assign the shell variable's value to an `awk' -variable's value. 
Then use dynamic regexps to match the pattern (*note +variable. Then use dynamic regexps to match the pattern (*note Computed Regexps::). The following shows how to redo the previous example using this technique: @@ -8945,9 +8951,9 @@ Control statements well as a few special ones (*note Statements::). Compound statements - Consist of one or more statements enclosed in curly braces. A - compound statement is used in order to put several statements - together in the body of an `if', `while', `do', or `for' statement. + Enclose one or more statements in curly braces. A compound + statement is used in order to put several statements together in + the body of an `if', `while', `do', or `for' statement. Input statements Use the `getline' command (*note Getline::). Also supplied in @@ -9210,7 +9216,8 @@ File: gawk.info, Node: Switch Statement, Next: Break Statement, Prev: For Sta 7.4.5 The `switch' Statement ---------------------------- -This minor node describes a `gawk'-specific feature. +This minor node describes a `gawk'-specific feature. If `gawk' is in +compatibility mode (*note Options::), it is not available. The `switch' statement allows the evaluation of an expression and the execution of statements based on a `case' match. Case statements @@ -9261,9 +9268,6 @@ is executed and then falls through into the `default' section, executing its `print' statement. In turn, the -1 case will also be executed since the `default' does not halt execution. - This `switch' statement is a `gawk' extension. If `gawk' is in -compatibility mode (*note Options::), it is not available. 
- File: gawk.info, Node: Break Statement, Next: Continue Statement, Prev: Switch Statement, Up: Statements @@ -9276,15 +9280,15 @@ divisor of any integer, and also identifies prime numbers: # find smallest divisor of num { - num = $1 - for (div = 2; div * div <= num; div++) { - if (num % div == 0) - break - } - if (num % div == 0) - printf "Smallest divisor of %d is %d\n", num, div - else - printf "%d is prime\n", num + num = $1 + for (div = 2; div * div <= num; div++) { + if (num % div == 0) + break + } + if (num % div == 0) + printf "Smallest divisor of %d is %d\n", num, div + else + printf "%d is prime\n", num } When the remainder is zero in the first `if' statement, `awk' @@ -9299,17 +9303,17 @@ Statement::.) # find smallest divisor of num { - num = $1 - for (div = 2; ; div++) { - if (num % div == 0) { - printf "Smallest divisor of %d is %d\n", num, div - break - } - if (div * div > num) { - printf "%d is prime\n", num - break + num = $1 + for (div = 2; ; div++) { + if (num % div == 0) { + printf "Smallest divisor of %d is %d\n", num, div + break + } + if (div * div > num) { + printf "%d is prime\n", num + break + } } - } } The `break' statement is also used to break out of the `switch' @@ -9420,7 +9424,7 @@ rules. *Note BEGINFILE/ENDFILE::. According to the POSIX standard, the behavior is undefined if the `next' statement is used in a `BEGIN' or `END' rule. `gawk' treats it -as a syntax error. Although POSIX permits it, some other `awk' +as a syntax error. Although POSIX permits it, most other `awk' implementations don't allow the `next' statement inside function bodies (*note User-defined::). Just as with any other `next' statement, a `next' statement inside a function body reads the next record and @@ -9524,12 +9528,12 @@ with a nonzero status. 
An `awk' program can do this using an `exit' statement with a nonzero argument, as shown in the following example: BEGIN { - if (("date" | getline date_now) <= 0) { - print "Can't get system date" > "/dev/stderr" - exit 1 - } - print "current date is", date_now - close("date") + if (("date" | getline date_now) <= 0) { + print "Can't get system date" > "/dev/stderr" + exit 1 + } + print "current date is", date_now + close("date") } NOTE: For full portability, exit values should be between zero and @@ -27797,7 +27801,7 @@ Item Limit -------------------------------------------------------------------------- Characters in a character 2^(number of bits per byte) class -Length of input record `MAX_INT ' +Length of input record `MAX_INT' Length of output record Unlimited Length of source line Unlimited Number of fields in a record `MAX_LONG' @@ -27810,9 +27814,9 @@ Number of pipe redirections min(number of processes per user, number of open files) Numeric values Double-precision floating point (if not using MPFR) -Size of a field `MAX_INT ' -Size of a literal string `MAX_INT ' -Size of a printf string `MAX_INT ' +Size of a field `MAX_INT' +Size of a literal string `MAX_INT' +Size of a printf string `MAX_INT' File: gawk.info, Node: Extension Design, Next: Old Extension Mechanism, Prev: Implementation Limitations, Up: Notes @@ -30129,7 +30133,7 @@ Index * $ (dollar sign), regexp operator: Regexp Operators. (line 35) * % (percent sign), % operator: Precedence. (line 55) * % (percent sign), %= operator <1>: Precedence. (line 95) -* % (percent sign), %= operator: Assignment Ops. (line 129) +* % (percent sign), %= operator: Assignment Ops. (line 130) * & (ampersand), && operator <1>: Precedence. (line 86) * & (ampersand), && operator: Boolean Ops. (line 57) * & (ampersand), gsub()/gensub()/sub() functions and: Gory Details. @@ -30150,9 +30154,9 @@ Index * * (asterisk), ** operator <1>: Precedence. (line 49) * * (asterisk), ** operator: Arithmetic Ops. 
(line 81) * * (asterisk), **= operator <1>: Precedence. (line 95) -* * (asterisk), **= operator: Assignment Ops. (line 129) +* * (asterisk), **= operator: Assignment Ops. (line 130) * * (asterisk), *= operator <1>: Precedence. (line 95) -* * (asterisk), *= operator: Assignment Ops. (line 129) +* * (asterisk), *= operator: Assignment Ops. (line 130) * + (plus sign), + operator: Precedence. (line 52) * + (plus sign), ++ operator <1>: Precedence. (line 46) * + (plus sign), ++ operator: Increment Ops. (line 11) @@ -30164,7 +30168,7 @@ Index * - (hyphen), -- operator <1>: Precedence. (line 46) * - (hyphen), -- operator: Increment Ops. (line 48) * - (hyphen), -= operator <1>: Precedence. (line 95) -* - (hyphen), -= operator: Assignment Ops. (line 129) +* - (hyphen), -= operator: Assignment Ops. (line 130) * - (hyphen), filenames beginning with: Options. (line 59) * - (hyphen), in bracket expressions: Bracket Expressions. (line 17) * --assign option: Options. (line 32) @@ -30260,11 +30264,11 @@ Index * / (forward slash) to enclose regular expressions: Regexp. (line 10) * / (forward slash), / operator: Precedence. (line 55) * / (forward slash), /= operator <1>: Precedence. (line 95) -* / (forward slash), /= operator: Assignment Ops. (line 129) +* / (forward slash), /= operator: Assignment Ops. (line 130) * / (forward slash), /= operator, vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * / (forward slash), patterns and: Expression Patterns. (line 24) -* /= operator vs. /=.../ regexp constant: Assignment Ops. (line 147) +* /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148) * /dev/... special files: Special FD. (line 46) * /dev/fd/N special files (gawk): Special FD. (line 46) * /inet/... special files (gawk): TCP/IP Networking. (line 6) @@ -30354,7 +30358,7 @@ Index * \ (backslash), regexp operator: Regexp Operators. (line 18) * ^ (caret), ^ operator: Precedence. (line 49) * ^ (caret), ^= operator <1>: Precedence. 
(line 95) -* ^ (caret), ^= operator: Assignment Ops. (line 129) +* ^ (caret), ^= operator: Assignment Ops. (line 130) * ^ (caret), in bracket expressions: Bracket Expressions. (line 17) * ^ (caret), in FS: Regexp Field Splitting. (line 59) @@ -30399,7 +30403,7 @@ Index * amazing awk assembler (aaa): Glossary. (line 12) * amazingly workable formatter (awf): Glossary. (line 25) * ambiguity, syntactic: /= operator vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * ampersand (&), && operator <1>: Precedence. (line 86) * ampersand (&), && operator: Boolean Ops. (line 57) * ampersand (&), gsub()/gensub()/sub() functions and: Gory Details. @@ -30431,7 +30435,7 @@ Index * arguments, command-line <2>: Auto-set. (line 11) * arguments, command-line: Other Arguments. (line 6) * arguments, command-line, invoking awk: Command Line. (line 6) -* arguments, in function calls: Function Calls. (line 16) +* arguments, in function calls: Function Calls. (line 18) * arguments, processing: Getopt Function. (line 6) * ARGV array, indexing into: Other Arguments. (line 12) * arithmetic operators: Arithmetic Ops. (line 6) @@ -30510,14 +30514,14 @@ Index * asterisk (*), ** operator <1>: Precedence. (line 49) * asterisk (*), ** operator: Arithmetic Ops. (line 81) * asterisk (*), **= operator <1>: Precedence. (line 95) -* asterisk (*), **= operator: Assignment Ops. (line 129) +* asterisk (*), **= operator: Assignment Ops. (line 130) * asterisk (*), *= operator <1>: Precedence. (line 95) -* asterisk (*), *= operator: Assignment Ops. (line 129) +* asterisk (*), *= operator: Assignment Ops. (line 130) * atan2: Numeric Functions. (line 11) * automatic displays, in debugger: Debugger Info. (line 24) * awf (amazingly workable formatter) program: Glossary. (line 25) * awk debugging, enabling: Options. (line 108) -* awk language, POSIX version: Assignment Ops. (line 136) +* awk language, POSIX version: Assignment Ops. (line 137) * awk profiling, enabling: Options. 
(line 242) * awk programs <1>: Two Rules. (line 6) * awk programs <2>: Executable Scripts. (line 6) @@ -30763,7 +30767,7 @@ Index * call stack, display in debugger: Execution Stack. (line 13) * caret (^), ^ operator: Precedence. (line 49) * caret (^), ^= operator <1>: Precedence. (line 95) -* caret (^), ^= operator: Assignment Ops. (line 129) +* caret (^), ^= operator: Assignment Ops. (line 130) * caret (^), in bracket expressions: Bracket Expressions. (line 17) * caret (^), regexp operator <1>: GNU Regexp Operators. (line 59) @@ -30844,7 +30848,7 @@ Index * commenting: Comments. (line 6) * commenting, backslash continuation and: Statements/Lines. (line 76) * common extensions, ** operator: Arithmetic Ops. (line 30) -* common extensions, **= operator: Assignment Ops. (line 136) +* common extensions, **= operator: Assignment Ops. (line 137) * common extensions, /dev/stderr special file: Special FD. (line 46) * common extensions, /dev/stdin special file: Special FD. (line 46) * common extensions, /dev/stdout special file: Special FD. (line 46) @@ -30948,7 +30952,7 @@ Index * dark corner: Conventions. (line 38) * dark corner, "0" is actually true: Truth Values. (line 24) * dark corner, /= operator vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * dark corner, ^, in FS: Regexp Field Splitting. (line 59) * dark corner, array subscripts: Uninitialized Subscripts. @@ -30974,14 +30978,14 @@ Index * dark corner, input files: awk split records. (line 110) * dark corner, invoking awk: Command Line. (line 16) * dark corner, length() function: String Functions. (line 180) -* dark corner, locale's decimal point character: Conversion. (line 77) +* dark corner, locale's decimal point character: Conversion. (line 75) * dark corner, multiline records: Multiple Line. (line 35) * dark corner, NF variable, decrementing: Changing Fields. (line 107) * dark corner, OFMT variable: OFMT. (line 27) * dark corner, regexp constants: Using Constant Regexps. 
(line 6) * dark corner, regexp constants, /= operator and: Assignment Ops. - (line 147) + (line 148) * dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps. (line 43) * dark corner, split() function: String Functions. (line 359) @@ -31365,7 +31369,7 @@ Index * extensions, Brian Kernighan's awk <1>: Common Extensions. (line 6) * extensions, Brian Kernighan's awk: BTL. (line 6) * extensions, common, ** operator: Arithmetic Ops. (line 30) -* extensions, common, **= operator: Assignment Ops. (line 136) +* extensions, common, **= operator: Assignment Ops. (line 137) * extensions, common, /dev/stderr special file: Special FD. (line 46) * extensions, common, /dev/stdin special file: Special FD. (line 46) * extensions, common, /dev/stdout special file: Special FD. (line 46) @@ -31523,9 +31527,9 @@ Index * forward slash (/) to enclose regular expressions: Regexp. (line 10) * forward slash (/), / operator: Precedence. (line 55) * forward slash (/), /= operator <1>: Precedence. (line 95) -* forward slash (/), /= operator: Assignment Ops. (line 129) +* forward slash (/), /= operator: Assignment Ops. (line 130) * forward slash (/), /= operator, vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * forward slash (/), patterns and: Expression Patterns. (line 24) * FPAT variable <1>: User-modified. (line 45) * FPAT variable: Splitting By Content. @@ -31803,7 +31807,7 @@ Index * hyphen (-), -- operator <1>: Precedence. (line 46) * hyphen (-), -- operator: Increment Ops. (line 48) * hyphen (-), -= operator <1>: Precedence. (line 95) -* hyphen (-), -= operator: Assignment Ops. (line 129) +* hyphen (-), -= operator: Assignment Ops. (line 130) * hyphen (-), filenames beginning with: Options. (line 59) * hyphen (-), in bracket expressions: Bracket Expressions. (line 17) * i debugger command (alias for info): Debugger Info. (line 13) @@ -32285,7 +32289,7 @@ Index * PC operating systems, gawk on, installing: PC Installation. 
(line 6) * percent sign (%), % operator: Precedence. (line 55) * percent sign (%), %= operator <1>: Precedence. (line 95) -* percent sign (%), %= operator: Assignment Ops. (line 129) +* percent sign (%), %= operator: Assignment Ops. (line 130) * period (.), regexp operator: Regexp Operators. (line 44) * Perl: Future Extensions. (line 6) * Peters, Arno: Contributors. (line 85) @@ -32308,7 +32312,7 @@ Index * portability: Escape Sequences. (line 94) * portability, #! (executable scripts): Executable Scripts. (line 33) * portability, ** operator and: Arithmetic Ops. (line 81) -* portability, **= operator and: Assignment Ops. (line 142) +* portability, **= operator and: Assignment Ops. (line 143) * portability, ARGV variable: Executable Scripts. (line 42) * portability, backslash continuation and: Statements/Lines. (line 30) * portability, backslash in escape sequences: Escape Sequences. @@ -32345,10 +32349,10 @@ Index * positional specifiers, printf statement, mixing with regular formats: Printf Ordering. (line 57) * positive zero: Unexpected Results. (line 34) -* POSIX awk <1>: Assignment Ops. (line 136) +* POSIX awk <1>: Assignment Ops. (line 137) * POSIX awk: This Manual. (line 14) * POSIX awk, ** operator and: Precedence. (line 98) -* POSIX awk, **= operator and: Assignment Ops. (line 142) +* POSIX awk, **= operator and: Assignment Ops. (line 143) * POSIX awk, < operator and: Getline/File. (line 26) * POSIX awk, arithmetic operators and: Arithmetic Ops. (line 30) * POSIX awk, backslashes in string constants: Escape Sequences. @@ -32538,7 +32542,7 @@ Index (line 102) * regexp constants <2>: Regexp Constants. (line 6) * regexp constants: Regexp Usage. (line 57) -* regexp constants, /=.../, /= operator and: Assignment Ops. (line 147) +* regexp constants, /=.../, /= operator and: Assignment Ops. (line 148) * regexp constants, as patterns: Expression Patterns. (line 34) * regexp constants, in gawk: Using Constant Regexps. 
(line 28) @@ -32735,7 +32739,7 @@ Index * side effects, conditional expressions: Conditional Exp. (line 22) * side effects, decrement/increment operators: Increment Ops. (line 11) * side effects, FILENAME variable: Getline Notes. (line 19) -* side effects, function calls: Function Calls. (line 54) +* side effects, function calls: Function Calls. (line 56) * side effects, statements: Action Overview. (line 32) * sidebar, A Constant's Base Does Not Affect Its Value: Nondecimal-numbers. (line 64) @@ -32761,7 +32765,7 @@ Index * sidebar, So Why Does gawk have BEGINFILE and ENDFILE?: Filetrans Function. (line 83) * sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops. - (line 145) + (line 146) * sidebar, Understanding $0: Changing Fields. (line 134) * sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps. (line 57) @@ -32909,7 +32913,7 @@ Index * switch statement: Switch Statement. (line 6) * SYMTAB array: Auto-set. (line 274) * syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * system: I/O Functions. (line 72) * systime: Time Functions. (line 66) * t debugger command (alias for tbreak): Breakpoint Control. (line 90) @@ -32981,7 +32985,7 @@ Index * troubleshooting, fatal errors, printf format strings: Format Modifiers. (line 159) * troubleshooting, fflush() function: I/O Functions. (line 60) -* troubleshooting, function call syntax: Function Calls. (line 28) +* troubleshooting, function call syntax: Function Calls. (line 30) * troubleshooting, gawk: Compatibility Mode. (line 6) * troubleshooting, gawk, bug reports: Bugs. (line 9) * troubleshooting, gawk, fatal errors, function arguments: Calling Built-in. 
@@ -33309,400 +33313,400 @@ Node: Values299487 Node: Constants300163 Node: Scalar Constants300843 Ref: Scalar Constants-Footnote-1301702 -Node: Nondecimal-numbers301884 -Node: Regexp Constants304884 -Node: Using Constant Regexps305359 -Node: Variables308414 -Node: Using Variables309069 -Node: Assignment Options310793 -Node: Conversion312668 -Ref: table-locale-affects318168 -Ref: Conversion-Footnote-1318792 -Node: All Operators318901 -Node: Arithmetic Ops319531 -Node: Concatenation322036 -Ref: Concatenation-Footnote-1324824 -Node: Assignment Ops324944 -Ref: table-assign-ops329932 -Node: Increment Ops331263 -Node: Truth Values and Conditions334697 -Node: Truth Values335780 -Node: Typing and Comparison336829 -Node: Variable Typing337622 -Ref: Variable Typing-Footnote-1341519 -Node: Comparison Operators341641 -Ref: table-relational-ops342051 -Node: POSIX String Comparison345599 -Ref: POSIX String Comparison-Footnote-1346555 -Node: Boolean Ops346693 -Ref: Boolean Ops-Footnote-1350763 -Node: Conditional Exp350854 -Node: Function Calls352586 -Node: Precedence356180 -Node: Locales359849 -Node: Patterns and Actions360938 -Node: Pattern Overview361992 -Node: Regexp Patterns363661 -Node: Expression Patterns364204 -Node: Ranges367985 -Node: BEGIN/END371089 -Node: Using BEGIN/END371851 -Ref: Using BEGIN/END-Footnote-1374587 -Node: I/O And BEGIN/END374693 -Node: BEGINFILE/ENDFILE376975 -Node: Empty379889 -Node: Using Shell Variables380206 -Node: Action Overview382491 -Node: Statements384848 -Node: If Statement386702 -Node: While Statement388201 -Node: Do Statement390245 -Node: For Statement391401 -Node: Switch Statement394553 -Node: Break Statement396707 -Node: Continue Statement398697 -Node: Next Statement400490 -Node: Nextfile Statement402880 -Node: Exit Statement405535 -Node: Built-in Variables407951 -Node: User-modified409046 -Ref: User-modified-Footnote-1417404 -Node: Auto-set417466 -Ref: Auto-set-Footnote-1430533 -Ref: Auto-set-Footnote-2430738 -Node: ARGC and ARGV430794 
-Node: Arrays434648 -Node: Array Basics436153 -Node: Array Intro436979 -Node: Reference to Elements441296 -Node: Assigning Elements443566 -Node: Array Example444057 -Node: Scanning an Array445789 -Node: Controlling Scanning448103 -Ref: Controlling Scanning-Footnote-1453190 -Node: Delete453506 -Ref: Delete-Footnote-1456271 -Node: Numeric Array Subscripts456328 -Node: Uninitialized Subscripts458511 -Node: Multidimensional460138 -Node: Multiscanning463231 -Node: Arrays of Arrays464820 -Node: Functions469460 -Node: Built-in470279 -Node: Calling Built-in471357 -Node: Numeric Functions473345 -Ref: Numeric Functions-Footnote-1477179 -Ref: Numeric Functions-Footnote-2477536 -Ref: Numeric Functions-Footnote-3477584 -Node: String Functions477853 -Ref: String Functions-Footnote-1500856 -Ref: String Functions-Footnote-2500985 -Ref: String Functions-Footnote-3501233 -Node: Gory Details501320 -Ref: table-sub-escapes502999 -Ref: table-sub-posix-92504353 -Ref: table-sub-proposed505704 -Ref: table-posix-sub507058 -Ref: table-gensub-escapes508603 -Ref: Gory Details-Footnote-1509779 -Ref: Gory Details-Footnote-2509830 -Node: I/O Functions509981 -Ref: I/O Functions-Footnote-1516977 -Node: Time Functions517124 -Ref: Time Functions-Footnote-1528117 -Ref: Time Functions-Footnote-2528185 -Ref: Time Functions-Footnote-3528343 -Ref: Time Functions-Footnote-4528454 -Ref: Time Functions-Footnote-5528566 -Ref: Time Functions-Footnote-6528793 -Node: Bitwise Functions529059 -Ref: table-bitwise-ops529621 -Ref: Bitwise Functions-Footnote-1533866 -Node: Type Functions534050 -Node: I18N Functions535201 -Node: User-defined536853 -Node: Definition Syntax537657 -Ref: Definition Syntax-Footnote-1542571 -Node: Function Example542640 -Ref: Function Example-Footnote-1545289 -Node: Function Caveats545311 -Node: Calling A Function545829 -Node: Variable Scope546784 -Node: Pass By Value/Reference549747 -Node: Return Statement553255 -Node: Dynamic Typing556236 -Node: Indirect Calls557167 -Node: Library 
Functions566854 -Ref: Library Functions-Footnote-1570367 -Ref: Library Functions-Footnote-2570510 -Node: Library Names570681 -Ref: Library Names-Footnote-1574154 -Ref: Library Names-Footnote-2574374 -Node: General Functions574460 -Node: Strtonum Function575488 -Node: Assert Function578418 -Node: Round Function581744 -Node: Cliff Random Function583285 -Node: Ordinal Functions584301 -Ref: Ordinal Functions-Footnote-1587378 -Ref: Ordinal Functions-Footnote-2587630 -Node: Join Function587841 -Ref: Join Function-Footnote-1589612 -Node: Getlocaltime Function589812 -Node: Readfile Function593553 -Node: Data File Management595392 -Node: Filetrans Function596024 -Node: Rewind Function600093 -Node: File Checking601480 -Node: Empty Files602574 -Node: Ignoring Assigns604804 -Node: Getopt Function606358 -Ref: Getopt Function-Footnote-1617661 -Node: Passwd Functions617864 -Ref: Passwd Functions-Footnote-1626842 -Node: Group Functions626930 -Node: Walking Arrays635014 -Node: Sample Programs637150 -Node: Running Examples637824 -Node: Clones638552 -Node: Cut Program639776 -Node: Egrep Program649627 -Ref: Egrep Program-Footnote-1657400 -Node: Id Program657510 -Node: Split Program661159 -Ref: Split Program-Footnote-1664678 -Node: Tee Program664806 -Node: Uniq Program667609 -Node: Wc Program675038 -Ref: Wc Program-Footnote-1679304 -Ref: Wc Program-Footnote-2679504 -Node: Miscellaneous Programs679596 -Node: Dupword Program680784 -Node: Alarm Program682815 -Node: Translate Program687622 -Ref: Translate Program-Footnote-1692009 -Ref: Translate Program-Footnote-2692257 -Node: Labels Program692391 -Ref: Labels Program-Footnote-1695762 -Node: Word Sorting695846 -Node: History Sorting699730 -Node: Extract Program701569 -Ref: Extract Program-Footnote-1709072 -Node: Simple Sed709200 -Node: Igawk Program712262 -Ref: Igawk Program-Footnote-1727433 -Ref: Igawk Program-Footnote-2727634 -Node: Anagram Program727772 -Node: Signature Program730840 -Node: Advanced Features731940 -Node: Nondecimal 
Data733826 -Node: Array Sorting735409 -Node: Controlling Array Traversal736106 -Node: Array Sorting Functions744390 -Ref: Array Sorting Functions-Footnote-1748259 -Node: Two-way I/O748453 -Ref: Two-way I/O-Footnote-1753885 -Node: TCP/IP Networking753967 -Node: Profiling756811 -Node: Internationalization764314 -Node: I18N and L10N765739 -Node: Explaining gettext766425 -Ref: Explaining gettext-Footnote-1771493 -Ref: Explaining gettext-Footnote-2771677 -Node: Programmer i18n771842 -Node: Translator i18n776069 -Node: String Extraction776863 -Ref: String Extraction-Footnote-1777824 -Node: Printf Ordering777910 -Ref: Printf Ordering-Footnote-1780692 -Node: I18N Portability780756 -Ref: I18N Portability-Footnote-1783205 -Node: I18N Example783268 -Ref: I18N Example-Footnote-1785906 -Node: Gawk I18N785978 -Node: Debugger786599 -Node: Debugging787570 -Node: Debugging Concepts788003 -Node: Debugging Terms789859 -Node: Awk Debugging792456 -Node: Sample Debugging Session793348 -Node: Debugger Invocation793868 -Node: Finding The Bug795201 -Node: List of Debugger Commands801688 -Node: Breakpoint Control803022 -Node: Debugger Execution Control806686 -Node: Viewing And Changing Data810046 -Node: Execution Stack813402 -Node: Debugger Info814869 -Node: Miscellaneous Debugger Commands818863 -Node: Readline Support824041 -Node: Limitations824872 -Node: Arbitrary Precision Arithmetic827124 -Ref: Arbitrary Precision Arithmetic-Footnote-1828773 -Node: General Arithmetic828921 -Node: Floating Point Issues830641 -Node: String Conversion Precision831522 -Ref: String Conversion Precision-Footnote-1833227 -Node: Unexpected Results833336 -Node: POSIX Floating Point Problems835489 -Ref: POSIX Floating Point Problems-Footnote-1839314 -Node: Integer Programming839352 -Node: Floating-point Programming841091 -Ref: Floating-point Programming-Footnote-1847422 -Ref: Floating-point Programming-Footnote-2847692 -Node: Floating-point Representation847956 -Node: Floating-point Context849121 -Ref: 
table-ieee-formats849960 -Node: Rounding Mode851344 -Ref: table-rounding-modes851823 -Ref: Rounding Mode-Footnote-1854838 -Node: Gawk and MPFR855017 -Node: Arbitrary Precision Floats856426 -Ref: Arbitrary Precision Floats-Footnote-1858869 -Node: Setting Precision859185 -Ref: table-predefined-precision-strings859871 -Node: Setting Rounding Mode862016 -Ref: table-gawk-rounding-modes862420 -Node: Floating-point Constants863607 -Node: Changing Precision865036 -Ref: Changing Precision-Footnote-1866433 -Node: Exact Arithmetic866607 -Node: Arbitrary Precision Integers869745 -Ref: Arbitrary Precision Integers-Footnote-1872760 -Node: Dynamic Extensions872907 -Node: Extension Intro874365 -Node: Plugin License875630 -Node: Extension Mechanism Outline876315 -Ref: load-extension876732 -Ref: load-new-function878210 -Ref: call-new-function879205 -Node: Extension API Description881220 -Node: Extension API Functions Introduction882507 -Node: General Data Types887434 -Ref: General Data Types-Footnote-1893129 -Node: Requesting Values893428 -Ref: table-value-types-returned894165 -Node: Memory Allocation Functions895119 -Ref: Memory Allocation Functions-Footnote-1897865 -Node: Constructor Functions897961 -Node: Registration Functions899719 -Node: Extension Functions900404 -Node: Exit Callback Functions902706 -Node: Extension Version String903955 -Node: Input Parsers904605 -Node: Output Wrappers914362 -Node: Two-way processors918872 -Node: Printing Messages921080 -Ref: Printing Messages-Footnote-1922157 -Node: Updating `ERRNO'922309 -Node: Accessing Parameters923048 -Node: Symbol Table Access924278 -Node: Symbol table by name924792 -Node: Symbol table by cookie926768 -Ref: Symbol table by cookie-Footnote-1930900 -Node: Cached values930963 -Ref: Cached values-Footnote-1934453 -Node: Array Manipulation934544 -Ref: Array Manipulation-Footnote-1935642 -Node: Array Data Types935681 -Ref: Array Data Types-Footnote-1938384 -Node: Array Functions938476 -Node: Flattening Arrays942312 -Node: 
Creating Arrays949164 -Node: Extension API Variables953889 -Node: Extension Versioning954525 -Node: Extension API Informational Variables956426 -Node: Extension API Boilerplate957512 -Node: Finding Extensions961316 -Node: Extension Example961876 -Node: Internal File Description962606 -Node: Internal File Ops966697 -Ref: Internal File Ops-Footnote-1978206 -Node: Using Internal File Ops978346 -Ref: Using Internal File Ops-Footnote-1980693 -Node: Extension Samples980959 -Node: Extension Sample File Functions982483 -Node: Extension Sample Fnmatch990970 -Node: Extension Sample Fork992739 -Node: Extension Sample Inplace993952 -Node: Extension Sample Ord995730 -Node: Extension Sample Readdir996566 -Node: Extension Sample Revout998098 -Node: Extension Sample Rev2way998691 -Node: Extension Sample Read write array999381 -Node: Extension Sample Readfile1001264 -Node: Extension Sample API Tests1002364 -Node: Extension Sample Time1002889 -Node: gawkextlib1004253 -Node: Language History1007034 -Node: V7/SVR3.11008627 -Node: SVR41010947 -Node: POSIX1012389 -Node: BTL1013775 -Node: POSIX/GNU1014509 -Node: Feature History1020108 -Node: Common Extensions1033084 -Node: Ranges and Locales1034396 -Ref: Ranges and Locales-Footnote-11039013 -Ref: Ranges and Locales-Footnote-21039040 -Ref: Ranges and Locales-Footnote-31039274 -Node: Contributors1039495 -Node: Installation1044876 -Node: Gawk Distribution1045770 -Node: Getting1046254 -Node: Extracting1047080 -Node: Distribution contents1048772 -Node: Unix Installation1054493 -Node: Quick Installation1055110 -Node: Additional Configuration Options1057556 -Node: Configuration Philosophy1059292 -Node: Non-Unix Installation1061646 -Node: PC Installation1062104 -Node: PC Binary Installation1063403 -Node: PC Compiling1065251 -Node: PC Testing1068195 -Node: PC Using1069371 -Node: Cygwin1073539 -Node: MSYS1074348 -Node: VMS Installation1074862 -Node: VMS Compilation1075658 -Ref: VMS Compilation-Footnote-11076910 -Node: VMS Dynamic Extensions1076968 
-Node: VMS Installation Details1078341 -Node: VMS Running1080592 -Node: VMS GNV1083426 -Node: VMS Old Gawk1084149 -Node: Bugs1084619 -Node: Other Versions1088537 -Node: Notes1094621 -Node: Compatibility Mode1095421 -Node: Additions1096204 -Node: Accessing The Source1097131 -Node: Adding Code1098571 -Node: New Ports1104616 -Node: Derived Files1108751 -Ref: Derived Files-Footnote-11114072 -Ref: Derived Files-Footnote-21114106 -Ref: Derived Files-Footnote-31114706 -Node: Future Extensions1114804 -Node: Implementation Limitations1115387 -Node: Extension Design1116639 -Node: Old Extension Problems1117793 -Ref: Old Extension Problems-Footnote-11119301 -Node: Extension New Mechanism Goals1119358 -Ref: Extension New Mechanism Goals-Footnote-11122723 -Node: Extension Other Design Decisions1122909 -Node: Extension Future Growth1125015 -Node: Old Extension Mechanism1125851 -Node: Basic Concepts1127591 -Node: Basic High Level1128272 -Ref: figure-general-flow1128544 -Ref: figure-process-flow1129143 -Ref: Basic High Level-Footnote-11132372 -Node: Basic Data Typing1132557 -Node: Glossary1135912 -Node: Copying1161143 -Node: GNU Free Documentation License1198699 -Node: Index1223835 +Node: Nondecimal-numbers301952 +Node: Regexp Constants304952 +Node: Using Constant Regexps305427 +Node: Variables308497 +Node: Using Variables309152 +Node: Assignment Options310876 +Node: Conversion312751 +Ref: table-locale-affects318187 +Ref: Conversion-Footnote-1318811 +Node: All Operators318920 +Node: Arithmetic Ops319550 +Node: Concatenation322055 +Ref: Concatenation-Footnote-1324851 +Node: Assignment Ops324971 +Ref: table-assign-ops329954 +Node: Increment Ops331271 +Node: Truth Values and Conditions334709 +Node: Truth Values335792 +Node: Typing and Comparison336841 +Node: Variable Typing337634 +Ref: Variable Typing-Footnote-1341534 +Node: Comparison Operators341656 +Ref: table-relational-ops342066 +Node: POSIX String Comparison345614 +Ref: POSIX String Comparison-Footnote-1346698 +Node: Boolean 
Ops346836 +Ref: Boolean Ops-Footnote-1350906 +Node: Conditional Exp350997 +Node: Function Calls352724 +Node: Precedence356482 +Node: Locales360151 +Node: Patterns and Actions361240 +Node: Pattern Overview362294 +Node: Regexp Patterns363971 +Node: Expression Patterns364514 +Node: Ranges368295 +Node: BEGIN/END371401 +Node: Using BEGIN/END372163 +Ref: Using BEGIN/END-Footnote-1374899 +Node: I/O And BEGIN/END375005 +Node: BEGINFILE/ENDFILE377290 +Node: Empty380226 +Node: Using Shell Variables380543 +Node: Action Overview382826 +Node: Statements385171 +Node: If Statement387025 +Node: While Statement388524 +Node: Do Statement390568 +Node: For Statement391724 +Node: Switch Statement394876 +Node: Break Statement396979 +Node: Continue Statement399034 +Node: Next Statement400827 +Node: Nextfile Statement403217 +Node: Exit Statement405872 +Node: Built-in Variables408274 +Node: User-modified409369 +Ref: User-modified-Footnote-1417727 +Node: Auto-set417789 +Ref: Auto-set-Footnote-1430856 +Ref: Auto-set-Footnote-2431061 +Node: ARGC and ARGV431117 +Node: Arrays434971 +Node: Array Basics436476 +Node: Array Intro437302 +Node: Reference to Elements441619 +Node: Assigning Elements443889 +Node: Array Example444380 +Node: Scanning an Array446112 +Node: Controlling Scanning448426 +Ref: Controlling Scanning-Footnote-1453513 +Node: Delete453829 +Ref: Delete-Footnote-1456594 +Node: Numeric Array Subscripts456651 +Node: Uninitialized Subscripts458834 +Node: Multidimensional460461 +Node: Multiscanning463554 +Node: Arrays of Arrays465143 +Node: Functions469783 +Node: Built-in470602 +Node: Calling Built-in471680 +Node: Numeric Functions473668 +Ref: Numeric Functions-Footnote-1477502 +Ref: Numeric Functions-Footnote-2477859 +Ref: Numeric Functions-Footnote-3477907 +Node: String Functions478176 +Ref: String Functions-Footnote-1501179 +Ref: String Functions-Footnote-2501308 +Ref: String Functions-Footnote-3501556 +Node: Gory Details501643 +Ref: table-sub-escapes503322 +Ref: 
table-sub-posix-92504676 +Ref: table-sub-proposed506027 +Ref: table-posix-sub507381 +Ref: table-gensub-escapes508926 +Ref: Gory Details-Footnote-1510102 +Ref: Gory Details-Footnote-2510153 +Node: I/O Functions510304 +Ref: I/O Functions-Footnote-1517300 +Node: Time Functions517447 +Ref: Time Functions-Footnote-1528440 +Ref: Time Functions-Footnote-2528508 +Ref: Time Functions-Footnote-3528666 +Ref: Time Functions-Footnote-4528777 +Ref: Time Functions-Footnote-5528889 +Ref: Time Functions-Footnote-6529116 +Node: Bitwise Functions529382 +Ref: table-bitwise-ops529944 +Ref: Bitwise Functions-Footnote-1534189 +Node: Type Functions534373 +Node: I18N Functions535524 +Node: User-defined537176 +Node: Definition Syntax537980 +Ref: Definition Syntax-Footnote-1542894 +Node: Function Example542963 +Ref: Function Example-Footnote-1545612 +Node: Function Caveats545634 +Node: Calling A Function546152 +Node: Variable Scope547107 +Node: Pass By Value/Reference550070 +Node: Return Statement553578 +Node: Dynamic Typing556559 +Node: Indirect Calls557490 +Node: Library Functions567177 +Ref: Library Functions-Footnote-1570690 +Ref: Library Functions-Footnote-2570833 +Node: Library Names571004 +Ref: Library Names-Footnote-1574477 +Ref: Library Names-Footnote-2574697 +Node: General Functions574783 +Node: Strtonum Function575811 +Node: Assert Function578741 +Node: Round Function582067 +Node: Cliff Random Function583608 +Node: Ordinal Functions584624 +Ref: Ordinal Functions-Footnote-1587701 +Ref: Ordinal Functions-Footnote-2587953 +Node: Join Function588164 +Ref: Join Function-Footnote-1589935 +Node: Getlocaltime Function590135 +Node: Readfile Function593876 +Node: Data File Management595715 +Node: Filetrans Function596347 +Node: Rewind Function600416 +Node: File Checking601803 +Node: Empty Files602897 +Node: Ignoring Assigns605127 +Node: Getopt Function606681 +Ref: Getopt Function-Footnote-1617984 +Node: Passwd Functions618187 +Ref: Passwd Functions-Footnote-1627165 +Node: Group 
Functions627253 +Node: Walking Arrays635337 +Node: Sample Programs637473 +Node: Running Examples638147 +Node: Clones638875 +Node: Cut Program640099 +Node: Egrep Program649950 +Ref: Egrep Program-Footnote-1657723 +Node: Id Program657833 +Node: Split Program661482 +Ref: Split Program-Footnote-1665001 +Node: Tee Program665129 +Node: Uniq Program667932 +Node: Wc Program675361 +Ref: Wc Program-Footnote-1679627 +Ref: Wc Program-Footnote-2679827 +Node: Miscellaneous Programs679919 +Node: Dupword Program681107 +Node: Alarm Program683138 +Node: Translate Program687945 +Ref: Translate Program-Footnote-1692332 +Ref: Translate Program-Footnote-2692580 +Node: Labels Program692714 +Ref: Labels Program-Footnote-1696085 +Node: Word Sorting696169 +Node: History Sorting700053 +Node: Extract Program701892 +Ref: Extract Program-Footnote-1709395 +Node: Simple Sed709523 +Node: Igawk Program712585 +Ref: Igawk Program-Footnote-1727756 +Ref: Igawk Program-Footnote-2727957 +Node: Anagram Program728095 +Node: Signature Program731163 +Node: Advanced Features732263 +Node: Nondecimal Data734149 +Node: Array Sorting735732 +Node: Controlling Array Traversal736429 +Node: Array Sorting Functions744713 +Ref: Array Sorting Functions-Footnote-1748582 +Node: Two-way I/O748776 +Ref: Two-way I/O-Footnote-1754208 +Node: TCP/IP Networking754290 +Node: Profiling757134 +Node: Internationalization764637 +Node: I18N and L10N766062 +Node: Explaining gettext766748 +Ref: Explaining gettext-Footnote-1771816 +Ref: Explaining gettext-Footnote-2772000 +Node: Programmer i18n772165 +Node: Translator i18n776392 +Node: String Extraction777186 +Ref: String Extraction-Footnote-1778147 +Node: Printf Ordering778233 +Ref: Printf Ordering-Footnote-1781015 +Node: I18N Portability781079 +Ref: I18N Portability-Footnote-1783528 +Node: I18N Example783591 +Ref: I18N Example-Footnote-1786229 +Node: Gawk I18N786301 +Node: Debugger786922 +Node: Debugging787893 +Node: Debugging Concepts788326 +Node: Debugging Terms790182 +Node: Awk 
Debugging792779 +Node: Sample Debugging Session793671 +Node: Debugger Invocation794191 +Node: Finding The Bug795524 +Node: List of Debugger Commands802011 +Node: Breakpoint Control803345 +Node: Debugger Execution Control807009 +Node: Viewing And Changing Data810369 +Node: Execution Stack813725 +Node: Debugger Info815192 +Node: Miscellaneous Debugger Commands819186 +Node: Readline Support824364 +Node: Limitations825195 +Node: Arbitrary Precision Arithmetic827447 +Ref: Arbitrary Precision Arithmetic-Footnote-1829096 +Node: General Arithmetic829244 +Node: Floating Point Issues830964 +Node: String Conversion Precision831845 +Ref: String Conversion Precision-Footnote-1833550 +Node: Unexpected Results833659 +Node: POSIX Floating Point Problems835812 +Ref: POSIX Floating Point Problems-Footnote-1839637 +Node: Integer Programming839675 +Node: Floating-point Programming841414 +Ref: Floating-point Programming-Footnote-1847745 +Ref: Floating-point Programming-Footnote-2848015 +Node: Floating-point Representation848279 +Node: Floating-point Context849444 +Ref: table-ieee-formats850283 +Node: Rounding Mode851667 +Ref: table-rounding-modes852146 +Ref: Rounding Mode-Footnote-1855161 +Node: Gawk and MPFR855340 +Node: Arbitrary Precision Floats856749 +Ref: Arbitrary Precision Floats-Footnote-1859192 +Node: Setting Precision859508 +Ref: table-predefined-precision-strings860194 +Node: Setting Rounding Mode862339 +Ref: table-gawk-rounding-modes862743 +Node: Floating-point Constants863930 +Node: Changing Precision865359 +Ref: Changing Precision-Footnote-1866756 +Node: Exact Arithmetic866930 +Node: Arbitrary Precision Integers870068 +Ref: Arbitrary Precision Integers-Footnote-1873083 +Node: Dynamic Extensions873230 +Node: Extension Intro874688 +Node: Plugin License875953 +Node: Extension Mechanism Outline876638 +Ref: load-extension877055 +Ref: load-new-function878533 +Ref: call-new-function879528 +Node: Extension API Description881543 +Node: Extension API Functions Introduction882830 
+Node: General Data Types887757 +Ref: General Data Types-Footnote-1893452 +Node: Requesting Values893751 +Ref: table-value-types-returned894488 +Node: Memory Allocation Functions895442 +Ref: Memory Allocation Functions-Footnote-1898188 +Node: Constructor Functions898284 +Node: Registration Functions900042 +Node: Extension Functions900727 +Node: Exit Callback Functions903029 +Node: Extension Version String904278 +Node: Input Parsers904928 +Node: Output Wrappers914685 +Node: Two-way processors919195 +Node: Printing Messages921403 +Ref: Printing Messages-Footnote-1922480 +Node: Updating `ERRNO'922632 +Node: Accessing Parameters923371 +Node: Symbol Table Access924601 +Node: Symbol table by name925115 +Node: Symbol table by cookie927091 +Ref: Symbol table by cookie-Footnote-1931223 +Node: Cached values931286 +Ref: Cached values-Footnote-1934776 +Node: Array Manipulation934867 +Ref: Array Manipulation-Footnote-1935965 +Node: Array Data Types936004 +Ref: Array Data Types-Footnote-1938707 +Node: Array Functions938799 +Node: Flattening Arrays942635 +Node: Creating Arrays949487 +Node: Extension API Variables954212 +Node: Extension Versioning954848 +Node: Extension API Informational Variables956749 +Node: Extension API Boilerplate957835 +Node: Finding Extensions961639 +Node: Extension Example962199 +Node: Internal File Description962929 +Node: Internal File Ops967020 +Ref: Internal File Ops-Footnote-1978529 +Node: Using Internal File Ops978669 +Ref: Using Internal File Ops-Footnote-1981016 +Node: Extension Samples981282 +Node: Extension Sample File Functions982806 +Node: Extension Sample Fnmatch991293 +Node: Extension Sample Fork993062 +Node: Extension Sample Inplace994275 +Node: Extension Sample Ord996053 +Node: Extension Sample Readdir996889 +Node: Extension Sample Revout998421 +Node: Extension Sample Rev2way999014 +Node: Extension Sample Read write array999704 +Node: Extension Sample Readfile1001587 +Node: Extension Sample API Tests1002687 +Node: Extension Sample 
Time1003212 +Node: gawkextlib1004576 +Node: Language History1007357 +Node: V7/SVR3.11008950 +Node: SVR41011270 +Node: POSIX1012712 +Node: BTL1014098 +Node: POSIX/GNU1014832 +Node: Feature History1020431 +Node: Common Extensions1033407 +Node: Ranges and Locales1034719 +Ref: Ranges and Locales-Footnote-11039336 +Ref: Ranges and Locales-Footnote-21039363 +Ref: Ranges and Locales-Footnote-31039597 +Node: Contributors1039818 +Node: Installation1045199 +Node: Gawk Distribution1046093 +Node: Getting1046577 +Node: Extracting1047403 +Node: Distribution contents1049095 +Node: Unix Installation1054816 +Node: Quick Installation1055433 +Node: Additional Configuration Options1057879 +Node: Configuration Philosophy1059615 +Node: Non-Unix Installation1061969 +Node: PC Installation1062427 +Node: PC Binary Installation1063726 +Node: PC Compiling1065574 +Node: PC Testing1068518 +Node: PC Using1069694 +Node: Cygwin1073862 +Node: MSYS1074671 +Node: VMS Installation1075185 +Node: VMS Compilation1075981 +Ref: VMS Compilation-Footnote-11077233 +Node: VMS Dynamic Extensions1077291 +Node: VMS Installation Details1078664 +Node: VMS Running1080915 +Node: VMS GNV1083749 +Node: VMS Old Gawk1084472 +Node: Bugs1084942 +Node: Other Versions1088860 +Node: Notes1094944 +Node: Compatibility Mode1095744 +Node: Additions1096527 +Node: Accessing The Source1097454 +Node: Adding Code1098894 +Node: New Ports1104939 +Node: Derived Files1109074 +Ref: Derived Files-Footnote-11114395 +Ref: Derived Files-Footnote-21114429 +Ref: Derived Files-Footnote-31115029 +Node: Future Extensions1115127 +Node: Implementation Limitations1115710 +Node: Extension Design1116958 +Node: Old Extension Problems1118112 +Ref: Old Extension Problems-Footnote-11119620 +Node: Extension New Mechanism Goals1119677 +Ref: Extension New Mechanism Goals-Footnote-11123042 +Node: Extension Other Design Decisions1123228 +Node: Extension Future Growth1125334 +Node: Old Extension Mechanism1126170 +Node: Basic Concepts1127910 +Node: Basic High 
Level1128591 +Ref: figure-general-flow1128863 +Ref: figure-process-flow1129462 +Ref: Basic High Level-Footnote-11132691 +Node: Basic Data Typing1132876 +Node: Glossary1136231 +Node: Copying1161462 +Node: GNU Free Documentation License1199018 +Node: Index1224154 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 24cd006b..f721b5f4 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -9874,9 +9874,9 @@ have different forms, but are stored identically internally. A @dfn{numeric constant} stands for a number. This number can be an integer, a decimal fraction, or a number in scientific (exponential) notation.@footnote{The internal representation of all numbers, -including integers, uses double precision -floating-point numbers. -On most modern systems, these are in IEEE 754 standard format.} +including integers, uses double precision floating-point numbers. +On most modern systems, these are in IEEE 754 standard format. +@xref{Arbitrary Precision Arithmetic}, for much more information.} Here are some examples of numeric constants that all have the same value: @@ -10118,7 +10118,7 @@ upon the contents of the current input record. Constant regular expressions are also used as the first argument for the @code{gensub()}, @code{sub()}, and @code{gsub()} functions, as the second argument of the @code{match()} function, -and as the third argument of the @code{patsplit()} function +and as the third argument of the @code{split()} and @code{patsplit()} functions (@pxref{String Functions}). Modern implementations of @command{awk}, including @command{gawk}, allow the third argument of @code{split()} to be a regexp constant, but some @@ -10360,32 +10360,28 @@ specifies the output format to use when printing numbers with @code{print}. conversion from the semantics of printing. Both @code{CONVFMT} and @code{OFMT} have the same default value: @code{"%.6g"}. In the vast majority of cases, old @command{awk} programs do not change their behavior. 
-However, these semantics for @code{OFMT} are something to keep in mind if you must
-port your new-style program to older implementations of @command{awk}.
-We recommend
-that instead of changing your programs, just port @command{gawk} itself.
-@xref{Print},
-for more information on the @code{print} statement.
-
-And, once again, where you are can matter when it comes to converting
-between numbers and strings. In @ref{Locales}, we mentioned that
-the local character set and language (the locale) can affect how
-@command{gawk} matches characters. The locale also affects numeric
-formats. In particular, for @command{awk} programs, it affects the
-decimal point character. The @code{"C"} locale, and most English-language
-locales, use the period character (@samp{.}) as the decimal point.
-However, many (if not most) European and non-English locales use the comma
-(@samp{,}) as the decimal point character.
+@xref{Print}, for more information on the @code{print} statement.
+
+Where you are can matter when it comes to converting between numbers and
+strings. The local character set and language---the @dfn{locale}---can
+affect numeric formats. In particular, for @command{awk} programs,
+it affects the decimal point character and the thousands-separator
+character. The @code{"C"} locale, and most English-language locales,
+use the period character (@samp{.}) as the decimal point and don't
+have a thousands separator. However, many (if not most) European and
+non-English locales use the comma (@samp{,}) as the decimal point
+character. European locales often use either a space or a period as
+the thousands separator, if they have one.
 
 @cindex dark corner, locale's decimal point character
 The POSIX standard says that @command{awk} always uses the period as the decimal
-point when reading the @command{awk} program source code, and for command-line
-variable assignments (@pxref{Other Arguments}).
-However, when interpreting input data, for @code{print} and @code{printf} output,
-and for number to string conversion, the local decimal point character is used.
-@value{DARKCORNER}
-Here are some examples indicating the difference in behavior,
-on a GNU/Linux system:
+point when reading the @command{awk} program source code, and for
+command-line variable assignments (@pxref{Other Arguments}). However,
+when interpreting input data, for @code{print} and @code{printf} output,
+and for number to string conversion, the local decimal point character
+is used. @value{DARKCORNER} In all cases, numbers in source code and
+in input data cannot have a thousands separator. Here are some examples
+indicating the difference in behavior, on a GNU/Linux system:
 
 @example
 $ @kbd{export POSIXLY_CORRECT=1} @ii{Force POSIX behavior}
@@ -10400,7 +10396,7 @@ $ @kbd{echo 4,321 | LC_ALL=en_DK.utf-8 gawk '@{ print $1 + 1 @}'}
 @end example
 
 @noindent
-The @samp{en_DK.utf-8} locale is for English in Denmark, where the comma acts as
+The @code{en_DK.utf-8} locale is for English in Denmark, where the comma acts as
 the decimal point separator. In the normal @code{"C"} locale, @command{gawk}
 treats @samp{4,321} as @samp{4}, while in the Danish locale, it's treated
 as the full number, 4.321.
@@ -10547,7 +10543,7 @@ b * int(a / b) + (a % b) == a
 @end example
 
 One possibly undesirable effect of this definition of remainder is that
-@code{@var{x} % @var{y}} is negative if @var{x} is negative. Thus:
+@samp{@var{x} % @var{y}} is negative if @var{x} is negative. Thus:
 
 @example
 -17 % 8 = -1
@@ -10641,7 +10637,7 @@ BEGIN @{
 @end example
 
 @noindent
-It is not defined whether the assignment to @code{a} happens
+It is not defined whether the second assignment to @code{a} happens
 before or after the value of @code{a} is retrieved for producing the
 concatenated value. The result could be either @samp{don't panic},
 or @samp{panic panic}.
@@ -10763,8 +10759,8 @@ element. (Such values are called @dfn{rvalues}.)
@cindex variables, types of It is important to note that variables do @emph{not} have permanent types. -A variable's type is simply the type of whatever value it happens -to hold at the moment. In the following program fragment, the variable +A variable's type is simply the type of whatever value was last assigned +to it. In the following program fragment, the variable @code{foo} has a numeric value at first, and a string value later on: @example @@ -10865,6 +10861,7 @@ The indices of @code{bar} are practically guaranteed to be different, because and see @ref{Numeric Functions}, for more information). This example illustrates an important fact about assignment operators: the lefthand expression is only evaluated @emph{once}. + It is up to the implementation as to which expression is evaluated first, the lefthand or the righthand. Consider this example: @@ -10897,17 +10894,17 @@ to a number. @caption{Arithmetic Assignment Operators} @multitable @columnfractions .30 .70 @headitem Operator @tab Effect -@item @var{lvalue} @code{+=} @var{increment} @tab Adds @var{increment} to the value of @var{lvalue}. -@item @var{lvalue} @code{-=} @var{decrement} @tab Subtracts @var{decrement} from the value of @var{lvalue}. -@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiplies the value of @var{lvalue} by @var{coefficient}. -@item @var{lvalue} @code{/=} @var{divisor} @tab Divides the value of @var{lvalue} by @var{divisor}. -@item @var{lvalue} @code{%=} @var{modulus} @tab Sets @var{lvalue} to its remainder by @var{modulus}. +@item @var{lvalue} @code{+=} @var{increment} @tab Add @var{increment} to the value of @var{lvalue}. +@item @var{lvalue} @code{-=} @var{decrement} @tab Subtract @var{decrement} from the value of @var{lvalue}. +@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiply the value of @var{lvalue} by @var{coefficient}. +@item @var{lvalue} @code{/=} @var{divisor} @tab Divide the value of @var{lvalue} by @var{divisor}. 
+@item @var{lvalue} @code{%=} @var{modulus} @tab Set @var{lvalue} to its remainder by @var{modulus}. @cindex common extensions, @code{**=} operator @cindex extensions, common@comma{} @code{**=} operator @cindex @command{awk} language, POSIX version @cindex POSIX @command{awk} @item @var{lvalue} @code{^=} @var{power} @tab -@item @var{lvalue} @code{**=} @var{power} @tab Raises @var{lvalue} to the power @var{power}. @value{COMMONEXT} +@item @var{lvalue} @code{**=} @var{power} @tab Raise @var{lvalue} to the power @var{power}. @value{COMMONEXT} @end multitable @end float @@ -10957,10 +10954,8 @@ A workaround is: awk '/[=]=/' /dev/null @end example -@command{gawk} does not have this problem, -nor do the other -freely available versions described in -@ref{Other Versions}. +@command{gawk} does not have this problem; Brian Kernighan's @command{awk} +and @command{mawk} also do not (@pxref{Other Versions}). @docbook </sidebar> @@ -11005,10 +11000,8 @@ A workaround is: awk '/[=]=/' /dev/null @end example -@command{gawk} does not have this problem, -nor do the other -freely available versions described in -@ref{Other Versions}. +@command{gawk} does not have this problem; Brian Kernighan's @command{awk} +and @command{mawk} also do not (@pxref{Other Versions}). @end cartouche @end ifnotdocbook @c ENDOFRANGE exas @@ -11033,11 +11026,10 @@ are convenient abbreviations for very common operations. @cindex side effects, decrement/increment operators The operator used for adding one is written @samp{++}. It can be used to increment a variable either before or after taking its value. -To pre-increment a variable @code{v}, write @samp{++v}. This adds +To @dfn{pre-increment} a variable @code{v}, write @samp{++v}. This adds one to the value of @code{v}---that new value is also the value of the -expression. (The assignment expression @samp{v += 1} is completely -equivalent.) -Writing the @samp{++} after the variable specifies post-increment. This +expression. 
(The assignment expression @samp{v += 1} is completely equivalent.) +Writing the @samp{++} after the variable specifies @dfn{post-increment}. This increments the variable value just the same; the difference is that the value of the increment expression itself is the variable's @emph{old} value. Thus, if @code{foo} has the value four, then the expression @samp{foo++} @@ -11049,7 +11041,18 @@ The post-increment @samp{foo++} is nearly the same as writing @samp{(foo += 1) - 1}. It is not perfectly equivalent because all numbers in @command{awk} are floating-point---in floating-point, @samp{foo + 1 - 1} does not necessarily equal @code{foo}. But the difference is minute as -long as you stick to numbers that are fairly small (less than 10e12). +long as you stick to numbers that are fairly small (less than +@iftex +@math{10^12}). +@end iftex +@ifnottex +@ifnotdocbook +10e12). +@end ifnotdocbook +@end ifnottex +@docbook +10<superscript>12</superscript>). @c +@end docbook @cindex @code{$} (dollar sign), incrementing fields and arrays @cindex dollar sign (@code{$}), incrementing fields and arrays @@ -11295,6 +11298,7 @@ like a number---for example, @code{@w{" +2"}}. This concept is used for determining the type of a variable. The type of the variable is important because the types of two variables determine how they are compared. + The various versions of the POSIX standard did not get the rules quite right for several editions. 
Fortunately, as of at least the 2008 standard (and possibly earlier), the standard has been fixed, @@ -11388,6 +11392,7 @@ STRNUM &&string &numeric &numeric\cr }}} @end tex @ifnottex +@ifnotdocbook @display +---------------------------------------------- | STRING NUMERIC STRNUM @@ -11400,7 +11405,51 @@ NUMERIC | string numeric numeric STRNUM | string numeric numeric --------+---------------------------------------------- @end display +@end ifnotdocbook @end ifnottex +@docbook +<informaltable> +<tgroup cols="4"> +<colspec colname="1" align="left"/> +<colspec colname="2" align="left"/> +<colspec colname="3" align="left"/> +<colspec colname="4" align="left"/> +<thead> +<row> +<entry/> +<entry>STRING</entry> +<entry>NUMERIC</entry> +<entry>STRNUM</entry> +</row> +</thead> + +<tbody> +<row> +<entry><emphasis role="bold">STRING</emphasis></entry> +<entry>string</entry> +<entry>string</entry> +<entry>string</entry> +</row> + +<row> +<entry><emphasis role="bold">NUMERIC</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +<row> +<entry><emphasis role="bold">STRNUM</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +</tbody> +</tgroup> +</informaltable> + +@end docbook The basic idea is that user input that looks numeric---and @emph{only} user input---should be treated as numeric, even though it is actually @@ -11419,8 +11468,8 @@ This point bears additional emphasis: All user input is made of characters, and so is first and foremost of @var{string} type; input strings that look numeric are additionally given the @var{strnum} attribute. Thus, the six-character input string @w{@samp{ +3.14}} receives the -@var{strnum} attribute. In contrast, the eight-character literal -@w{@code{" +3.14"}} appearing in program text is a string constant. +@var{strnum} attribute. In contrast, the eight characters +@w{@code{" +3.14"}} appearing in program text comprise a string constant. 
The following examples print @samp{1} when the comparison between the two different constants is true, @samp{0} otherwise: @@ -11606,7 +11655,9 @@ where this is discussed in more detail. @subsubsection String Comparison With POSIX Rules The POSIX standard says that string comparison is performed based -on the locale's collating order. This is usually very different +on the locale's @dfn{collating order}. This is the order in which +characters sort, as defined by the locale (for more discussion, +@pxref{Ranges and Locales}). This order is usually very different from the results obtained when doing straight character-by-character comparison.@footnote{Technically, string comparison is supposed to behave the same way as if the strings are compared with the C @@ -11614,7 +11665,7 @@ to behave the same way as if the strings are compared with the C Because this behavior differs considerably from existing practice, @command{gawk} only implements it when in POSIX mode (@pxref{Options}). -Here is an example to illustrate the difference, in an @samp{en_US.UTF-8} +Here is an example to illustrate the difference, in an @code{en_US.UTF-8} locale: @example @@ -11830,7 +11881,7 @@ However, putting a newline in front of either character does not work without using backslash continuation (@pxref{Statements/Lines}). If @option{--posix} is specified -(@pxref{Options}), then this extension is disabled. +(@pxref{Options}), this extension is disabled. @node Function Calls @section Function Calls @@ -11849,6 +11900,8 @@ functions and their descriptions. In addition, you can define functions for use in your program. @xref{User-defined}, for instructions on how to do this. +Finally, @command{gawk} lets you write functions in C or C++ +that may be called from your program: see @ref{Dynamic Extensions}. @cindex arguments, in function calls The way to use a function is with a @dfn{function call} expression, @@ -11899,12 +11952,12 @@ when you write the source code to your program. 
We defer discussion of this feature until later; see @ref{Indirect Calls}. @cindex side effects, function calls -Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of @samp{sqrt(@var{argument})} is the square root of -@var{argument}. -The following program reads numbers, one number per line, and prints the -square root of each one: +Like every other expression, the function call has a value, often +called the @dfn{return value}, which is computed by the function +based on the arguments you give it. In this example, the return value +of @samp{sqrt(@var{argument})} is the square root of @var{argument}. +The following program reads numbers, one number per line, and prints +the square root of each one: @example $ @kbd{awk '@{ print "The square root of", $1, "is", sqrt($1) @}'} @@ -12219,10 +12272,10 @@ A single expression. It matches when its value is nonzero (if a number) or non-null (if a string). (@xref{Expression Patterns}.) -@item @var{pat1}, @var{pat2} +@item @var{begpat}, @var{endpat} A pair of patterns separated by a comma, specifying a range of records. -The range includes both the initial record that matches @var{pat1} and -the final record that matches @var{pat2}. +The range includes both the initial record that matches @var{begpat} and +the final record that matches @var{endpat}. (@xref{Ranges}.) @item BEGIN @@ -12234,7 +12287,7 @@ Special patterns for you to supply startup or cleanup actions for your @item BEGINFILE @itemx ENDFILE Special patterns for you to supply startup or cleanup actions to be -done on a per file basis. +done on a per-file basis. (@xref{BEGINFILE/ENDFILE}.) @item @var{empty} @@ -12395,7 +12448,7 @@ input record. When a record matches @var{begpat}, the range pattern is @dfn{turned on} and the range pattern matches this record as well. As long as the range pattern stays turned on, it automatically matches every input record read. 
The range pattern also matches @var{endpat} against every -input record; when this succeeds, the range pattern is turned off again +input record; when this succeeds, the range pattern is @dfn{turned off} again for the following record. Then the range pattern goes back to checking @var{begpat} against each record. @@ -12549,7 +12602,7 @@ rule checks the @code{FNR} and @code{NR} variables. @subsubsection Input/Output from @code{BEGIN} and @code{END} Rules @cindex input/output, from @code{BEGIN} and @code{END} -There are several (sometimes subtle) points to remember when doing I/O +There are several (sometimes subtle) points to be aware of when doing I/O from a @code{BEGIN} or @code{END} rule. The first has to do with the value of @code{$0} in a @code{BEGIN} rule. Because @code{BEGIN} rules are executed before any input is read, @@ -12610,8 +12663,19 @@ This @value{SECTION} describes a @command{gawk}-specific feature. Two special kinds of rule, @code{BEGINFILE} and @code{ENDFILE}, give you ``hooks'' into @command{gawk}'s command-line file processing loop. -As with the @code{BEGIN} and @code{END} rules (@pxref{BEGIN/END}), all -@code{BEGINFILE} rules in a program are merged, in the order they are +As with the @code{BEGIN} and @code{END} rules +@ifnottex +@ifnotdocbook +(@pxref{BEGIN/END}), +@end ifnotdocbook +@end ifnottex +@iftex +(see the previous section), +@end iftex +@ifdocbook +(see the previous section), +@end ifdocbook +all @code{BEGINFILE} rules in a program are merged, in the order they are read by @command{gawk}, and all @code{ENDFILE} rules are merged as well. The body of the @code{BEGINFILE} rules is executed just before @@ -12639,10 +12703,11 @@ the file entirely. Otherwise, @command{gawk} exits with the usual fatal error. @item -If you have written extensions that modify the record handling (by inserting -an ``input parser''), you can invoke them at this point, before @command{gawk} -has started processing the file. 
(This is a @emph{very} advanced feature, -currently used only by the @uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) +If you have written extensions that modify the record handling (by +inserting an ``input parser,'' @pxref{Input Parsers}), you can invoke +them at this point, before @command{gawk} has started processing the file. +(This is a @emph{very} advanced feature, currently used only by the +@uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) @end itemize The @code{ENDFILE} rule is called when @command{gawk} has finished processing @@ -12725,7 +12790,7 @@ into the body of the @command{awk} program. @cindex shells, quoting The most common method is to use shell quoting to substitute the variable's value into the program inside the script. -For example, in the following program: +For example, consider the following program: @example printf "Enter search pattern: " @@ -12735,7 +12800,7 @@ awk "/$pattern/ "'@{ nmatches++ @} @end example @noindent -the @command{awk} program consists of two pieces of quoted text +The @command{awk} program consists of two pieces of quoted text that are concatenated together to form the program. The first part is double-quoted, which allows substitution of the @code{pattern} shell variable inside the quotes. @@ -12749,8 +12814,8 @@ match up the quotes when reading the program. A better method is to use @command{awk}'s variable assignment feature (@pxref{Assignment Options}) -to assign the shell variable's value to an @command{awk} variable's -value. Then use dynamic regexps to match the pattern +to assign the shell variable's value to an @command{awk} variable. +Then use dynamic regexps to match the pattern (@pxref{Computed Regexps}). 
The following shows how to redo the previous example using this technique: @@ -12803,7 +12868,7 @@ function @var{name}(@var{args}) @{ @dots{} @} @cindex @code{;} (semicolon), separating statements in actions @cindex semicolon (@code{;}), separating statements in actions An action consists of one or more @command{awk} @dfn{statements}, enclosed -in curly braces (@samp{@{@dots{}@}}). Each statement specifies one +in curly braces (@samp{@{@r{@dots{}}@}}). Each statement specifies one thing to do. The statements are separated by newlines or semicolons. The curly braces around an action must be used even if the action contains only one statement, or if it contains no statements at @@ -12833,10 +12898,9 @@ programs. The @command{awk} language gives you C-like constructs special ones (@pxref{Statements}). @item Compound statements -Consist of one or more statements enclosed in -curly braces. A compound statement is used in order to put several -statements together in the body of an @code{if}, @code{while}, @code{do}, -or @code{for} statement. +Enclose one or more statements in curly braces. A compound statement +is used in order to put several statements together in the body of an +@code{if}, @code{while}, @code{do}, or @code{for} statement. @item Input statements Use the @code{getline} command @@ -13170,6 +13234,8 @@ for more information on this version of the @code{for} loop. @cindex @code{default} keyword This @value{SECTION} describes a @command{gawk}-specific feature. +If @command{gawk} is in compatibility mode (@pxref{Options}), +it is not available. The @code{switch} statement allows the evaluation of an expression and the execution of statements based on a @code{case} match. Case statements @@ -13226,11 +13292,6 @@ the @code{print} statement is executed and then falls through into the the @minus{}1 case will also be executed since the @code{default} does not halt execution. -This @code{switch} statement is a @command{gawk} extension. 
-If @command{gawk} is in compatibility mode -(@pxref{Options}), -it is not available. - @node Break Statement @subsection The @code{break} Statement @cindex @code{break} statement @@ -13245,15 +13306,15 @@ numbers: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; div * div <= num; div++) @{ - if (num % div == 0) - break - @} - if (num % div == 0) - printf "Smallest divisor of %d is %d\n", num, div - else - printf "%d is prime\n", num + num = $1 + for (div = 2; div * div <= num; div++) @{ + if (num % div == 0) + break + @} + if (num % div == 0) + printf "Smallest divisor of %d is %d\n", num, div + else + printf "%d is prime\n", num @} @end example @@ -13271,17 +13332,17 @@ an @code{if}: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; ; div++) @{ - if (num % div == 0) @{ - printf "Smallest divisor of %d is %d\n", num, div - break - @} - if (div * div > num) @{ - printf "%d is prime\n", num - break + num = $1 + for (div = 2; ; div++) @{ + if (num % div == 0) @{ + printf "Smallest divisor of %d is %d\n", num, div + break + @} + if (div * div > num) @{ + printf "%d is prime\n", num + break + @} @} - @} @} @end example @@ -13430,16 +13491,14 @@ The @code{next} statement is not allowed inside @code{BEGINFILE} and @cindex POSIX @command{awk}, @code{next}/@code{nextfile} statements and @cindex @code{next} statement, user-defined functions and @cindex functions, user-defined, @code{next}/@code{nextfile} statements and -According to the POSIX standard, the behavior is undefined if -the @code{next} statement is used in a @code{BEGIN} or @code{END} rule. -@command{gawk} treats it as a syntax error. -Although POSIX permits it, -some other @command{awk} implementations don't allow the @code{next} -statement inside function bodies -(@pxref{User-defined}). -Just as with any other @code{next} statement, a @code{next} statement inside a -function body reads the next record and starts processing it with the -first rule in the program. 
+According to the POSIX standard, the behavior is undefined if the +@code{next} statement is used in a @code{BEGIN} or @code{END} rule. +@command{gawk} treats it as a syntax error. Although POSIX permits it, +most other @command{awk} implementations don't allow the @code{next} +statement inside function bodies (@pxref{User-defined}). Just as with any +other @code{next} statement, a @code{next} statement inside a function +body reads the next record and starts processing it with the first rule +in the program. @node Nextfile Statement @subsection The @code{nextfile} Statement @@ -13550,8 +13609,7 @@ status code for the @command{awk} process. If no argument is supplied, In the case where an argument is supplied to a first @code{exit} statement, and then @code{exit} is called a second time from an @code{END} rule with no argument, -@command{awk} uses the previously supplied exit value. -@value{DARKCORNER} +@command{awk} uses the previously supplied exit value. @value{DARKCORNER} @xref{Exit Status}, for more information. @cindex programming conventions, @code{exit} statement @@ -13563,12 +13621,12 @@ in the following example: @example BEGIN @{ - if (("date" | getline date_now) <= 0) @{ - print "Can't get system date" > "/dev/stderr" - exit 1 - @} - print "current date is", date_now - close("date") + if (("date" | getline date_now) <= 0) @{ + print "Can't get system date" > "/dev/stderr" + exit 1 + @} + print "current date is", date_now + close("date") @} @end example @@ -35087,7 +35145,7 @@ it on your system). @cindex Unicode Similar considerations apply to other ranges. For example, @samp{["-/]} is perfectly valid in ASCII, but is not valid in many Unicode locales, -such as @samp{en_US.UTF-8}. +such as @code{en_US.UTF-8}. Early versions of @command{gawk} used regexp matching code that was not locale aware, so ranges had their traditional interpretation. @@ -37535,7 +37593,7 @@ different limits. 
@multitable @columnfractions .40 .60 @headitem Item @tab Limit @item Characters in a character class @tab 2^(number of bits per byte) -@item Length of input record @tab @code{MAX_INT } +@item Length of input record @tab @code{MAX_INT} @item Length of output record @tab Unlimited @item Length of source line @tab Unlimited @item Number of fields in a record @tab @code{MAX_LONG} @@ -37544,9 +37602,9 @@ different limits. @item Number of input records total @tab @code{MAX_LONG} @item Number of pipe redirections @tab min(number of processes per user, number of open files) @item Numeric values @tab Double-precision floating point (if not using MPFR) -@item Size of a field @tab @code{MAX_INT } -@item Size of a literal string @tab @code{MAX_INT } -@item Size of a printf string @tab @code{MAX_INT } +@item Size of a field @tab @code{MAX_INT} +@item Size of a literal string @tab @code{MAX_INT} +@item Size of a printf string @tab @code{MAX_INT} @end multitable @node Extension Design diff --git a/doc/gawktexi.in b/doc/gawktexi.in index 48ed6bc6..d73697df 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -9393,9 +9393,9 @@ have different forms, but are stored identically internally. A @dfn{numeric constant} stands for a number. This number can be an integer, a decimal fraction, or a number in scientific (exponential) notation.@footnote{The internal representation of all numbers, -including integers, uses double precision -floating-point numbers. -On most modern systems, these are in IEEE 754 standard format.} +including integers, uses double precision floating-point numbers. +On most modern systems, these are in IEEE 754 standard format. +@xref{Arbitrary Precision Arithmetic}, for much more information.} Here are some examples of numeric constants that all have the same value: @@ -9608,7 +9608,7 @@ upon the contents of the current input record. 
Constant regular expressions are also used as the first argument for the @code{gensub()}, @code{sub()}, and @code{gsub()} functions, as the second argument of the @code{match()} function, -and as the third argument of the @code{patsplit()} function +and as the third argument of the @code{split()} and @code{patsplit()} functions (@pxref{String Functions}). Modern implementations of @command{awk}, including @command{gawk}, allow the third argument of @code{split()} to be a regexp constant, but some @@ -9850,32 +9850,28 @@ specifies the output format to use when printing numbers with @code{print}. conversion from the semantics of printing. Both @code{CONVFMT} and @code{OFMT} have the same default value: @code{"%.6g"}. In the vast majority of cases, old @command{awk} programs do not change their behavior. -However, these semantics for @code{OFMT} are something to keep in mind if you must -port your new-style program to older implementations of @command{awk}. -We recommend -that instead of changing your programs, just port @command{gawk} itself. -@xref{Print}, -for more information on the @code{print} statement. - -And, once again, where you are can matter when it comes to converting -between numbers and strings. In @ref{Locales}, we mentioned that -the local character set and language (the locale) can affect how -@command{gawk} matches characters. The locale also affects numeric -formats. In particular, for @command{awk} programs, it affects the -decimal point character. The @code{"C"} locale, and most English-language -locales, use the period character (@samp{.}) as the decimal point. -However, many (if not most) European and non-English locales use the comma -(@samp{,}) as the decimal point character. +@xref{Print}, for more information on the @code{print} statement. + +Where you are can matter when it comes to converting between numbers and +strings. The local character set and language---the @dfn{locale}---can +affect numeric formats. 
In particular, for @command{awk} programs, +it affects the decimal point character and the thousands-separator +character. The @code{"C"} locale, and most English-language locales, +use the period character (@samp{.}) as the decimal point and don't +have a thousands separator. However, many (if not most) European and +non-English locales use the comma (@samp{,}) as the decimal point +character. European locales often use either a space or a period as +the thousands separator, if they have one. @cindex dark corner, locale's decimal point character The POSIX standard says that @command{awk} always uses the period as the decimal -point when reading the @command{awk} program source code, and for command-line -variable assignments (@pxref{Other Arguments}). -However, when interpreting input data, for @code{print} and @code{printf} output, -and for number to string conversion, the local decimal point character is used. -@value{DARKCORNER} -Here are some examples indicating the difference in behavior, -on a GNU/Linux system: +point when reading the @command{awk} program source code, and for +command-line variable assignments (@pxref{Other Arguments}). However, +when interpreting input data, for @code{print} and @code{printf} output, +and for number to string conversion, the local decimal point character +is used. @value{DARKCORNER} In all cases, numbers in source code and +in input data cannot have a thousands separator. Here are some examples +indicating the difference in behavior, on a GNU/Linux system: @example $ @kbd{export POSIXLY_CORRECT=1} @ii{Force POSIX behavior} @@ -9890,7 +9886,7 @@ $ @kbd{echo 4,321 | LC_ALL=en_DK.utf-8 gawk '@{ print $1 + 1 @}'} @end example @noindent -The @samp{en_DK.utf-8} locale is for English in Denmark, where the comma acts as +The @code{en_DK.utf-8} locale is for English in Denmark, where the comma acts as the decimal point separator. 
In the normal @code{"C"} locale, @command{gawk} treats @samp{4,321} as @samp{4}, while in the Danish locale, it's treated as the full number, 4.321. @@ -10037,7 +10033,7 @@ b * int(a / b) + (a % b) == a @end example One possibly undesirable effect of this definition of remainder is that -@code{@var{x} % @var{y}} is negative if @var{x} is negative. Thus: +@samp{@var{x} % @var{y}} is negative if @var{x} is negative. Thus: @example -17 % 8 = -1 @@ -10131,7 +10127,7 @@ BEGIN @{ @end example @noindent -It is not defined whether the assignment to @code{a} happens +It is not defined whether the second assignment to @code{a} happens before or after the value of @code{a} is retrieved for producing the concatenated value. The result could be either @samp{don't panic}, or @samp{panic panic}. @@ -10253,8 +10249,8 @@ element. (Such values are called @dfn{rvalues}.) @cindex variables, types of It is important to note that variables do @emph{not} have permanent types. -A variable's type is simply the type of whatever value it happens -to hold at the moment. In the following program fragment, the variable +A variable's type is simply the type of whatever value was last assigned +to it. In the following program fragment, the variable @code{foo} has a numeric value at first, and a string value later on: @example @@ -10355,6 +10351,7 @@ The indices of @code{bar} are practically guaranteed to be different, because and see @ref{Numeric Functions}, for more information). This example illustrates an important fact about assignment operators: the lefthand expression is only evaluated @emph{once}. + It is up to the implementation as to which expression is evaluated first, the lefthand or the righthand. Consider this example: @@ -10387,17 +10384,17 @@ to a number. @caption{Arithmetic Assignment Operators} @multitable @columnfractions .30 .70 @headitem Operator @tab Effect -@item @var{lvalue} @code{+=} @var{increment} @tab Adds @var{increment} to the value of @var{lvalue}. 
-@item @var{lvalue} @code{-=} @var{decrement} @tab Subtracts @var{decrement} from the value of @var{lvalue}. -@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiplies the value of @var{lvalue} by @var{coefficient}. -@item @var{lvalue} @code{/=} @var{divisor} @tab Divides the value of @var{lvalue} by @var{divisor}. -@item @var{lvalue} @code{%=} @var{modulus} @tab Sets @var{lvalue} to its remainder by @var{modulus}. +@item @var{lvalue} @code{+=} @var{increment} @tab Add @var{increment} to the value of @var{lvalue}. +@item @var{lvalue} @code{-=} @var{decrement} @tab Subtract @var{decrement} from the value of @var{lvalue}. +@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiply the value of @var{lvalue} by @var{coefficient}. +@item @var{lvalue} @code{/=} @var{divisor} @tab Divide the value of @var{lvalue} by @var{divisor}. +@item @var{lvalue} @code{%=} @var{modulus} @tab Set @var{lvalue} to its remainder by @var{modulus}. @cindex common extensions, @code{**=} operator @cindex extensions, common@comma{} @code{**=} operator @cindex @command{awk} language, POSIX version @cindex POSIX @command{awk} @item @var{lvalue} @code{^=} @var{power} @tab -@item @var{lvalue} @code{**=} @var{power} @tab Raises @var{lvalue} to the power @var{power}. @value{COMMONEXT} +@item @var{lvalue} @code{**=} @var{power} @tab Raise @var{lvalue} to the power @var{power}. @value{COMMONEXT} @end multitable @end float @@ -10442,10 +10439,8 @@ A workaround is: awk '/[=]=/' /dev/null @end example -@command{gawk} does not have this problem, -nor do the other -freely available versions described in -@ref{Other Versions}. +@command{gawk} does not have this problem; Brian Kernighan's @command{awk} +and @command{mawk} also do not (@pxref{Other Versions}). @end sidebar @c ENDOFRANGE exas @c ENDOFRANGE opas @@ -10469,11 +10464,10 @@ are convenient abbreviations for very common operations. @cindex side effects, decrement/increment operators The operator used for adding one is written @samp{++}. 
It can be used to increment a variable either before or after taking its value. -To pre-increment a variable @code{v}, write @samp{++v}. This adds +To @dfn{pre-increment} a variable @code{v}, write @samp{++v}. This adds one to the value of @code{v}---that new value is also the value of the -expression. (The assignment expression @samp{v += 1} is completely -equivalent.) -Writing the @samp{++} after the variable specifies post-increment. This +expression. (The assignment expression @samp{v += 1} is completely equivalent.) +Writing the @samp{++} after the variable specifies @dfn{post-increment}. This increments the variable value just the same; the difference is that the value of the increment expression itself is the variable's @emph{old} value. Thus, if @code{foo} has the value four, then the expression @samp{foo++} @@ -10485,7 +10479,18 @@ The post-increment @samp{foo++} is nearly the same as writing @samp{(foo += 1) - 1}. It is not perfectly equivalent because all numbers in @command{awk} are floating-point---in floating-point, @samp{foo + 1 - 1} does not necessarily equal @code{foo}. But the difference is minute as -long as you stick to numbers that are fairly small (less than 10e12). +long as you stick to numbers that are fairly small (less than +@iftex +@math{10^12}). +@end iftex +@ifnottex +@ifnotdocbook +10e12). +@end ifnotdocbook +@end ifnottex +@docbook +10<superscript>12</superscript>). @c +@end docbook @cindex @code{$} (dollar sign), incrementing fields and arrays @cindex dollar sign (@code{$}), incrementing fields and arrays @@ -10673,6 +10678,7 @@ like a number---for example, @code{@w{" +2"}}. This concept is used for determining the type of a variable. The type of the variable is important because the types of two variables determine how they are compared. + The various versions of the POSIX standard did not get the rules quite right for several editions. 
Fortunately, as of at least the 2008 standard (and possibly earlier), the standard has been fixed, @@ -10766,6 +10772,7 @@ STRNUM &&string &numeric &numeric\cr }}} @end tex @ifnottex +@ifnotdocbook @display +---------------------------------------------- | STRING NUMERIC STRNUM @@ -10778,7 +10785,51 @@ NUMERIC | string numeric numeric STRNUM | string numeric numeric --------+---------------------------------------------- @end display +@end ifnotdocbook @end ifnottex +@docbook +<informaltable> +<tgroup cols="4"> +<colspec colname="1" align="left"/> +<colspec colname="2" align="left"/> +<colspec colname="3" align="left"/> +<colspec colname="4" align="left"/> +<thead> +<row> +<entry/> +<entry>STRING</entry> +<entry>NUMERIC</entry> +<entry>STRNUM</entry> +</row> +</thead> + +<tbody> +<row> +<entry><emphasis role="bold">STRING</emphasis></entry> +<entry>string</entry> +<entry>string</entry> +<entry>string</entry> +</row> + +<row> +<entry><emphasis role="bold">NUMERIC</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +<row> +<entry><emphasis role="bold">STRNUM</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +</tbody> +</tgroup> +</informaltable> + +@end docbook The basic idea is that user input that looks numeric---and @emph{only} user input---should be treated as numeric, even though it is actually @@ -10797,8 +10848,8 @@ This point bears additional emphasis: All user input is made of characters, and so is first and foremost of @var{string} type; input strings that look numeric are additionally given the @var{strnum} attribute. Thus, the six-character input string @w{@samp{ +3.14}} receives the -@var{strnum} attribute. In contrast, the eight-character literal -@w{@code{" +3.14"}} appearing in program text is a string constant. +@var{strnum} attribute. In contrast, the eight characters +@w{@code{" +3.14"}} appearing in program text comprise a string constant. 
The following examples print @samp{1} when the comparison between the two different constants is true, @samp{0} otherwise: @@ -10984,7 +11035,9 @@ where this is discussed in more detail. @subsubsection String Comparison With POSIX Rules The POSIX standard says that string comparison is performed based -on the locale's collating order. This is usually very different +on the locale's @dfn{collating order}. This is the order in which +characters sort, as defined by the locale (for more discussion, +@pxref{Ranges and Locales}). This order is usually very different from the results obtained when doing straight character-by-character comparison.@footnote{Technically, string comparison is supposed to behave the same way as if the strings are compared with the C @@ -10992,7 +11045,7 @@ to behave the same way as if the strings are compared with the C Because this behavior differs considerably from existing practice, @command{gawk} only implements it when in POSIX mode (@pxref{Options}). -Here is an example to illustrate the difference, in an @samp{en_US.UTF-8} +Here is an example to illustrate the difference, in an @code{en_US.UTF-8} locale: @example @@ -11208,7 +11261,7 @@ However, putting a newline in front of either character does not work without using backslash continuation (@pxref{Statements/Lines}). If @option{--posix} is specified -(@pxref{Options}), then this extension is disabled. +(@pxref{Options}), this extension is disabled. @node Function Calls @section Function Calls @@ -11227,6 +11280,8 @@ functions and their descriptions. In addition, you can define functions for use in your program. @xref{User-defined}, for instructions on how to do this. +Finally, @command{gawk} lets you write functions in C or C++ +that may be called from your program: see @ref{Dynamic Extensions}. @cindex arguments, in function calls The way to use a function is with a @dfn{function call} expression, @@ -11277,12 +11332,12 @@ when you write the source code to your program. 
We defer discussion of this feature until later; see @ref{Indirect Calls}. @cindex side effects, function calls -Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of @samp{sqrt(@var{argument})} is the square root of -@var{argument}. -The following program reads numbers, one number per line, and prints the -square root of each one: +Like every other expression, the function call has a value, often +called the @dfn{return value}, which is computed by the function +based on the arguments you give it. In this example, the return value +of @samp{sqrt(@var{argument})} is the square root of @var{argument}. +The following program reads numbers, one number per line, and prints +the square root of each one: @example $ @kbd{awk '@{ print "The square root of", $1, "is", sqrt($1) @}'} @@ -11597,10 +11652,10 @@ A single expression. It matches when its value is nonzero (if a number) or non-null (if a string). (@xref{Expression Patterns}.) -@item @var{pat1}, @var{pat2} +@item @var{begpat}, @var{endpat} A pair of patterns separated by a comma, specifying a range of records. -The range includes both the initial record that matches @var{pat1} and -the final record that matches @var{pat2}. +The range includes both the initial record that matches @var{begpat} and +the final record that matches @var{endpat}. (@xref{Ranges}.) @item BEGIN @@ -11612,7 +11667,7 @@ Special patterns for you to supply startup or cleanup actions for your @item BEGINFILE @itemx ENDFILE Special patterns for you to supply startup or cleanup actions to be -done on a per file basis. +done on a per-file basis. (@xref{BEGINFILE/ENDFILE}.) @item @var{empty} @@ -11773,7 +11828,7 @@ input record. When a record matches @var{begpat}, the range pattern is @dfn{turned on} and the range pattern matches this record as well. As long as the range pattern stays turned on, it automatically matches every input record read. 
The range pattern also matches @var{endpat} against every -input record; when this succeeds, the range pattern is turned off again +input record; when this succeeds, the range pattern is @dfn{turned off} again for the following record. Then the range pattern goes back to checking @var{begpat} against each record. @@ -11927,7 +11982,7 @@ rule checks the @code{FNR} and @code{NR} variables. @subsubsection Input/Output from @code{BEGIN} and @code{END} Rules @cindex input/output, from @code{BEGIN} and @code{END} -There are several (sometimes subtle) points to remember when doing I/O +There are several (sometimes subtle) points to be aware of when doing I/O from a @code{BEGIN} or @code{END} rule. The first has to do with the value of @code{$0} in a @code{BEGIN} rule. Because @code{BEGIN} rules are executed before any input is read, @@ -11988,8 +12043,19 @@ This @value{SECTION} describes a @command{gawk}-specific feature. Two special kinds of rule, @code{BEGINFILE} and @code{ENDFILE}, give you ``hooks'' into @command{gawk}'s command-line file processing loop. -As with the @code{BEGIN} and @code{END} rules (@pxref{BEGIN/END}), all -@code{BEGINFILE} rules in a program are merged, in the order they are +As with the @code{BEGIN} and @code{END} rules +@ifnottex +@ifnotdocbook +(@pxref{BEGIN/END}), +@end ifnotdocbook +@end ifnottex +@iftex +(see the previous section), +@end iftex +@ifdocbook +(see the previous section), +@end ifdocbook +all @code{BEGINFILE} rules in a program are merged, in the order they are read by @command{gawk}, and all @code{ENDFILE} rules are merged as well. The body of the @code{BEGINFILE} rules is executed just before @@ -12017,10 +12083,11 @@ the file entirely. Otherwise, @command{gawk} exits with the usual fatal error. @item -If you have written extensions that modify the record handling (by inserting -an ``input parser''), you can invoke them at this point, before @command{gawk} -has started processing the file. 
(This is a @emph{very} advanced feature, -currently used only by the @uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) +If you have written extensions that modify the record handling (by +inserting an ``input parser,'' @pxref{Input Parsers}), you can invoke +them at this point, before @command{gawk} has started processing the file. +(This is a @emph{very} advanced feature, currently used only by the +@uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) @end itemize The @code{ENDFILE} rule is called when @command{gawk} has finished processing @@ -12103,7 +12170,7 @@ into the body of the @command{awk} program. @cindex shells, quoting The most common method is to use shell quoting to substitute the variable's value into the program inside the script. -For example, in the following program: +For example, consider the following program: @example printf "Enter search pattern: " @@ -12113,7 +12180,7 @@ awk "/$pattern/ "'@{ nmatches++ @} @end example @noindent -the @command{awk} program consists of two pieces of quoted text +The @command{awk} program consists of two pieces of quoted text that are concatenated together to form the program. The first part is double-quoted, which allows substitution of the @code{pattern} shell variable inside the quotes. @@ -12127,8 +12194,8 @@ match up the quotes when reading the program. A better method is to use @command{awk}'s variable assignment feature (@pxref{Assignment Options}) -to assign the shell variable's value to an @command{awk} variable's -value. Then use dynamic regexps to match the pattern +to assign the shell variable's value to an @command{awk} variable. +Then use dynamic regexps to match the pattern (@pxref{Computed Regexps}). 
The following shows how to redo the previous example using this technique: @@ -12181,7 +12248,7 @@ function @var{name}(@var{args}) @{ @dots{} @} @cindex @code{;} (semicolon), separating statements in actions @cindex semicolon (@code{;}), separating statements in actions An action consists of one or more @command{awk} @dfn{statements}, enclosed -in curly braces (@samp{@{@dots{}@}}). Each statement specifies one +in curly braces (@samp{@{@r{@dots{}}@}}). Each statement specifies one thing to do. The statements are separated by newlines or semicolons. The curly braces around an action must be used even if the action contains only one statement, or if it contains no statements at @@ -12211,10 +12278,9 @@ programs. The @command{awk} language gives you C-like constructs special ones (@pxref{Statements}). @item Compound statements -Consist of one or more statements enclosed in -curly braces. A compound statement is used in order to put several -statements together in the body of an @code{if}, @code{while}, @code{do}, -or @code{for} statement. +Enclose one or more statements in curly braces. A compound statement +is used in order to put several statements together in the body of an +@code{if}, @code{while}, @code{do}, or @code{for} statement. @item Input statements Use the @code{getline} command @@ -12548,6 +12614,8 @@ for more information on this version of the @code{for} loop. @cindex @code{default} keyword This @value{SECTION} describes a @command{gawk}-specific feature. +If @command{gawk} is in compatibility mode (@pxref{Options}), +it is not available. The @code{switch} statement allows the evaluation of an expression and the execution of statements based on a @code{case} match. Case statements @@ -12604,11 +12672,6 @@ the @code{print} statement is executed and then falls through into the the @minus{}1 case will also be executed since the @code{default} does not halt execution. -This @code{switch} statement is a @command{gawk} extension. 
-If @command{gawk} is in compatibility mode -(@pxref{Options}), -it is not available. - @node Break Statement @subsection The @code{break} Statement @cindex @code{break} statement @@ -12623,15 +12686,15 @@ numbers: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; div * div <= num; div++) @{ - if (num % div == 0) - break - @} - if (num % div == 0) - printf "Smallest divisor of %d is %d\n", num, div - else - printf "%d is prime\n", num + num = $1 + for (div = 2; div * div <= num; div++) @{ + if (num % div == 0) + break + @} + if (num % div == 0) + printf "Smallest divisor of %d is %d\n", num, div + else + printf "%d is prime\n", num @} @end example @@ -12649,17 +12712,17 @@ an @code{if}: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; ; div++) @{ - if (num % div == 0) @{ - printf "Smallest divisor of %d is %d\n", num, div - break - @} - if (div * div > num) @{ - printf "%d is prime\n", num - break + num = $1 + for (div = 2; ; div++) @{ + if (num % div == 0) @{ + printf "Smallest divisor of %d is %d\n", num, div + break + @} + if (div * div > num) @{ + printf "%d is prime\n", num + break + @} @} - @} @} @end example @@ -12808,16 +12871,14 @@ The @code{next} statement is not allowed inside @code{BEGINFILE} and @cindex POSIX @command{awk}, @code{next}/@code{nextfile} statements and @cindex @code{next} statement, user-defined functions and @cindex functions, user-defined, @code{next}/@code{nextfile} statements and -According to the POSIX standard, the behavior is undefined if -the @code{next} statement is used in a @code{BEGIN} or @code{END} rule. -@command{gawk} treats it as a syntax error. -Although POSIX permits it, -some other @command{awk} implementations don't allow the @code{next} -statement inside function bodies -(@pxref{User-defined}). -Just as with any other @code{next} statement, a @code{next} statement inside a -function body reads the next record and starts processing it with the -first rule in the program. 
+According to the POSIX standard, the behavior is undefined if the +@code{next} statement is used in a @code{BEGIN} or @code{END} rule. +@command{gawk} treats it as a syntax error. Although POSIX permits it, +most other @command{awk} implementations don't allow the @code{next} +statement inside function bodies (@pxref{User-defined}). Just as with any +other @code{next} statement, a @code{next} statement inside a function +body reads the next record and starts processing it with the first rule +in the program. @node Nextfile Statement @subsection The @code{nextfile} Statement @@ -12928,8 +12989,7 @@ status code for the @command{awk} process. If no argument is supplied, In the case where an argument is supplied to a first @code{exit} statement, and then @code{exit} is called a second time from an @code{END} rule with no argument, -@command{awk} uses the previously supplied exit value. -@value{DARKCORNER} +@command{awk} uses the previously supplied exit value. @value{DARKCORNER} @xref{Exit Status}, for more information. @cindex programming conventions, @code{exit} statement @@ -12941,12 +13001,12 @@ in the following example: @example BEGIN @{ - if (("date" | getline date_now) <= 0) @{ - print "Can't get system date" > "/dev/stderr" - exit 1 - @} - print "current date is", date_now - close("date") + if (("date" | getline date_now) <= 0) @{ + print "Can't get system date" > "/dev/stderr" + exit 1 + @} + print "current date is", date_now + close("date") @} @end example @@ -34229,7 +34289,7 @@ it on your system). @cindex Unicode Similar considerations apply to other ranges. For example, @samp{["-/]} is perfectly valid in ASCII, but is not valid in many Unicode locales, -such as @samp{en_US.UTF-8}. +such as @code{en_US.UTF-8}. Early versions of @command{gawk} used regexp matching code that was not locale aware, so ranges had their traditional interpretation. @@ -36677,7 +36737,7 @@ different limits. 
 @multitable @columnfractions .40 .60
 @headitem Item @tab Limit
 @item Characters in a character class @tab 2^(number of bits per byte)
-@item Length of input record @tab @code{MAX_INT }
+@item Length of input record @tab @code{MAX_INT}
 @item Length of output record @tab Unlimited
 @item Length of source line @tab Unlimited
 @item Number of fields in a record @tab @code{MAX_LONG}
@@ -36686,9 +36746,9 @@ different limits.
 @item Number of input records total @tab @code{MAX_LONG}
 @item Number of pipe redirections @tab min(number of processes per user, number of open files)
 @item Numeric values @tab Double-precision floating point (if not using MPFR)
-@item Size of a field @tab @code{MAX_INT }
-@item Size of a literal string @tab @code{MAX_INT }
-@item Size of a printf string @tab @code{MAX_INT }
+@item Size of a field @tab @code{MAX_INT}
+@item Size of a literal string @tab @code{MAX_INT}
+@item Size of a printf string @tab @code{MAX_INT}
 @end multitable
 
 @node Extension Design