-rw-r--r-- | doc/ChangeLog | 4
-rw-r--r-- | doc/gawk.info | 1630
-rw-r--r-- | doc/gawk.texi | 678
-rw-r--r-- | doc/gawktexi.in | 676
4 files changed, 1566 insertions, 1422 deletions
diff --git a/doc/ChangeLog b/doc/ChangeLog index 243ef843..972e19f8 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,3 +1,7 @@ +2014-04-29 Arnold D. Robbins <arnold@skeeve.com> + + * gawktexi.in: Editing progress. Through Chapter 3. + 2014-04-24 Arnold D. Robbins <arnold@skeeve.com> * gawktexi.in: Start on revisions. diff --git a/doc/gawk.info b/doc/gawk.info index 589ac015..1d5f496d 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -1652,6 +1652,23 @@ knowledge of shell quoting rules. The following rules apply only to POSIX-compliant, Bourne-style shells (such as Bash, the GNU Bourne-Again Shell). If you use the C shell, you're on your own. + Before diving into the rules, we introduce a concept that appears +throughout this Info file, which is that of the "null", or empty, +string. + + The null string is character data that has no value. In other +words, it is empty. It is written in `awk' programs like this: `""'. +In the shell, it can be written using single or double quotes: `""' or +`'''. While the null string has no characters in it, it does exist. +Consider this command: + + $ echo "" + +Here, the `echo' utility receives a single argument, even though that +argument has no characters in it. In the rest of this Info file, we use +the terms "null string" and "empty string" interchangeably. Now, on to +the quoting rules. + * Quoted items can be concatenated with nonquoted items as well as with other quoted items. The shell turns everything into one argument for the command. @@ -1892,7 +1909,7 @@ different ways to do the same things shown here: awk 'length($0) > 80' data The sole rule has a relational expression as its pattern and it - has no action--so the default action, printing the record, is used. + has no action--so it uses the default action, printing the record. 
* Print the length of the longest line in `data': @@ -1949,9 +1966,9 @@ File: gawk.info, Node: Two Rules, Next: More Complex, Prev: Very Simple, Up: The `awk' utility reads the input files one line at a time. For each line, `awk' tries the patterns of each of the rules. If several -patterns match, then several actions are run in the order in which they -appear in the `awk' program. If no patterns match, then no actions are -run. +patterns match, then several actions execture in the order in which +they appear in the `awk' program. If no patterns match, then no +actions run. After processing all the rules that match the line (and perhaps there are none), `awk' reads the next line. (However, *note Next @@ -2027,12 +2044,12 @@ contains the file name.(1) The `$6 == "Nov"' in our `awk' program is an expression that tests whether the sixth field of the output from `ls -l' matches the string -`Nov'. Each time a line has the string `Nov' for its sixth field, the -action `sum += $5' is performed. This adds the fifth field (the file's -size) to the variable `sum'. As a result, when `awk' has finished -reading all the input lines, `sum' is the total of the sizes of the -files whose lines matched the pattern. (This works because `awk' -variables are automatically initialized to zero.) +`Nov'. Each time a line has the string `Nov' for its sixth field, +`awk' performs the action `sum += $5'. This adds the fifth field (the +file's size) to the variable `sum'. As a result, when `awk' has +finished reading all the input lines, `sum' is the total of the sizes +of the files whose lines matched the pattern. (This works because +`awk' variables are automatically initialized to zero.) After the last line of output from `ls' has been processed, the `END' rule executes and prints the value of `sum'. In this example, @@ -2084,15 +2101,15 @@ We have generally not used backslash continuation in our sample programs. 
`gawk' places no limit on the length of a line, so backslash continuation is never strictly necessary; it just makes programs more readable. For this same reason, as well as for clarity, we have kept -most statements short in the sample programs presented throughout the -Info file. Backslash continuation is most useful when your `awk' -program is in a separate source file instead of entered from the -command line. You should also note that many `awk' implementations are -more particular about where you may use backslash continuation. For -example, they may not allow you to split a string constant using -backslash continuation. Thus, for maximum portability of your `awk' -programs, it is best not to split your lines in the middle of a regular -expression or a string. +most statements short in the programs presented throughout the Info +file. Backslash continuation is most useful when your `awk' program is +in a separate source file instead of entered from the command line. +You should also note that many `awk' implementations are more +particular about where you may use backslash continuation. For example, +they may not allow you to split a string constant using backslash +continuation. Thus, for maximum portability of your `awk' programs, it +is best not to split your lines in the middle of a regular expression +or a string. CAUTION: _Backslash continuation does not work as described with the C shell._ It works for `awk' programs in files and for @@ -2213,14 +2230,15 @@ those that it has are much larger than they used to be. If you find yourself writing `awk' scripts of more than, say, a few hundred lines, you might consider using a different programming -language. Emacs Lisp is a good choice if you need sophisticated string -or pattern matching capabilities. The shell is also good at string and -pattern matching; in addition, it allows powerful use of the system -utilities. 
More conventional languages, such as C, C++, and Java, offer -better facilities for system programming and for managing the complexity -of large programs. Programs in these languages may require more lines -of source code than the equivalent `awk' programs, but they are easier -to maintain and usually run more efficiently. +language. The shell is good at string and pattern matching; in +addition, it allows powerful use of the system utilities. More +conventional languages, such as C, C++, and Java, offer better +facilities for system programming and for managing the complexity of +large programs. Python offers a nice balance between high-level ease +of programming and access to system facilities. Programs in these +languages may require more lines of source code than the equivalent +`awk' programs, but they are easier to maintain and usually run more +efficiently. File: gawk.info, Node: Invoking Gawk, Next: Regexp, Prev: Getting Started, Up: Top @@ -2350,11 +2368,11 @@ The following list describes options mandated by the POSIX standard: treated as single-byte characters. Normally, `gawk' follows the POSIX standard and attempts to process - its input data according to the current locale. This can often - involve converting multibyte characters into wide characters - (internally), and can lead to problems or confusion if the input - data does not contain valid multibyte characters. This option is - an easy way to tell `gawk': "hands off my data!". + its input data according to the current locale (*note Locales::). + This can often involve converting multibyte characters into wide + characters (internally), and can lead to problems or confusion if + the input data does not contain valid multibyte characters. This + option is an easy way to tell `gawk': "hands off my data!". `-c' `--traditional' @@ -2368,8 +2386,8 @@ The following list describes options mandated by the POSIX standard: Print the short version of the General Public License and then exit. 
-`-d[FILE]' -`--dump-variables[=FILE]' +`-d'[FILE] +`--dump-variables'[`='FILE] Print a sorted list of global variables, their types, and final values to FILE. If no FILE is provided, print this list to the file named `awkvars.out' in the current directory. No space is @@ -2383,8 +2401,8 @@ The following list describes options mandated by the POSIX standard: particularly easy mistake to make with simple variable names like `i', `j', etc.) -`-D[FILE]' -`--debug=[FILE]' +`-D'[FILE] +`--debug'[`='FILE] Enable debugging of `awk' programs (*note Debugging::). By default, the debugger reads commands interactively from the keyboard. The optional FILE argument allows you to specify a file @@ -2392,16 +2410,16 @@ The following list describes options mandated by the POSIX standard: non-interactively. No space is allowed between the `-D' and FILE, if FILE is supplied. -`-e PROGRAM-TEXT' -`--source PROGRAM-TEXT' +`-e' PROGRAM-TEXT +`--source' PROGRAM-TEXT Provide program source code in the PROGRAM-TEXT. This option allows you to mix source code in files with source code that you enter on the command line. This is particularly useful when you have library functions that you want to use from your command-line programs (*note AWKPATH Variable::). -`-E FILE' -`--exec FILE' +`-E' FILE +`--exec' FILE Similar to `-f', read `awk' program text from FILE. There are two differences from `-f': @@ -2434,37 +2452,41 @@ The following list describes options mandated by the POSIX standard: Print a "usage" message summarizing the short and long style options that `gawk' accepts and then exit. -`-i SOURCE-FILE' -`--include SOURCE-FILE' +`-i' SOURCE-FILE +`--include' SOURCE-FILE Read `awk' source library from SOURCE-FILE. This option is completely equivalent to using the `@include' directive inside your program. This option is very similar to the `-f' option, but there are two important differences. 
First, when `-i' is used, - the program source will not be loaded if it has been previously - loaded, whereas the `-f' will always load the file. Second, - because this option is intended to be used with code libraries, - `gawk' does not recognize such files as constituting main program - input. Thus, after processing an `-i' argument, `gawk' still - expects to find the main source code via the `-f' option or on the + the program source is not loaded if it has been previously loaded, + whereas with `-f', `gawk' always loads the file. Second, because + this option is intended to be used with code libraries, `gawk' + does not recognize such files as constituting main program input. + Thus, after processing an `-i' argument, `gawk' still expects to + find the main source code via the `-f' option or on the command-line. -`-l LIB' -`--load LIB' - Load a shared library LIB. This searches for the library using the - `AWKLIBPATH' environment variable. The correct library suffix for - your platform will be supplied by default, so it need not be - specified in the library name. The library initialization routine - should be named `dl_load()'. An alternative is to use the `@load' - keyword inside the program to load a shared library. - -`-L [value]' -`--lint[=value]' +`-l' EXT +`--load' EXT + Load a dynamic extension named EXT. Extensions are stored as + system shared libraries. This option searches for the library + using the `AWKLIBPATH' environment variable. The correct library + suffix for your platform will be supplied by default, so it need + not be specified in the extension name. The extension + initialization routine should be named `dl_load()'. An + alternative is to use the `@load' keyword inside the program to + load a shared library. This feature is described in detail in + *note Dynamic Extensions::. + +`-L'[VALUE] +`--lint'[`='VALUE] Warn about constructs that are dubious or nonportable to other - `awk' implementations. 
Some warnings are issued when `gawk' first - reads your program. Others are issued at runtime, as your program - executes. With an optional argument of `fatal', lint warnings - become fatal errors. This may be drastic, but its use will - certainly encourage the development of cleaner `awk' programs. + `awk' implementations. No space is allowed between the `-D' and + VALUE, if VALUE is supplied. Some warnings are issued when `gawk' + first reads your program. Others are issued at runtime, as your + program executes. With an optional argument of `fatal', lint + warnings become fatal errors. This may be drastic, but its use + will certainly encourage the development of cleaner `awk' programs. With an optional argument of `invalid', only warnings about things that are actually invalid are issued. (This is not fully implemented yet.) @@ -2495,23 +2517,26 @@ The following list describes options mandated by the POSIX standard: Force the use of the locale's decimal point character when parsing numeric input data (*note Locales::). -`-o[FILE]' -`--pretty-print[=FILE]' +`-o'[FILE] +`--pretty-print'[`='FILE] Enable pretty-printing of `awk' programs. By default, output - program is created in a file named `awkprof.out'. The optional - FILE argument allows you to specify a different file name for the - output. No space is allowed between the `-o' and FILE, if FILE is - supplied. + program is created in a file named `awkprof.out' (*note + Profiling::). The optional FILE argument allows you to specify a + different file name for the output. No space is allowed between + the `-o' and FILE, if FILE is supplied. + + NOTE: Due to the way `gawk' has evolved, with this option + your program is still executed. This will change in the next + major release such that `gawk' will only pretty-print the + program and not run it. `-O' `--optimize' Enable some optimizations on the internal representation of the - program. At the moment this includes just simple constant - folding. 
The `gawk' maintainer hopes to add more optimizations - over time. + program. At the moment this includes just simple constant folding. -`-p[FILE]' -`--profile[=FILE]' +`-p'[FILE] +`--profile'[`='FILE] Enable profiling of `awk' programs (*note Profiling::). By default, profiles are created in a file named `awkprof.out'. The optional FILE argument allows you to specify a different file name @@ -2544,15 +2569,15 @@ The following list describes options mandated by the POSIX standard: data (*note Locales::). If you supply both `--traditional' and `--posix' on the command - line, `--posix' takes precedence. `gawk' also issues a warning if - both options are supplied. + line, `--posix' takes precedence. `gawk' issues a warning if both + options are supplied. `-r' `--re-interval' Allow interval expressions (*note Regexp Operators::) in regexps. This is now `gawk''s default behavior. Nevertheless, this option remains both for backward compatibility, and for use in - combination with the `--traditional' option. + combination with `--traditional'. `-S' `--sandbox' @@ -2602,25 +2627,25 @@ input as a source of data.) source file and command-line `awk' programs, `gawk' provides the `--source' option. This does not require you to pre-empt the standard input for your source code; it allows you to easily mix command-line -and library source code (*note AWKPATH Variable::). The `--source' -option may also be used multiple times on the command line. +and library source code (*note AWKPATH Variable::). As with `-f', the +`--source' and `--include' options may also be used multiple times on +the command line. If no `-f' or `--source' option is specified, then `gawk' uses the first non-option command-line argument as the text of the program source code. If the environment variable `POSIXLY_CORRECT' exists, then `gawk' -behaves in strict POSIX mode, exactly as if you had supplied the -`--posix' command-line option. 
Many GNU programs look for this -environment variable to suppress extensions that conflict with POSIX, -but `gawk' behaves differently: it suppresses all extensions, even -those that do not conflict with POSIX, and behaves in strict POSIX -mode. If `--lint' is supplied on the command line and `gawk' turns on -POSIX mode because of `POSIXLY_CORRECT', then it issues a warning -message indicating that POSIX mode is in effect. You would typically -set this variable in your shell's startup file. For a -Bourne-compatible shell (such as Bash), you would add these lines to -the `.profile' file in your home directory: +behaves in strict POSIX mode, exactly as if you had supplied `--posix'. +Many GNU programs look for this environment variable to suppress +extensions that conflict with POSIX, but `gawk' behaves differently: it +suppresses all extensions, even those that do not conflict with POSIX, +and behaves in strict POSIX mode. If `--lint' is supplied on the +command line and `gawk' turns on POSIX mode because of +`POSIXLY_CORRECT', then it issues a warning message indicating that +POSIX mode is in effect. You would typically set this variable in your +shell's startup file. For a Bourne-compatible shell (such as Bash), +you would add these lines to the `.profile' file in your home directory: POSIXLY_CORRECT=true export POSIXLY_CORRECT @@ -2672,18 +2697,18 @@ begins scanning the argument list. The variable values given on the command line are processed for escape sequences (*note Escape Sequences::). (d.c.) - In some earlier implementations of `awk', when a variable assignment -occurred before any file names, the assignment would happen _before_ -the `BEGIN' rule was executed. `awk''s behavior was thus inconsistent; -some command-line assignments were available inside the `BEGIN' rule, -while others were not. Unfortunately, some applications came to depend -upon this "feature." 
When `awk' was changed to be more consistent, the -`-v' option was added to accommodate applications that depended upon -the old behavior. + In some very early implementations of `awk', when a variable +assignment occurred before any file names, the assignment would happen +_before_ the `BEGIN' rule was executed. `awk''s behavior was thus +inconsistent; some command-line assignments were available inside the +`BEGIN' rule, while others were not. Unfortunately, some applications +came to depend upon this "feature." When `awk' was changed to be more +consistent, the `-v' option was added to accommodate applications that +depended upon the old behavior. The variable assignment feature is most useful for assigning to variables such as `RS', `OFS', and `ORS', which control input and -output formats before scanning the data files. It is also useful for +output formats, before scanning the data files. It is also useful for controlling state if multiple passes are needed over a data file. For example: @@ -2718,7 +2743,7 @@ with `getline' (*note Getline/File::). In addition, `gawk' allows you to specify the special file name `/dev/stdin', both on the command line and with `getline'. Some other versions of `awk' also support this, but it is not standard. (Some -operating systems provide a `/dev/stdin' file in the file system, +operating systems provide a `/dev/stdin' file in the file system; however, `gawk' always processes this file name itself.) @@ -2757,9 +2782,9 @@ colons(1). `gawk' gets its search path from the `AWKPATH' environment variable. If that variable does not exist, `gawk' uses a default path, `.:/usr/local/share/awk'.(2) - The search path feature is particularly useful for building libraries -of useful `awk' functions. The library files can be placed in a -standard directory in the default path and then specified on the + The search path feature is particularly helpful for building +libraries of useful `awk' functions. 
The library files can be placed +in a standard directory in the default path and then specified on the command line with a short file name. Otherwise, the full file name would have to be typed for each file. @@ -2776,7 +2801,7 @@ filename. NOTE: To include the current directory in the path, either place `.' explicitly in the path or write a null entry in the path. (A null entry is indicated by starting or ending the path with a - colon or by placing two colons next to each other (`::').) This + colon or by placing two colons next to each other [`::'].) This path search mechanism is similar to the shell's. However, `gawk' always looks in the current directory _before_ @@ -2785,8 +2810,8 @@ filename. If `AWKPATH' is not defined in the environment, `gawk' places its default search path into `ENVIRON["AWKPATH"]'. This makes it easy to -determine the actual search path that `gawk' will use from within an -`awk' program. +determine the actual search path that `gawk' used from within an `awk' +program. While you can change `ENVIRON["AWKPATH"]' within your `awk' program, this has no effect on the running program's behavior. This makes @@ -2810,13 +2835,13 @@ File: gawk.info, Node: AWKLIBPATH Variable, Next: Other Environment Variables, ------------------------------------------- The `AWKLIBPATH' environment variable is similar to the `AWKPATH' -variable, but it is used to search for shared libraries specified with -the `-l' option rather than for source files. If the library is not -found, the path is searched again after adding the appropriate shared -library suffix for the platform. For example, on GNU/Linux systems, -the suffix `.so' is used. The search path specified is also used for -libraries loaded via the `@load' keyword (*note Loading Shared -Libraries::). +variable, but it is used to search for loadable extensions (stored as +system shared libraries) specified with the `-l' option rather than for +source files. 
If the extension is not found, the path is searched +again after adding the appropriate shared library suffix for the +platform. For example, on GNU/Linux systems, the suffix `.so' is used. +The search path specified is also used for extensions loaded via the +`@load' keyword (*note Loading Shared Libraries::). File: gawk.info, Node: Other Environment Variables, Prev: AWKLIBPATH Variable, Up: Environment Variables @@ -2833,7 +2858,7 @@ used by regular users. traditional and GNU extensions. *Note Options::. `GAWK_SOCK_RETRIES' - Controls the number of time `gawk' will attempt to retry a two-way + Controls the number of times `gawk' attempts to retry a two-way TCP/IP (socket) connection before giving up. *Note TCP/IP Networking::. @@ -2876,6 +2901,11 @@ change. The variables are: supposed to be differences, but occasionally theory and practice don't coordinate with each other.) +`GAWK_NO_PP_RUN' + If this variable exists, then when invoked with the + `--pretty-print' option, `gawk' skips running the program. This + variable will not survive into the next major release. + `GAWK_STACKSIZE' This specifies the amount by which `gawk' should grow its internal evaluation stack, when needed. @@ -2955,7 +2985,7 @@ enclosed in double quotes. NOTE: Keep in mind that this is a language construct and the file name cannot be a string variable, but rather just a literal string - in double quotes. + constant in double quotes. The files to be included may be nested; e.g., given a third script, namely `test3': @@ -3010,19 +3040,19 @@ and this also applies to files named with `@include'. File: gawk.info, Node: Loading Shared Libraries, Next: Obsolete, Prev: Include Files, Up: Invoking Gawk -2.8 Loading Shared Libraries Into Your Program -============================================== +2.8 Loading Dynamic Extensions Into Your Program +================================================ This minor node describes a feature that is specific to `gawk'. 
- The `@load' keyword can be used to read external `awk' shared -libraries. This allows you to link in compiled code that may offer -superior performance and/or give you access to extended capabilities -not supported by the `awk' language. The `AWKLIBPATH' variable is used -to search for the shared library. Using `@load' is completely -equivalent to using the `-l' command-line option. + The `@load' keyword can be used to read external `awk' extensions +(stored as system shared libraries). This allows you to link in +compiled code that may offer superior performance and/or give you +access to extended capabilities not supported by the `awk' language. +The `AWKLIBPATH' variable is used to search for the extension. Using +`@load' is completely equivalent to using the `-l' command-line option. - If the shared library is not initially found in `AWKLIBPATH', another + If the extension is not initially found in `AWKLIBPATH', another search is conducted after appending the platform's default shared library suffix to the filename. For example, on GNU/Linux systems, the suffix `.so' is used. @@ -3037,7 +3067,7 @@ This is equivalent to the following example: For command-line usage, the `-l' option is more convenient, but `@load' is useful for embedding inside an `awk' source file that requires -access to a shared library. +access to an extension. *note Dynamic Extensions::, describes how to write extensions (in C or C++) that can be loaded with either `@load' or the `-l' option. @@ -3108,8 +3138,8 @@ A regular expression can be used as a pattern by enclosing it in slashes. Then the regular expression is tested against the entire text of each record. (Normally, it only needs to match some part of the text in order to succeed.) 
For example, the following prints the -second field of each record that contains the string `li' anywhere in -it: +second field of each record where the string `li' appears anywhere in +the record: $ awk '/li/ { print $2 }' mail-list -| 555-5553 @@ -3194,8 +3224,8 @@ apply to both string constants and regexp constants: A literal backslash, `\'. `\a' - The "alert" character, `Ctrl-g', ASCII code 7 (BEL). (This - usually makes some sort of audible noise.) + The "alert" character, `Ctrl-g', ASCII code 7 (BEL). (This often + makes some sort of audible noise.) `\b' Backspace, `Ctrl-h', ASCII code 8 (BS). @@ -3330,20 +3360,21 @@ sequences and that are not listed in the table stand for themselves: at the beginning of the string. It is important to realize that `^' does not match the beginning of - a line embedded in a string. The condition is not true in the - following example: + a line (the point right after a `\n' newline character) embedded + in a string. The condition is not true in the following example: if ("line1\nLINE 2" ~ /^L/) ... `$' This is similar to `^', but it matches only at the end of a string. For example, `p$' matches a record that ends with a `p'. The `$' - is an anchor and does not match the end of a line embedded in a - string. The condition in the following example is not true: + is an anchor and does not match the end of a line (the point right + before a `\n' newline character) embedded in a string. The + condition in the following example is not true: if ("line1\nLINE 2" ~ /1$/) ... -`. (period)' +`.' (period) This matches any single character, _including_ the newline character. For example, `.P' matches any single character followed by a `P' in a string. Using concatenation, we can make a @@ -3355,7 +3386,7 @@ sequences and that are not listed in the table stand for themselves: Otherwise, NUL is just another character. Other versions of `awk' may not be able to match the NUL character. 
-`[...]' +`['...`]' This is called a "bracket expression".(1) It matches any _one_ of the characters that are enclosed in the square brackets. For example, `[MVX]' matches any one of the characters `M', `V', or @@ -3363,7 +3394,7 @@ sequences and that are not listed in the table stand for themselves: square brackets of a bracket expression is given in *note Bracket Expressions::. -`[^ ...]' +`[^'...`]' This is a "complemented bracket expression". The first character after the `[' _must_ be a `^'. It matches any characters _except_ those in the square brackets. For example, `[^awk]' matches any @@ -3379,7 +3410,7 @@ sequences and that are not listed in the table stand for themselves: The alternation applies to the largest possible regexps on either side. -`(...)' +`('...`)' Parentheses are used for grouping in regular expressions, as in arithmetic. They can be used to concatenate regular expressions containing the alternation operator, `|'. For example, @@ -3406,8 +3437,8 @@ sequences and that are not listed in the table stand for themselves: This symbol is similar to `*', except that the preceding expression must be matched at least once. This means that `wh+y' would match `why' and `whhy', but not `wy', whereas `wh*y' would - match all three of these strings. The following is a simpler way - of writing the last `*' example: + match all three. The following is a simpler way of writing the + last `*' example: awk '/\(c[ad]+r x\)/ { print }' sample @@ -3416,9 +3447,9 @@ sequences and that are not listed in the table stand for themselves: expression can be matched either once or not at all. For example, `fe?d' matches `fed' and `fd', but nothing else. -`{N}' -`{N,}' -`{N,M}' +`{'N`}' +`{'N`,}' +`{'N`,'M`}' One or two numbers inside braces denote an "interval expression". If there is one number in the braces, the preceding regexp is repeated N times. If there are two numbers separated by a comma, @@ -3698,7 +3729,9 @@ works in any POSIX-compliant `awk'. 
Another method, specific to `gawk', is to set the variable `IGNORECASE' to a nonzero value (*note Built-in Variables::). When `IGNORECASE' is not zero, _all_ regexp and string operations ignore -case. Changing the value of `IGNORECASE' dynamically controls the +case. + + Changing the value of `IGNORECASE' dynamically controls the case-sensitivity of the program as it runs. Case is significant by default because `IGNORECASE' (like most variables) is initialized to zero: @@ -3721,9 +3754,6 @@ dynamically turn case-sensitivity on or off for all the rules at once. `IGNORECASE' from the command line is a way to make a program case-insensitive without having to edit it. - Both regexp and string comparison operations are affected by -`IGNORECASE'. - In multibyte locales, the equivalences between upper- and lowercase characters are tested based on the wide-character values of the locale's character set. Otherwise, the characters are tested based on @@ -3785,7 +3815,8 @@ The righthand side of a `~' or `!~' operator need not be a regexp constant (i.e., a string of characters between slashes). It may be any expression. The expression is evaluated and converted to a string if necessary; the contents of the string are then used as the regexp. A -regexp computed in this way is called a "dynamic regexp": +regexp computed in this way is called a "dynamic regexp" or a "computed +regexp": BEGIN { digits_regexp = "[[:digit:]]+" } $0 ~ digits_regexp { print } @@ -3833,8 +3864,8 @@ constants," for several reasons: Using `\n' in Bracket Expressions of Dynamic Regexps - Some commercial versions of `awk' do not allow the newline character -to be used inside a bracket expression for a dynamic regexp: + Some versions of `awk' do not allow the newline character to be used +inside a bracket expression for a dynamic regexp: $ awk '$0 ~ "[ \t\n]"' error--> awk: newline in character class [ @@ -4517,7 +4548,7 @@ letter): > { print $2 }' -| a -In this case, the first field is "null" or empty. 
+In this case, the first field is null, or empty. The stripping of leading and trailing whitespace also comes into play whenever `$0' is recomputed. For instance, study this pipeline: @@ -5993,11 +6024,11 @@ width. Here is a list of the format-control letters: the first byte of a string or to numeric values within the range of a single byte (0-255). -`%d, %i' +`%d', `%i' Print a decimal integer. The two control letters are equivalent. (The `%i' specification is for compatibility with ISO C.) -`%e, %E' +`%e', `%E' Print a number in scientific (exponential) notation; for example: printf "%4.3e\n", 1950 @@ -6028,7 +6059,7 @@ width. Here is a list of the format-control letters: The `%F' format is a POSIX extension to ISO C; not all systems support it. On those that don't, `gawk' uses `%f' instead. -`%g, %G' +`%g', `%G' Print a number in either scientific notation or in floating-point notation, whichever uses fewer characters; if the result is printed in scientific notation, `%G' uses `E' instead of `e'. @@ -6044,7 +6075,7 @@ width. Here is a list of the format-control letters: use, because all numbers in `awk' are floating-point; it is provided primarily for compatibility with C.) -`%x, %X' +`%x', `%X' Print an unsigned hexadecimal integer; `%X' uses the letters `A' through `F' instead of `a' through `f' (*note Nondecimal-numbers::). @@ -8264,7 +8295,7 @@ to avoid the problem the expression can be rewritten as `$($0++)--'. This table presents `awk''s operators, in order of highest to lowest precedence: -`(...)' +`('...`)' Grouping. `$' @@ -8285,7 +8316,7 @@ precedence: `+ -' Addition, subtraction. -`String Concatenation' +String Concatenation There is no special symbol for concatenation. The operands are simply written side by side (*note Concatenation::). @@ -9684,7 +9715,7 @@ automatically on certain occasions in order to provide information to your program. The variables that are specific to `gawk' are marked with a pound sign (`#'). 
-`ARGC, ARGV' +`ARGC', `ARGV' The command-line arguments available to `awk' programs are stored in an array called `ARGV'. `ARGC' is the number of command-line arguments present. *Note Other Arguments::. Unlike most `awk' @@ -9713,7 +9744,7 @@ with a pound sign (`#'). are any of `awk''s command-line options. *Note ARGC and ARGV::, for information about how `awk' uses these variables. (d.c.) -`ARGIND #' +`ARGIND' # The index in `ARGV' of the current file being processed. Every time `gawk' opens a new data file for processing, it sets `ARGIND' to the index in `ARGV' of the file name. When `gawk' is @@ -9746,7 +9777,7 @@ with a pound sign (`#'). `ENVIRON["AWKPATH"]', *note AWKPATH Variable:: and `ENVIRON["AWKLIBPATH"]', *note AWKLIBPATH Variable::). -`ERRNO #' +`ERRNO' # If a system error occurs during a redirection for `getline', during a read for `getline', or during a `close()' operation, then `ERRNO' contains a string describing the error. @@ -9792,7 +9823,7 @@ with a pound sign (`#'). create or remove fields from the current record. *Note Changing Fields::. -`FUNCTAB #' +`FUNCTAB' # An array whose indices and corresponding values are the names of all the user-defined or extension functions in the program. @@ -9806,7 +9837,7 @@ with a pound sign (`#'). beginning of the program's execution (*note Records::). `NR' is incremented each time a new record is read. -`PROCINFO #' +`PROCINFO' # The elements of this array provide access to information about the running `awk' program. The following elements (listed alphabetically) are guaranteed to be available: @@ -9939,7 +9970,7 @@ with a pound sign (`#'). of the string where the matched substring starts, or zero if no match was found. -`RT #' +`RT' # This is set each time a record is read. It contains the input text that matched the text denoted by `RS', the record separator. @@ -9947,7 +9978,7 @@ with a pound sign (`#'). implementations, or if `gawk' is in compatibility mode (*note Options::), it is not special. 
-`SYMTAB #' +`SYMTAB' # An array whose indices are the names of all currently defined global variables and arrays in the program. The array may be used for indirect access to read or write the value of a variable: @@ -11195,7 +11226,7 @@ brackets ([ ]): Return the positive square root of X. `gawk' prints a warning message if X is negative. Thus, `sqrt(4)' is 2. -`srand([X])' +`srand('[X]`)' Set the starting point, or seed, for generating random numbers to the value X. @@ -11262,8 +11293,8 @@ pound sign (`#'): `&' with `sub()', `gsub()', and `gensub()'. -`asort(SOURCE [, DEST [, HOW ] ]) #' -`asorti(SOURCE [, DEST [, HOW ] ]) #' +`asort('SOURCE [`,' DEST [`,' HOW ] ]`)' # +`asorti('SOURCE [`,' DEST [`,' HOW ] ]`)' # These two functions are similar in behavior, so they are described together. @@ -11313,7 +11344,7 @@ pound sign (`#'): `asort()' and `asorti()' are `gawk' extensions; they are not available in compatibility mode (*note Options::). -`gensub(REGEXP, REPLACEMENT, HOW [, TARGET]) #' +`gensub(REGEXP, REPLACEMENT, HOW' [`, TARGET']`)' # Search the target string TARGET for matches of the regular expression REGEXP. If HOW is a string beginning with `g' or `G' (short for "global"), then replace all matches of REGEXP with @@ -11366,7 +11397,7 @@ pound sign (`#'): `gensub()' is a `gawk' extension; it is not available in compatibility mode (*note Options::). -`gsub(REGEXP, REPLACEMENT [, TARGET])' +`gsub(REGEXP, REPLACEMENT' [`, TARGET']`)' Search TARGET for _all_ of the longest, leftmost, _nonoverlapping_ matching substrings it can find and replace them with REPLACEMENT. The `g' in `gsub()' stands for "global," which means replace @@ -11395,7 +11426,7 @@ pound sign (`#'): It is a fatal error to use a regexp constant for FIND. -`length([STRING])' +`length('[STRING]`)' Return the number of characters in STRING. If STRING is a number, the length of the digit string representing that number is returned. For example, `length("abcde")' is five. 
By contrast, @@ -11435,7 +11466,7 @@ pound sign (`#'): array argument is not portable. If `--posix' is supplied, using an array argument is a fatal error (*note Arrays::). -`match(STRING, REGEXP [, ARRAY])' +`match(STRING, REGEXP' [`, ARRAY']`)' Search STRING for the longest, leftmost substring matched by the regular expression, REGEXP and return the character position, or "index", at which that substring begins (one, if it starts at the @@ -11522,7 +11553,7 @@ pound sign (`#'): compatibility mode (*note Options::), using a third argument is a fatal error. -`patsplit(STRING, ARRAY [, FIELDPAT [, SEPS ] ]) #' +`patsplit(STRING, ARRAY' [`, FIELDPAT' [`, SEPS' ] ]`)' # Divide STRING into pieces defined by FIELDPAT and store the pieces in ARRAY and the separator strings in the SEPS array. The first piece is stored in `ARRAY[1]', the second piece in `ARRAY[2]', and @@ -11544,7 +11575,7 @@ pound sign (`#'): The `patsplit()' function is a `gawk' extension. In compatibility mode (*note Options::), it is not available. -`split(STRING, ARRAY [, FIELDSEP [, SEPS ] ])' +`split(STRING, ARRAY' [`, FIELDSEP' [`, SEPS' ] ]`)' Divide STRING into pieces separated by FIELDSEP and store the pieces in ARRAY and the separator strings in the SEPS array. The first piece is stored in `ARRAY[1]', the second piece in @@ -11617,7 +11648,7 @@ pound sign (`#'): assigns the string `pi = 3.14 (approx.)' to the variable `pival'. -`strtonum(STR) #' +`strtonum(STR)' # Examine STR and return its numeric value. If STR begins with a leading `0', `strtonum()' assumes that STR is an octal number. If STR begins with a leading `0x' or `0X', `strtonum()' assumes that @@ -11637,7 +11668,7 @@ pound sign (`#'): `strtonum()' is a `gawk' extension; it is not available in compatibility mode (*note Options::). -`sub(REGEXP, REPLACEMENT [, TARGET])' +`sub(REGEXP, REPLACEMENT' [`, TARGET']`)' Search TARGET, which is treated as a string, for the leftmost, longest substring matched by the regular expression REGEXP. 
Modify the entire string by replacing the matched text with @@ -11710,7 +11741,7 @@ pound sign (`#'): into a string, and then the value of that string is treated as the regexp to match. -`substr(STRING, START [, LENGTH])' +`substr(STRING, START' [`, LENGTH' ]`)' Return a LENGTH-character-long substring of STRING, starting at character number START. The first character of a string is character number one.(3) For example, `substr("washington", 5, 3)' @@ -11967,7 +11998,7 @@ File: gawk.info, Node: I/O Functions, Next: Time Functions, Prev: String Func The following functions relate to input/output (I/O). Optional parameters are enclosed in square brackets ([ ]): -`close(FILENAME [, HOW])' +`close('FILENAME [`,' HOW]`)' Close the file FILENAME for input or output. Alternatively, the argument may be a shell command that was used for creating a coprocess, or for redirecting to or from a pipe; then the @@ -11982,7 +12013,7 @@ parameters are enclosed in square brackets ([ ]): not matter. *Note Two-way I/O::, which discusses this feature in more detail and gives an example. -`fflush([FILENAME])' +`fflush('[FILENAME]`)' Flush any buffered output associated with FILENAME, which is either a file opened for writing or a shell command for redirecting output to a pipe or coprocess. @@ -12189,7 +12220,7 @@ enclosed in square brackets ([ ]): If DATESPEC does not contain enough elements or if the resulting time is out of range, `mktime()' returns -1. -`strftime([FORMAT [, TIMESTAMP [, UTC-FLAG]]])' +``strftime(' [FORMAT [`,' TIMESTAMP [`,' UTC-FLAG ]]]`)'' Format the time specified by TIMESTAMP based on the contents of the FORMAT string and return the result. It is similar to the function of the same name in ISO C. If UTC-FLAG is present and is @@ -12507,23 +12538,23 @@ again with `10111001' and shift it left by three bits, you end up with `11001000'. `gawk' provides built-in functions that implement the bitwise operations just described. 
They are: -`and(V1, V2 [, ...])' +``and(V1, V2' [`,' ...]`)'' Return the bitwise AND of the arguments. There must be at least two. -`compl(VAL)' +``compl(VAL)'' Return the bitwise complement of VAL. -`lshift(VAL, COUNT)' +``lshift(VAL, COUNT)'' Return the value of VAL, shifted left by COUNT bits. -`or(V1, V2 [, ...])' +``or(V1, V2' [`,' ...]`)'' Return the bitwise OR of the arguments. There must be at least two. -`rshift(VAL, COUNT)' +``rshift(VAL, COUNT)'' Return the value of VAL, shifted right by COUNT bits. -`xor(V1, V2 [, ...])' +``xor(V1, V2' [`,' ...]`)'' Return the bitwise XOR of the arguments. There must be at least two. @@ -12639,7 +12670,7 @@ descriptions here are purposely brief. *Note Internationalization::, for the full story. Optional parameters are enclosed in square brackets ([ ]): -`bindtextdomain(DIRECTORY [, DOMAIN])' +``bindtextdomain(DIRECTORY' [`,' DOMAIN ]`)'' Set the directory in which `gawk' will look for message translation files, in case they will not or cannot be placed in the "standard" locations (e.g., during testing). It returns the @@ -12649,13 +12680,13 @@ brackets ([ ]): the null string (`""'), then `bindtextdomain()' returns the current binding for the given DOMAIN. -`dcgettext(STRING [, DOMAIN [, CATEGORY]])' +``dcgettext(STRING' [`,' DOMAIN [`,' CATEGORY ]]`)'' Return the translation of STRING in text domain DOMAIN for locale category CATEGORY. The default value for DOMAIN is the current value of `TEXTDOMAIN'. The default value for CATEGORY is `"LC_MESSAGES"'. -`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])' +``dcngettext(STRING1, STRING2, NUMBER' [`,' DOMAIN [`,' CATEGORY ]]`)'' Return the plural form used for NUMBER of the translation of STRING1 and STRING2 in text domain DOMAIN for locale category CATEGORY. STRING1 is the English singular variant of a message, @@ -17577,10 +17608,10 @@ are several cases of interest: programming trick. Don't worry about it if you are not familiar with `sh'.) 
-`-v, -F' +`-v', `-F' These are saved and passed on to `gawk'. -`-f, --file, --file=, -Wfile=' +`-f', `--file', `--file=', `-Wfile=' The file name is appended to the shell variable `program' with an `@include' statement. The `expr' utility is used to remove the leading option part of the argument (e.g., `--file='). (Typical @@ -17589,10 +17620,10 @@ are several cases of interest: sequences in their arguments, possibly mangling the program text. Using `expr' avoids this problem.) -`--source, --source=, -Wsource=' +`--source', `--source=', `-Wsource=' The source text is appended to `program'. -`--version, -Wversion' +`--version', `-Wversion' `igawk' prints its version number, runs `gawk --version' to get the `gawk' version information, and then exits. @@ -19067,7 +19098,7 @@ internationalization: for translation at runtime. String constants without a leading underscore are not translated. -`dcgettext(STRING [, DOMAIN [, CATEGORY]])' +``dcgettext(STRING' [`,' DOMAIN [`,' CATEGORY ]]`)'' Return the translation of STRING in text domain DOMAIN for locale category CATEGORY. The default value for DOMAIN is the current value of `TEXTDOMAIN'. The default value for CATEGORY is @@ -19084,7 +19115,7 @@ internationalization: be simple and to allow for reasonable `awk'-style default arguments. -`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])' +``dcngettext(STRING1, STRING2, NUMBER' [`,' DOMAIN [`,' CATEGORY ]]`)'' Return the plural form used for NUMBER of the translation of STRING1 and STRING2 in text domain DOMAIN for locale category CATEGORY. STRING1 is the English singular variant of a message, @@ -19095,7 +19126,7 @@ internationalization: The same remarks about argument order as for the `dcgettext()' function apply. -`bindtextdomain(DIRECTORY [, DOMAIN])' +``bindtextdomain(DIRECTORY' [`,' DOMAIN ]`)'' Change the directory in which `gettext' looks for `.gmo' files, in case they will not or cannot be placed in the standard locations (e.g., during testing). 
Return the directory in which DOMAIN is @@ -20276,16 +20307,16 @@ from a file. The commands are: `prompt' The debugger prompt. The default is `gawk> '. - `save_history [on | off]' + `save_history' [`on' | `off'] Save command history to file `./.gawk_history'. The default is `on'. - `save_options [on | off]' + `save_options' [`on' | `off'] Save current options to file `./.gawkrc' upon exit. The default is `on'. Options are read back in to the next session upon startup. - `trace [on | off]' + `trace' [`on' | `off'] Turn instruction tracing on or off. The default is `off'. `save' FILENAME @@ -20414,7 +20445,7 @@ categories, as follows: accidentally type `q' or `quit', to make sure you really want to quit. -`trace' `on' | `off' +`trace' [`on' | `off'] Turn on or off a continuous printing of instructions which are about to be executed, along with printing the `awk' line which they implement. The default is `off'. @@ -24281,8 +24312,8 @@ create a GNU/Linux shared library: } The `AWKLIBPATH' environment variable tells `gawk' where to find -shared libraries (*note Finding Extensions::). We set it to the -current directory and run the program: +extensions (*note Finding Extensions::). We set it to the current +directory and run the program: $ AWKLIBPATH=$PWD gawk -f testff.awk -| /tmp @@ -24355,7 +24386,7 @@ File: gawk.info, Node: Extension Sample File Functions, Next: Extension Sample The `filefuncs' extension provides three different functions, as follows: The usage is: -`@load "filefuncs"' +@load "filefuncs" This is how you load the extension. `result = chdir("/some/directory")' @@ -24364,7 +24395,7 @@ follows: The usage is: success or less than zero upon error. In the latter case it updates `ERRNO'. -`result = stat("/some/path", statdata [, follow])' +`result = stat("/some/path", statdata' [`, follow']`)' The `stat()' function provides a hook into the `stat()' system call. It returns zero upon success or less than zero upon error. In the latter case it updates `ERRNO'. 
@@ -30049,8 +30080,8 @@ Index * ! (exclamation point), !~ operator <6>: Case-sensitivity. (line 26) * ! (exclamation point), !~ operator: Regexp Usage. (line 19) * " (double quote) in shell commands: Read Terminal. (line 25) -* " (double quote), in regexp constants: Computed Regexps. (line 28) -* " (double quote), in shell commands: Quoting. (line 37) +* " (double quote), in regexp constants: Computed Regexps. (line 29) +* " (double quote), in shell commands: Quoting. (line 54) * # (number sign), #! (executable scripts): Executable Scripts. (line 6) * # (number sign), commenting: Comments. (line 6) @@ -30068,15 +30099,15 @@ Index (line 6) * ' (single quote): One-shot. (line 15) * ' (single quote) in gawk command lines: Long. (line 33) -* ' (single quote), in shell commands: Quoting. (line 31) +* ' (single quote), in shell commands: Quoting. (line 48) * ' (single quote), vs. apostrophe: Comments. (line 27) -* ' (single quote), with double quotes: Quoting. (line 53) +* ' (single quote), with double quotes: Quoting. (line 70) * () (parentheses), in a profile: Profiling. (line 146) -* () (parentheses), regexp operator: Regexp Operators. (line 79) +* () (parentheses), regexp operator: Regexp Operators. (line 80) * * (asterisk), * operator, as multiplication operator: Precedence. (line 55) * * (asterisk), * operator, as regexp operator: Regexp Operators. - (line 87) + (line 88) * * (asterisk), * operator, null strings, matching: Gory Details. (line 164) * * (asterisk), ** operator <1>: Precedence. (line 49) @@ -30090,7 +30121,7 @@ Index * + (plus sign), ++ operator: Increment Ops. (line 11) * + (plus sign), += operator <1>: Precedence. (line 95) * + (plus sign), += operator: Assignment Ops. (line 82) -* + (plus sign), regexp operator: Regexp Operators. (line 102) +* + (plus sign), regexp operator: Regexp Operators. (line 103) * , (comma), in range patterns: Ranges. (line 6) * - (hyphen), - operator: Precedence. (line 52) * - (hyphen), -- operator <1>: Precedence. 
(line 46) @@ -30100,7 +30131,7 @@ Index * - (hyphen), filenames beginning with: Options. (line 59) * - (hyphen), in bracket expressions: Bracket Expressions. (line 17) * --assign option: Options. (line 32) -* --bignum option: Options. (line 201) +* --bignum option: Options. (line 205) * --characters-as-bytes option: Options. (line 68) * --copyright option: Options. (line 88) * --debug option: Options. (line 108) @@ -30120,22 +30151,22 @@ Index * --gen-pot option: Options. (line 147) * --help option: Options. (line 154) * --include option: Options. (line 159) -* --lint option <1>: Options. (line 182) +* --lint option <1>: Options. (line 185) * --lint option: Command Line. (line 20) -* --lint-old option: Options. (line 288) +* --lint-old option: Options. (line 295) * --load option: Options. (line 173) * --non-decimal-data option <1>: Nondecimal Data. (line 6) -* --non-decimal-data option: Options. (line 207) +* --non-decimal-data option: Options. (line 211) * --non-decimal-data option, strtonum() function and: Nondecimal Data. (line 36) -* --optimize option: Options. (line 228) -* --posix option: Options. (line 247) -* --posix option, --traditional option and: Options. (line 266) -* --pretty-print option: Options. (line 220) +* --optimize option: Options. (line 237) +* --posix option: Options. (line 254) +* --posix option, --traditional option and: Options. (line 273) +* --pretty-print option: Options. (line 224) * --profile option <1>: Profiling. (line 12) -* --profile option: Options. (line 235) -* --re-interval option: Options. (line 272) -* --sandbox option: Options. (line 279) +* --profile option: Options. (line 242) +* --re-interval option: Options. (line 279) +* --sandbox option: Options. (line 286) * --sandbox option, disabling system() function: I/O Functions. (line 94) * --sandbox option, input redirection with getline: Getline. (line 19) @@ -30143,9 +30174,9 @@ Index (line 6) * --source option: Options. (line 117) * --traditional option: Options. 
(line 81) -* --traditional option, --posix option and: Options. (line 266) -* --use-lc-numeric option: Options. (line 215) -* --version option: Options. (line 293) +* --traditional option, --posix option and: Options. (line 273) +* --use-lc-numeric option: Options. (line 219) +* --version option: Options. (line 300) * --with-whiny-user-strftime configuration option: Additional Configuration Options. (line 35) * -b option: Options. (line 68) @@ -30158,29 +30189,29 @@ Index * -f option: Options. (line 25) * -F option: Options. (line 21) * -f option: Long. (line 12) -* -F option, -Ft sets FS to TAB: Options. (line 301) +* -F option, -Ft sets FS to TAB: Options. (line 308) * -F option, command line: Command Line Field Separator. (line 6) -* -f option, multiple uses: Options. (line 306) +* -f option, multiple uses: Options. (line 313) * -g option: Options. (line 147) * -h option: Options. (line 154) * -i option: Options. (line 159) -* -L option: Options. (line 288) +* -L option: Options. (line 295) * -l option: Options. (line 173) -* -M option: Options. (line 201) -* -N option: Options. (line 215) -* -n option: Options. (line 207) -* -O option: Options. (line 228) -* -o option: Options. (line 220) -* -P option: Options. (line 247) -* -p option: Options. (line 235) -* -r option: Options. (line 272) -* -S option: Options. (line 279) +* -M option: Options. (line 205) +* -N option: Options. (line 219) +* -n option: Options. (line 211) +* -O option: Options. (line 237) +* -o option: Options. (line 224) +* -P option: Options. (line 254) +* -p option: Options. (line 242) +* -r option: Options. (line 279) +* -S option: Options. (line 286) * -v option: Assignment Options. (line 12) -* -V option: Options. (line 293) +* -V option: Options. (line 300) * -v option: Options. (line 32) * -W option: Options. (line 46) -* . (period), regexp operator: Regexp Operators. (line 43) +* . (period), regexp operator: Regexp Operators. (line 44) * .gmo files: Explaining gettext. 
(line 41) * .gmo files, converting from .po: I18N Example. (line 62) * .gmo files, specifying directory of <1>: Programmer i18n. (line 47) @@ -30232,8 +30263,8 @@ Index * ? (question mark), ?: operator: Precedence. (line 92) * ? (question mark), regexp operator <1>: GNU Regexp Operators. (line 59) -* ? (question mark), regexp operator: Regexp Operators. (line 111) -* [] (square brackets), regexp operator: Regexp Operators. (line 55) +* ? (question mark), regexp operator: Regexp Operators. (line 112) +* [] (square brackets), regexp operator: Regexp Operators. (line 56) * \ (backslash): Comments. (line 50) * \ (backslash) in shell commands: Read Terminal. (line 25) * \ (backslash), \" escape sequence: Escape Sequences. (line 76) @@ -30281,8 +30312,8 @@ Index * \ (backslash), in escape sequences: Escape Sequences. (line 6) * \ (backslash), in escape sequences, POSIX and: Escape Sequences. (line 112) -* \ (backslash), in regexp constants: Computed Regexps. (line 28) -* \ (backslash), in shell commands: Quoting. (line 31) +* \ (backslash), in regexp constants: Computed Regexps. (line 29) +* \ (backslash), in shell commands: Quoting. (line 48) * \ (backslash), regexp operator: Regexp Operators. (line 18) * ^ (caret), ^ operator: Precedence. (line 49) * ^ (caret), ^= operator <1>: Precedence. (line 95) @@ -30436,7 +30467,7 @@ Index * asterisk (*), * operator, as multiplication operator: Precedence. (line 55) * asterisk (*), * operator, as regexp operator: Regexp Operators. - (line 87) + (line 88) * asterisk (*), * operator, null strings, matching: Gory Details. (line 164) * asterisk (*), ** operator <1>: Precedence. (line 49) @@ -30450,7 +30481,7 @@ Index * awf (amazingly workable formatter) program: Glossary. (line 25) * awk debugging, enabling: Options. (line 108) * awk language, POSIX version: Assignment Ops. (line 136) -* awk profiling, enabling: Options. (line 235) +* awk profiling, enabling: Options. (line 242) * awk programs <1>: Two Rules. 
(line 6) * awk programs <2>: Executable Scripts. (line 6) * awk programs: Getting Started. (line 12) @@ -30555,8 +30586,8 @@ Index * backslash (\), in escape sequences: Escape Sequences. (line 6) * backslash (\), in escape sequences, POSIX and: Escape Sequences. (line 112) -* backslash (\), in regexp constants: Computed Regexps. (line 28) -* backslash (\), in shell commands: Quoting. (line 31) +* backslash (\), in regexp constants: Computed Regexps. (line 29) +* backslash (\), in shell commands: Quoting. (line 48) * backslash (\), regexp operator: Regexp Operators. (line 18) * backtrace debugger command: Execution Stack. (line 13) * Beebe, Nelson H.F. <1>: Other Versions. (line 78) @@ -30617,14 +30648,14 @@ Index * braces ({}), actions and: Action Overview. (line 19) * braces ({}), statements, grouping: Statements. (line 10) * bracket expressions <1>: Bracket Expressions. (line 6) -* bracket expressions: Regexp Operators. (line 55) +* bracket expressions: Regexp Operators. (line 56) * bracket expressions, character classes: Bracket Expressions. (line 30) * bracket expressions, collating elements: Bracket Expressions. (line 69) * bracket expressions, collating symbols: Bracket Expressions. (line 76) -* bracket expressions, complemented: Regexp Operators. (line 63) +* bracket expressions, complemented: Regexp Operators. (line 64) * bracket expressions, equivalence classes: Bracket Expressions. (line 82) * bracket expressions, non-ASCII: Bracket Expressions. (line 69) @@ -30665,6 +30696,7 @@ Index * Brian Kernighan's awk, extensions: BTL. (line 6) * Brian Kernighan's awk, source code: Other Versions. (line 13) * Brini, Davide: Signature Program. (line 6) +* Brink, Jeroen: DOS Quoting. (line 10) * Broder, Alan J.: Contributors. (line 88) * Brown, Martin: Contributors. (line 82) * BSD-based operating systems: Glossary. (line 616) @@ -30710,14 +30742,14 @@ Index * CGI, awk scripts for: Options. (line 125) * changing precision of a number: Changing Precision. 
(line 6) * character classes, See bracket expressions: Regexp Operators. - (line 55) + (line 56) * character lists in regular expression: Bracket Expressions. (line 6) -* character lists, See bracket expressions: Regexp Operators. (line 55) +* character lists, See bracket expressions: Regexp Operators. (line 56) * character sets (machine character encodings) <1>: Glossary. (line 133) * character sets (machine character encodings): Ordinal Functions. (line 45) * character sets, See Also bracket expressions: Regexp Operators. - (line 55) + (line 56) * characters, counting: Wc Program. (line 6) * characters, transliterating: Translate Program. (line 6) * characters, values of as numbers: Ordinal Functions. (line 6) @@ -30861,7 +30893,7 @@ Index * cosine: Numeric Functions. (line 15) * counting: Wc Program. (line 6) * csh utility: Statements/Lines. (line 44) -* csh utility, POSIXLY_CORRECT environment variable: Options. (line 348) +* csh utility, POSIXLY_CORRECT environment variable: Options. (line 355) * csh utility, |& operator, comparison with: Two-way I/O. (line 44) * ctime() user-defined function: Function Example. (line 73) * currency symbols, localization: Explaining gettext. (line 103) @@ -31043,7 +31075,7 @@ Index * debugger, read commands from a file: Debugger Info. (line 96) * debugging awk programs: Debugger. (line 6) * debugging gawk, bug reports: Bugs. (line 9) -* decimal point character, locale specific: Options. (line 263) +* decimal point character, locale specific: Options. (line 270) * decrement operators: Increment Ops. (line 35) * default keyword: Switch Statement. (line 6) * Deifik, Scott <1>: Bugs. (line 70) @@ -31132,7 +31164,7 @@ Index * directories, command line: Command line directories. (line 6) * directories, searching: Igawk Program. (line 368) -* directories, searching for shared libraries: AWKLIBPATH Variable. +* directories, searching for loadable extensions: AWKLIBPATH Variable. 
(line 6) * directories, searching for source files: AWKPATH Variable. (line 6) * disable breakpoint: Breakpoint Control. (line 69) @@ -31153,8 +31185,8 @@ Index * dollar sign ($), regexp operator: Regexp Operators. (line 35) * double precision floating-point: General Arithmetic. (line 21) * double quote (") in shell commands: Read Terminal. (line 25) -* double quote ("), in regexp constants: Computed Regexps. (line 28) -* double quote ("), in shell commands: Quoting. (line 37) +* double quote ("), in regexp constants: Computed Regexps. (line 29) +* double quote ("), in shell commands: Quoting. (line 54) * down debugger command: Execution Stack. (line 21) * Drepper, Ulrich: Acknowledgments. (line 52) * dump all variables of a program: Options. (line 93) @@ -31470,7 +31502,7 @@ Index * FS variable, --field-separator option and: Options. (line 21) * FS variable, as null string: Single Character Fields. (line 20) -* FS variable, as TAB character: Options. (line 259) +* FS variable, as TAB character: Options. (line 266) * FS variable, changing value of: Field Separators. (line 35) * FS variable, running awk programs and: Cut Program. (line 68) * FS variable, setting from command line: Command Line Field Separator. @@ -31560,7 +31592,7 @@ Index (line 138) * gawk, ERRNO variable in: Getline. (line 19) * gawk, escape sequences: Escape Sequences. (line 124) -* gawk, extensions, disabling: Options. (line 247) +* gawk, extensions, disabling: Options. (line 254) * gawk, features, adding: Adding Code. (line 6) * gawk, features, advanced: Advanced Features. (line 6) * gawk, field separators and: User-modified. (line 77) @@ -31591,7 +31623,7 @@ Index (line 13) * gawk, interpreter, adding code to: Using Internal File Ops. (line 6) -* gawk, interval expressions and: Regexp Operators. (line 139) +* gawk, interval expressions and: Regexp Operators. (line 140) * gawk, line continuation in: Conditional Exp. (line 34) * gawk, LINT variable in: User-modified. 
(line 98) * gawk, list of contributors to: Contributors. (line 6) @@ -31609,7 +31641,7 @@ Index (line 26) * gawk, regular expressions, operators: GNU Regexp Operators. (line 6) -* gawk, regular expressions, precedence: Regexp Operators. (line 161) +* gawk, regular expressions, precedence: Regexp Operators. (line 162) * gawk, RT variable in <1>: Auto-set. (line 266) * gawk, RT variable in <2>: Multiple Line. (line 129) * gawk, RT variable in: Records. (line 132) @@ -31621,7 +31653,7 @@ Index * gawk, TEXTDOMAIN variable in: User-modified. (line 162) * gawk, timestamps: Time Functions. (line 6) * gawk, uses for: Preface. (line 36) -* gawk, versions of, information about, printing: Options. (line 293) +* gawk, versions of, information about, printing: Options. (line 300) * gawk, VMS version of: VMS Installation. (line 6) * gawk, word-boundary operator: GNU Regexp Operators. (line 63) @@ -31722,7 +31754,7 @@ Index * help debugger command: Miscellaneous Debugger Commands. (line 66) * hexadecimal numbers: Nondecimal-numbers. (line 6) -* hexadecimal values, enabling interpretation of: Options. (line 207) +* hexadecimal values, enabling interpretation of: Options. (line 211) * history expansion, in debugger: Readline Support. (line 6) * histsort.awk program: History Sorting. (line 25) * Hughes, Phil: Acknowledgments. (line 43) @@ -31832,7 +31864,7 @@ Index * internationalizing a program: Explaining gettext. (line 6) * interpreted programs <1>: Glossary. (line 357) * interpreted programs: Basic High Level. (line 15) -* interval expressions, regexp operator: Regexp Operators. (line 116) +* interval expressions, regexp operator: Regexp Operators. (line 117) * inventory-shipped file: Sample Data Files. (line 32) * invoke shell command: I/O Functions. (line 72) * isarray: Type Functions. (line 11) @@ -31935,9 +31967,9 @@ Index * lint checking, array subscripts: Uninitialized Subscripts. (line 43) * lint checking, empty programs: Command Line. 
(line 16) -* lint checking, issuing warnings: Options. (line 182) +* lint checking, issuing warnings: Options. (line 185) * lint checking, POSIXLY_CORRECT environment variable: Options. - (line 332) + (line 340) * lint checking, undefined functions: Pass By Value/Reference. (line 88) * LINT variable: User-modified. (line 98) @@ -31948,10 +31980,10 @@ Index * list debugger command: Miscellaneous Debugger Commands. (line 72) * list function definitions, in debugger: Debugger Info. (line 30) -* loading, library: Options. (line 173) +* loading, extensions: Options. (line 173) * local variables, in a function: Variable Scope. (line 6) * locale categories: Explaining gettext. (line 80) -* locale decimal point character: Options. (line 263) +* locale decimal point character: Options. (line 270) * locale, definition of: Locales. (line 6) * localization: I18N and L10N. (line 6) * localization, See internationalization, localization: I18N and L10N. @@ -32035,13 +32067,13 @@ Index * networks, programming: TCP/IP Networking. (line 6) * networks, support for: Special Network. (line 6) * newlines <1>: Boolean Ops. (line 67) -* newlines <2>: Options. (line 253) +* newlines <2>: Options. (line 260) * newlines: Statements/Lines. (line 6) * newlines, as field separators: Default Field Splitting. (line 6) * newlines, as record separators: Records. (line 20) -* newlines, in dynamic regexps: Computed Regexps. (line 58) -* newlines, in regexp constants: Computed Regexps. (line 68) +* newlines, in dynamic regexps: Computed Regexps. (line 59) +* newlines, in regexp constants: Computed Regexps. (line 69) * newlines, printing: Print Examples. (line 12) * newlines, separating statements in actions <1>: Statements. (line 10) * newlines, separating statements in actions: Action Overview. @@ -32081,7 +32113,7 @@ Index * null strings <3>: Regexp Field Splitting. (line 43) * null strings: Records. (line 122) -* null strings in gawk arguments, quoting and: Quoting. 
(line 62) +* null strings in gawk arguments, quoting and: Quoting. (line 79) * null strings, and deleting array elements: Delete. (line 27) * null strings, as array subscripts: Uninitialized Subscripts. (line 43) @@ -32112,7 +32144,7 @@ Index * oawk utility: Names. (line 10) * obsolete features: Obsolete. (line 6) * octal numbers: Nondecimal-numbers. (line 6) -* octal values, enabling interpretation of: Options. (line 207) +* octal values, enabling interpretation of: Options. (line 211) * OFMT variable <1>: User-modified. (line 115) * OFMT variable <2>: Conversion. (line 55) * OFMT variable: OFMT. (line 15) @@ -32192,7 +32224,7 @@ Index * P1003.1 POSIX standard: Glossary. (line 454) * parent process ID of gawk process: Auto-set. (line 186) * parentheses (), in a profile: Profiling. (line 146) -* parentheses (), regexp operator: Regexp Operators. (line 79) +* parentheses (), regexp operator: Regexp Operators. (line 80) * password file: Passwd Functions. (line 16) * patsplit: String Functions. (line 291) * patterns: Patterns and Actions. @@ -32213,7 +32245,7 @@ Index * percent sign (%), % operator: Precedence. (line 55) * percent sign (%), %= operator <1>: Precedence. (line 95) * percent sign (%), %= operator: Assignment Ops. (line 129) -* period (.), regexp operator: Regexp Operators. (line 43) +* period (.), regexp operator: Regexp Operators. (line 44) * Perl: Future Extensions. (line 6) * Peters, Arno: Contributors. (line 85) * Peterson, Hal: Contributors. (line 39) @@ -32230,7 +32262,7 @@ Index * plus sign (+), ++ operator: Increment Ops. (line 11) * plus sign (+), += operator <1>: Precedence. (line 95) * plus sign (+), += operator: Assignment Ops. (line 82) -* plus sign (+), regexp operator: Regexp Operators. (line 102) +* plus sign (+), regexp operator: Regexp Operators. (line 103) * pointers to functions: Indirect Calls. (line 6) * portability: Escape Sequences. (line 94) * portability, #! (executable scripts): Executable Scripts. 
(line 33) @@ -32256,7 +32288,7 @@ Index * portability, NF variable, decrementing: Changing Fields. (line 115) * portability, operators: Increment Ops. (line 60) * portability, operators, not in POSIX awk: Precedence. (line 98) -* portability, POSIXLY_CORRECT environment variable: Options. (line 353) +* portability, POSIXLY_CORRECT environment variable: Options. (line 360) * portability, substr() function: String Functions. (line 510) * portable object files <1>: Translator i18n. (line 6) * portable object files: Explaining gettext. (line 36) @@ -32296,26 +32328,26 @@ Index * POSIX awk, functions and, gsub()/sub(): Gory Details. (line 54) * POSIX awk, functions and, length(): String Functions. (line 173) * POSIX awk, GNU long options and: Options. (line 15) -* POSIX awk, interval expressions in: Regexp Operators. (line 135) +* POSIX awk, interval expressions in: Regexp Operators. (line 136) * POSIX awk, next/nextfile statements and: Next Statement. (line 45) * POSIX awk, numeric strings and: Variable Typing. (line 6) * POSIX awk, OFMT variable and <1>: Conversion. (line 55) * POSIX awk, OFMT variable and: OFMT. (line 27) -* POSIX awk, period (.), using: Regexp Operators. (line 50) +* POSIX awk, period (.), using: Regexp Operators. (line 51) * POSIX awk, printf format strings and: Format Modifiers. (line 159) -* POSIX awk, regular expressions and: Regexp Operators. (line 161) +* POSIX awk, regular expressions and: Regexp Operators. (line 162) * POSIX awk, timestamps and: Time Functions. (line 6) * POSIX awk, | I/O operator and: Getline/Pipe. (line 55) -* POSIX mode: Options. (line 247) +* POSIX mode: Options. (line 254) * POSIX, awk and: Preface. (line 23) * POSIX, gawk extensions not included in: POSIX/GNU. (line 6) * POSIX, programs, implementing in awk: Clones. (line 6) -* POSIXLY_CORRECT environment variable: Options. (line 332) +* POSIXLY_CORRECT environment variable: Options. (line 340) * PREC variable <1>: Setting Precision. 
(line 6) * PREC variable: User-modified. (line 134) * precedence <1>: Precedence. (line 6) * precedence: Increment Ops. (line 60) -* precedence, regexp operators: Regexp Operators. (line 156) +* precedence, regexp operators: Regexp Operators. (line 157) * print debugger command: Viewing And Changing Data. (line 36) * print statement: Printing. (line 16) @@ -32403,13 +32435,13 @@ Index * question mark (?), ?: operator: Precedence. (line 92) * question mark (?), regexp operator <1>: GNU Regexp Operators. (line 59) -* question mark (?), regexp operator: Regexp Operators. (line 111) +* question mark (?), regexp operator: Regexp Operators. (line 112) * QuikTrim Awk: Other Versions. (line 134) * quit debugger command: Miscellaneous Debugger Commands. (line 99) * QUIT signal (MS-Windows): Profiling. (line 214) * quoting in gawk command lines: Long. (line 26) -* quoting in gawk command lines, tricks for: Quoting. (line 71) +* quoting in gawk command lines, tricks for: Quoting. (line 88) * quoting, for small awk programs: Comments. (line 27) * r debugger command (alias for run): Debugger Execution Control. (line 62) @@ -32467,8 +32499,8 @@ Index * regexp constants, as patterns: Expression Patterns. (line 34) * regexp constants, in gawk: Using Constant Regexps. (line 28) -* regexp constants, slashes vs. quotes: Computed Regexps. (line 28) -* regexp constants, vs. string constants: Computed Regexps. (line 38) +* regexp constants, slashes vs. quotes: Computed Regexps. (line 29) +* regexp constants, vs. string constants: Computed Regexps. (line 39) * register extension: Registration Functions. (line 6) * regular expressions: Regexp. (line 6) @@ -32486,10 +32518,10 @@ Index (line 57) * regular expressions, dynamic: Computed Regexps. (line 6) * regular expressions, dynamic, with embedded newlines: Computed Regexps. - (line 58) + (line 59) * regular expressions, gawk, command-line options: GNU Regexp Operators. (line 70) -* regular expressions, interval expressions and: Options. 
(line 272) +* regular expressions, interval expressions and: Options. (line 279) * regular expressions, leftmost longest match: Leftmost Longest. (line 6) * regular expressions, operators <1>: Regexp Operators. (line 6) @@ -32501,7 +32533,7 @@ Index * regular expressions, operators, gawk: GNU Regexp Operators. (line 6) * regular expressions, operators, precedence of: Regexp Operators. - (line 156) + (line 157) * regular expressions, searching for: Egrep Program. (line 6) * relational operators, See comparison operators: Typing and Comparison. (line 9) @@ -32573,7 +32605,7 @@ Index (line 68) * sample debugging session: Sample Debugging Session. (line 6) -* sandbox mode: Options. (line 279) +* sandbox mode: Options. (line 286) * save debugger options: Debugger Info. (line 84) * scalar or array: Type Functions. (line 11) * scalar values: Basic Data Typing. (line 13) @@ -32588,7 +32620,7 @@ Index * search paths <1>: VMS Running. (line 58) * search paths <2>: PC Using. (line 10) * search paths: Igawk Program. (line 368) -* search paths, for shared libraries: AWKLIBPATH Variable. (line 6) +* search paths, for loadable extensions: AWKLIBPATH Variable. (line 6) * search paths, for source files <1>: VMS Running. (line 58) * search paths, for source files <2>: PC Using. (line 10) * search paths, for source files <3>: Igawk Program. (line 368) @@ -32687,7 +32719,7 @@ Index (line 145) * sidebar, Understanding $0: Changing Fields. (line 134) * sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps. - (line 56) + (line 57) * sidebar, Using close()'s Return Value: Close Files And Pipes. (line 128) * SIGHUP signal, for dynamic profiling: Profiling. (line 211) @@ -32706,9 +32738,9 @@ Index * single precision floating-point: General Arithmetic. (line 21) * single quote ('): One-shot. (line 15) * single quote (') in gawk command lines: Long. (line 33) -* single quote ('), in shell commands: Quoting. (line 31) +* single quote ('), in shell commands: Quoting. 
(line 48) * single quote ('), vs. apostrophe: Comments. (line 27) -* single quote ('), with double quotes: Quoting. (line 53) +* single quote ('), with double quotes: Quoting. (line 70) * single-character fields: Single Character Fields. (line 6) * single-step execution, in the debugger: Debugger Execution Control. @@ -32754,7 +32786,7 @@ Index * sprintf() function, print/printf statements and: Round Function. (line 6) * sqrt: Numeric Functions. (line 78) -* square brackets ([]), regexp operator: Regexp Operators. (line 55) +* square brackets ([]), regexp operator: Regexp Operators. (line 56) * square root: Numeric Functions. (line 78) * srand: Numeric Functions. (line 82) * stack frame: Debugging Terms. (line 10) @@ -32783,7 +32815,7 @@ Index (line 46) * strftime: Time Functions. (line 48) * string constants: Scalar Constants. (line 15) -* string constants, vs. regexp constants: Computed Regexps. (line 38) +* string constants, vs. regexp constants: Computed Regexps. (line 39) * string extraction (internationalization): String Extraction. (line 6) * string length: String Functions. (line 164) @@ -32892,7 +32924,7 @@ Index * translate string: I18N Functions. (line 22) * translate.awk program: Translate Program. (line 55) * treating files, as single records: Records. (line 219) -* troubleshooting, --non-decimal-data option: Options. (line 207) +* troubleshooting, --non-decimal-data option: Options. (line 211) * troubleshooting, == operator: Comparison Operators. (line 37) * troubleshooting, awk uses FS not IFS: Field Separators. (line 30) @@ -32919,7 +32951,7 @@ Index * troubleshooting, quotes with file names: Special FD. (line 68) * troubleshooting, readable data files: File Checking. (line 6) * troubleshooting, regexp constants vs. string constants: Computed Regexps. - (line 38) + (line 39) * troubleshooting, string concatenation: Concatenation. (line 26) * troubleshooting, substr() function: String Functions. 
(line 497) * troubleshooting, system() function: I/O Functions. (line 94) @@ -33013,7 +33045,7 @@ Index * version of gawk extension API: Auto-set. (line 229) * version of GNU MP library: Auto-set. (line 215) * version of GNU MPFR library: Auto-set. (line 211) -* vertical bar (|): Regexp Operators. (line 69) +* vertical bar (|): Regexp Operators. (line 70) * vertical bar (|), | operator (I/O) <1>: Precedence. (line 65) * vertical bar (|), | operator (I/O): Getline/Pipe. (line 9) * vertical bar (|), |& operator (I/O) <1>: Two-way I/O. (line 44) @@ -33033,7 +33065,7 @@ Index * Wall, Larry <1>: Future Extensions. (line 6) * Wall, Larry: Array Intro. (line 6) * Wallin, Anders: Contributors. (line 103) -* warnings, issuing: Options. (line 182) +* warnings, issuing: Options. (line 185) * watch debugger command: Viewing And Changing Data. (line 67) * watchpoint: Debugging Terms. (line 42) @@ -33046,7 +33078,7 @@ Index * whitespace, as field separators: Default Field Splitting. (line 6) * whitespace, functions, calling: Calling Built-in. (line 10) -* whitespace, newlines as: Options. (line 253) +* whitespace, newlines as: Options. (line 260) * Williams, Kent: Contributors. (line 34) * Woehlke, Matthew: Contributors. (line 79) * Woods, John: Contributors. (line 27) @@ -33075,7 +33107,7 @@ Index * {} (braces): Profiling. (line 142) * {} (braces), actions and: Action Overview. (line 19) * {} (braces), statements, grouping: Statements. (line 10) -* | (vertical bar): Regexp Operators. (line 69) +* | (vertical bar): Regexp Operators. (line 70) * | (vertical bar), | operator (I/O) <1>: Precedence. (line 65) * | (vertical bar), | operator (I/O) <2>: Redirection. (line 57) * | (vertical bar), | operator (I/O): Getline/Pipe. 
(line 9) @@ -33127,502 +33159,502 @@ Ref: Executable Scripts-Footnote-179661 Ref: Executable Scripts-Footnote-279763 Node: Comments80310 Node: Quoting82777 -Node: DOS Quoting87400 -Node: Sample Data Files88075 -Node: Very Simple90590 -Node: Two Rules95241 -Node: More Complex97139 -Ref: More Complex-Footnote-1100069 -Node: Statements/Lines100154 -Ref: Statements/Lines-Footnote-1104617 -Node: Other Features104882 -Node: When105810 -Node: Invoking Gawk107957 -Node: Command Line109420 -Node: Options110203 -Ref: Options-Footnote-1125581 -Node: Other Arguments125606 -Node: Naming Standard Input128264 -Node: Environment Variables129358 -Node: AWKPATH Variable129916 -Ref: AWKPATH Variable-Footnote-1132697 -Ref: AWKPATH Variable-Footnote-2132742 -Node: AWKLIBPATH Variable133002 -Node: Other Environment Variables133720 -Node: Exit Status136683 -Node: Include Files137358 -Node: Loading Shared Libraries140927 -Node: Obsolete142291 -Node: Undocumented142988 -Node: Regexp143230 -Node: Regexp Usage144619 -Node: Escape Sequences146644 -Node: Regexp Operators152313 -Ref: Regexp Operators-Footnote-1159693 -Ref: Regexp Operators-Footnote-2159840 -Node: Bracket Expressions159938 -Ref: table-char-classes161828 -Node: GNU Regexp Operators164351 -Node: Case-sensitivity168074 -Ref: Case-sensitivity-Footnote-1171042 -Ref: Case-sensitivity-Footnote-2171277 -Node: Leftmost Longest171385 -Node: Computed Regexps172586 -Node: Reading Files175923 -Node: Records177925 -Ref: Records-Footnote-1187448 -Node: Fields187485 -Ref: Fields-Footnote-1190441 -Node: Nonconstant Fields190527 -Node: Changing Fields192733 -Node: Field Separators198692 -Node: Default Field Splitting201394 -Node: Regexp Field Splitting202511 -Node: Single Character Fields205853 -Node: Command Line Field Separator206912 -Node: Full Line Fields210254 -Ref: Full Line Fields-Footnote-1210762 -Node: Field Splitting Summary210808 -Ref: Field Splitting Summary-Footnote-1213907 -Node: Constant Size214008 -Node: Splitting By Content218615 
-Ref: Splitting By Content-Footnote-1222364 -Node: Multiple Line222404 -Ref: Multiple Line-Footnote-1228251 -Node: Getline228430 -Node: Plain Getline230646 -Node: Getline/Variable232741 -Node: Getline/File233888 -Node: Getline/Variable/File235229 -Ref: Getline/Variable/File-Footnote-1236828 -Node: Getline/Pipe236915 -Node: Getline/Variable/Pipe239614 -Node: Getline/Coprocess240721 -Node: Getline/Variable/Coprocess241973 -Node: Getline Notes242710 -Node: Getline Summary245497 -Ref: table-getline-variants245905 -Node: Read Timeout246817 -Ref: Read Timeout-Footnote-1250556 -Node: Command line directories250614 -Node: Printing251244 -Node: Print252875 -Node: Print Examples254212 -Node: Output Separators256996 -Node: OFMT259012 -Node: Printf260370 -Node: Basic Printf261276 -Node: Control Letters262815 -Node: Format Modifiers266627 -Node: Printf Examples272636 -Node: Redirection275348 -Node: Special Files282322 -Node: Special FD282855 -Ref: Special FD-Footnote-1286480 -Node: Special Network286554 -Node: Special Caveats287404 -Node: Close Files And Pipes288200 -Ref: Close Files And Pipes-Footnote-1295183 -Ref: Close Files And Pipes-Footnote-2295331 -Node: Expressions295481 -Node: Values296613 -Node: Constants297289 -Node: Scalar Constants297969 -Ref: Scalar Constants-Footnote-1298828 -Node: Nondecimal-numbers299010 -Node: Regexp Constants302010 -Node: Using Constant Regexps302485 -Node: Variables305540 -Node: Using Variables306195 -Node: Assignment Options307919 -Node: Conversion309794 -Ref: table-locale-affects315294 -Ref: Conversion-Footnote-1315918 -Node: All Operators316027 -Node: Arithmetic Ops316657 -Node: Concatenation319162 -Ref: Concatenation-Footnote-1321950 -Node: Assignment Ops322070 -Ref: table-assign-ops327058 -Node: Increment Ops328389 -Node: Truth Values and Conditions331823 -Node: Truth Values332906 -Node: Typing and Comparison333955 -Node: Variable Typing334748 -Ref: Variable Typing-Footnote-1338645 -Node: Comparison Operators338767 -Ref: 
table-relational-ops339177 -Node: POSIX String Comparison342725 -Ref: POSIX String Comparison-Footnote-1343681 -Node: Boolean Ops343819 -Ref: Boolean Ops-Footnote-1347889 -Node: Conditional Exp347980 -Node: Function Calls349712 -Node: Precedence353306 -Node: Locales356975 -Node: Patterns and Actions358064 -Node: Pattern Overview359118 -Node: Regexp Patterns360787 -Node: Expression Patterns361330 -Node: Ranges365111 -Node: BEGIN/END368215 -Node: Using BEGIN/END368977 -Ref: Using BEGIN/END-Footnote-1371713 -Node: I/O And BEGIN/END371819 -Node: BEGINFILE/ENDFILE374101 -Node: Empty377015 -Node: Using Shell Variables377332 -Node: Action Overview379617 -Node: Statements381974 -Node: If Statement383828 -Node: While Statement385327 -Node: Do Statement387371 -Node: For Statement388527 -Node: Switch Statement391679 -Node: Break Statement393833 -Node: Continue Statement395823 -Node: Next Statement397616 -Node: Nextfile Statement400006 -Node: Exit Statement402661 -Node: Built-in Variables405077 -Node: User-modified406172 -Ref: User-modified-Footnote-1414530 -Node: Auto-set414592 -Ref: Auto-set-Footnote-1427657 -Ref: Auto-set-Footnote-2427862 -Node: ARGC and ARGV427918 -Node: Arrays431772 -Node: Array Basics433277 -Node: Array Intro434103 -Node: Reference to Elements438420 -Node: Assigning Elements440690 -Node: Array Example441181 -Node: Scanning an Array442913 -Node: Controlling Scanning445227 -Ref: Controlling Scanning-Footnote-1450314 -Node: Delete450630 -Ref: Delete-Footnote-1453395 -Node: Numeric Array Subscripts453452 -Node: Uninitialized Subscripts455635 -Node: Multidimensional457262 -Node: Multiscanning460355 -Node: Arrays of Arrays461944 -Node: Functions466584 -Node: Built-in467403 -Node: Calling Built-in468481 -Node: Numeric Functions470469 -Ref: Numeric Functions-Footnote-1474301 -Ref: Numeric Functions-Footnote-2474658 -Ref: Numeric Functions-Footnote-3474706 -Node: String Functions474975 -Ref: String Functions-Footnote-1497933 -Ref: String 
Functions-Footnote-2498062 -Ref: String Functions-Footnote-3498310 -Node: Gory Details498397 -Ref: table-sub-escapes500076 -Ref: table-sub-posix-92501430 -Ref: table-sub-proposed502781 -Ref: table-posix-sub504135 -Ref: table-gensub-escapes505680 -Ref: Gory Details-Footnote-1506856 -Ref: Gory Details-Footnote-2506907 -Node: I/O Functions507058 -Ref: I/O Functions-Footnote-1514048 -Node: Time Functions514195 -Ref: Time Functions-Footnote-1525178 -Ref: Time Functions-Footnote-2525246 -Ref: Time Functions-Footnote-3525404 -Ref: Time Functions-Footnote-4525515 -Ref: Time Functions-Footnote-5525627 -Ref: Time Functions-Footnote-6525854 -Node: Bitwise Functions526120 -Ref: table-bitwise-ops526682 -Ref: Bitwise Functions-Footnote-1530903 -Node: Type Functions531087 -Node: I18N Functions532238 -Node: User-defined533865 -Node: Definition Syntax534669 -Ref: Definition Syntax-Footnote-1539583 -Node: Function Example539652 -Ref: Function Example-Footnote-1542301 -Node: Function Caveats542323 -Node: Calling A Function542841 -Node: Variable Scope543796 -Node: Pass By Value/Reference546759 -Node: Return Statement550267 -Node: Dynamic Typing553248 -Node: Indirect Calls554179 -Node: Library Functions563866 -Ref: Library Functions-Footnote-1567379 -Ref: Library Functions-Footnote-2567522 -Node: Library Names567693 -Ref: Library Names-Footnote-1571166 -Ref: Library Names-Footnote-2571386 -Node: General Functions571472 -Node: Strtonum Function572500 -Node: Assert Function575430 -Node: Round Function578756 -Node: Cliff Random Function580297 -Node: Ordinal Functions581313 -Ref: Ordinal Functions-Footnote-1584390 -Ref: Ordinal Functions-Footnote-2584642 -Node: Join Function584853 -Ref: Join Function-Footnote-1586624 -Node: Getlocaltime Function586824 -Node: Readfile Function590565 -Node: Data File Management592404 -Node: Filetrans Function593036 -Node: Rewind Function597105 -Node: File Checking598492 -Node: Empty Files599586 -Node: Ignoring Assigns601816 -Node: Getopt Function603370 -Ref: 
Getopt Function-Footnote-1614673 -Node: Passwd Functions614876 -Ref: Passwd Functions-Footnote-1623854 -Node: Group Functions623942 -Node: Walking Arrays632026 -Node: Sample Programs634162 -Node: Running Examples634836 -Node: Clones635564 -Node: Cut Program636788 -Node: Egrep Program646639 -Ref: Egrep Program-Footnote-1654412 -Node: Id Program654522 -Node: Split Program658171 -Ref: Split Program-Footnote-1661690 -Node: Tee Program661818 -Node: Uniq Program664621 -Node: Wc Program672050 -Ref: Wc Program-Footnote-1676316 -Ref: Wc Program-Footnote-2676516 -Node: Miscellaneous Programs676608 -Node: Dupword Program677796 -Node: Alarm Program679827 -Node: Translate Program684634 -Ref: Translate Program-Footnote-1689021 -Ref: Translate Program-Footnote-2689269 -Node: Labels Program689403 -Ref: Labels Program-Footnote-1692774 -Node: Word Sorting692858 -Node: History Sorting696742 -Node: Extract Program698581 -Ref: Extract Program-Footnote-1706084 -Node: Simple Sed706212 -Node: Igawk Program709274 -Ref: Igawk Program-Footnote-1724431 -Ref: Igawk Program-Footnote-2724632 -Node: Anagram Program724770 -Node: Signature Program727838 -Node: Advanced Features728938 -Node: Nondecimal Data730824 -Node: Array Sorting732407 -Node: Controlling Array Traversal733104 -Node: Array Sorting Functions741388 -Ref: Array Sorting Functions-Footnote-1745257 -Node: Two-way I/O745451 -Ref: Two-way I/O-Footnote-1750883 -Node: TCP/IP Networking750965 -Node: Profiling753809 -Node: Internationalization761312 -Node: I18N and L10N762737 -Node: Explaining gettext763423 -Ref: Explaining gettext-Footnote-1768491 -Ref: Explaining gettext-Footnote-2768675 -Node: Programmer i18n768840 -Node: Translator i18n773042 -Node: String Extraction773836 -Ref: String Extraction-Footnote-1774797 -Node: Printf Ordering774883 -Ref: Printf Ordering-Footnote-1777665 -Node: I18N Portability777729 -Ref: I18N Portability-Footnote-1780178 -Node: I18N Example780241 -Ref: I18N Example-Footnote-1782879 -Node: Gawk I18N782951 
-Node: Debugger783572 -Node: Debugging784543 -Node: Debugging Concepts784976 -Node: Debugging Terms786832 -Node: Awk Debugging789429 -Node: Sample Debugging Session790321 -Node: Debugger Invocation790841 -Node: Finding The Bug792174 -Node: List of Debugger Commands798661 -Node: Breakpoint Control799995 -Node: Debugger Execution Control803659 -Node: Viewing And Changing Data807019 -Node: Execution Stack810375 -Node: Debugger Info811842 -Node: Miscellaneous Debugger Commands815824 -Node: Readline Support821000 -Node: Limitations821831 -Node: Arbitrary Precision Arithmetic824083 -Ref: Arbitrary Precision Arithmetic-Footnote-1825732 -Node: General Arithmetic825880 -Node: Floating Point Issues827600 -Node: String Conversion Precision828481 -Ref: String Conversion Precision-Footnote-1830186 -Node: Unexpected Results830295 -Node: POSIX Floating Point Problems832448 -Ref: POSIX Floating Point Problems-Footnote-1836273 -Node: Integer Programming836311 -Node: Floating-point Programming838050 -Ref: Floating-point Programming-Footnote-1844381 -Ref: Floating-point Programming-Footnote-2844651 -Node: Floating-point Representation844915 -Node: Floating-point Context846080 -Ref: table-ieee-formats846919 -Node: Rounding Mode848303 -Ref: table-rounding-modes848782 -Ref: Rounding Mode-Footnote-1851797 -Node: Gawk and MPFR851976 -Node: Arbitrary Precision Floats853385 -Ref: Arbitrary Precision Floats-Footnote-1855828 -Node: Setting Precision856144 -Ref: table-predefined-precision-strings856830 -Node: Setting Rounding Mode858975 -Ref: table-gawk-rounding-modes859379 -Node: Floating-point Constants860566 -Node: Changing Precision861995 -Ref: Changing Precision-Footnote-1863392 -Node: Exact Arithmetic863566 -Node: Arbitrary Precision Integers866704 -Ref: Arbitrary Precision Integers-Footnote-1869719 -Node: Dynamic Extensions869866 -Node: Extension Intro871324 -Node: Plugin License872589 -Node: Extension Mechanism Outline873274 -Ref: load-extension873691 -Ref: load-new-function875169 
-Ref: call-new-function876164 -Node: Extension API Description878179 -Node: Extension API Functions Introduction879466 -Node: General Data Types884393 -Ref: General Data Types-Footnote-1890088 -Node: Requesting Values890387 -Ref: table-value-types-returned891124 -Node: Memory Allocation Functions892078 -Ref: Memory Allocation Functions-Footnote-1894824 -Node: Constructor Functions894920 -Node: Registration Functions896678 -Node: Extension Functions897363 -Node: Exit Callback Functions899665 -Node: Extension Version String900914 -Node: Input Parsers901564 -Node: Output Wrappers911321 -Node: Two-way processors915831 -Node: Printing Messages918039 -Ref: Printing Messages-Footnote-1919116 -Node: Updating `ERRNO'919268 -Node: Accessing Parameters920007 -Node: Symbol Table Access921237 -Node: Symbol table by name921751 -Node: Symbol table by cookie923727 -Ref: Symbol table by cookie-Footnote-1927859 -Node: Cached values927922 -Ref: Cached values-Footnote-1931412 -Node: Array Manipulation931503 -Ref: Array Manipulation-Footnote-1932601 -Node: Array Data Types932640 -Ref: Array Data Types-Footnote-1935343 -Node: Array Functions935435 -Node: Flattening Arrays939271 -Node: Creating Arrays946123 -Node: Extension API Variables950848 -Node: Extension Versioning951484 -Node: Extension API Informational Variables953385 -Node: Extension API Boilerplate954471 -Node: Finding Extensions958275 -Node: Extension Example958835 -Node: Internal File Description959565 -Node: Internal File Ops963656 -Ref: Internal File Ops-Footnote-1975165 -Node: Using Internal File Ops975305 -Ref: Using Internal File Ops-Footnote-1977658 -Node: Extension Samples977924 -Node: Extension Sample File Functions979448 -Node: Extension Sample Fnmatch987933 -Node: Extension Sample Fork989702 -Node: Extension Sample Inplace990915 -Node: Extension Sample Ord992693 -Node: Extension Sample Readdir993529 -Node: Extension Sample Revout995061 -Node: Extension Sample Rev2way995654 -Node: Extension Sample Read write 
array996344 -Node: Extension Sample Readfile998227 -Node: Extension Sample API Tests999327 -Node: Extension Sample Time999852 -Node: gawkextlib1001216 -Node: Language History1003997 -Node: V7/SVR3.11005590 -Node: SVR41007910 -Node: POSIX1009352 -Node: BTL1010738 -Node: POSIX/GNU1011472 -Node: Feature History1017071 -Node: Common Extensions1030047 -Node: Ranges and Locales1031359 -Ref: Ranges and Locales-Footnote-11035976 -Ref: Ranges and Locales-Footnote-21036003 -Ref: Ranges and Locales-Footnote-31036237 -Node: Contributors1036458 -Node: Installation1041839 -Node: Gawk Distribution1042733 -Node: Getting1043217 -Node: Extracting1044043 -Node: Distribution contents1045735 -Node: Unix Installation1051456 -Node: Quick Installation1052073 -Node: Additional Configuration Options1054519 -Node: Configuration Philosophy1056255 -Node: Non-Unix Installation1058609 -Node: PC Installation1059067 -Node: PC Binary Installation1060366 -Node: PC Compiling1062214 -Node: PC Testing1065158 -Node: PC Using1066334 -Node: Cygwin1070502 -Node: MSYS1071311 -Node: VMS Installation1071825 -Node: VMS Compilation1072621 -Ref: VMS Compilation-Footnote-11073873 -Node: VMS Dynamic Extensions1073931 -Node: VMS Installation Details1075304 -Node: VMS Running1077555 -Node: VMS GNV1080389 -Node: VMS Old Gawk1081112 -Node: Bugs1081582 -Node: Other Versions1085500 -Node: Notes1091584 -Node: Compatibility Mode1092384 -Node: Additions1093167 -Node: Accessing The Source1094094 -Node: Adding Code1095534 -Node: New Ports1101579 -Node: Derived Files1105714 -Ref: Derived Files-Footnote-11111035 -Ref: Derived Files-Footnote-21111069 -Ref: Derived Files-Footnote-31111669 -Node: Future Extensions1111767 -Node: Implementation Limitations1112350 -Node: Extension Design1113602 -Node: Old Extension Problems1114756 -Ref: Old Extension Problems-Footnote-11116264 -Node: Extension New Mechanism Goals1116321 -Ref: Extension New Mechanism Goals-Footnote-11119686 -Node: Extension Other Design Decisions1119872 -Node: 
Extension Future Growth1121978 -Node: Old Extension Mechanism1122814 -Node: Basic Concepts1124554 -Node: Basic High Level1125235 -Ref: figure-general-flow1125507 -Ref: figure-process-flow1126106 -Ref: Basic High Level-Footnote-11129335 -Node: Basic Data Typing1129520 -Node: Glossary1132875 -Node: Copying1158106 -Node: GNU Free Documentation License1195662 -Node: Index1220798 +Node: DOS Quoting88093 +Node: Sample Data Files88768 +Node: Very Simple91283 +Node: Two Rules95933 +Node: More Complex97828 +Ref: More Complex-Footnote-1100760 +Node: Statements/Lines100845 +Ref: Statements/Lines-Footnote-1105300 +Node: Other Features105565 +Node: When106493 +Node: Invoking Gawk108641 +Node: Command Line110104 +Node: Options110887 +Ref: Options-Footnote-1126699 +Node: Other Arguments126724 +Node: Naming Standard Input129386 +Node: Environment Variables130480 +Node: AWKPATH Variable131038 +Ref: AWKPATH Variable-Footnote-1133816 +Ref: AWKPATH Variable-Footnote-2133861 +Node: AWKLIBPATH Variable134121 +Node: Other Environment Variables134880 +Node: Exit Status138045 +Node: Include Files138720 +Node: Loading Shared Libraries142298 +Node: Obsolete143681 +Node: Undocumented144378 +Node: Regexp144620 +Node: Regexp Usage146009 +Node: Escape Sequences148042 +Node: Regexp Operators153709 +Ref: Regexp Operators-Footnote-1161189 +Ref: Regexp Operators-Footnote-2161336 +Node: Bracket Expressions161434 +Ref: table-char-classes163324 +Node: GNU Regexp Operators165847 +Node: Case-sensitivity169570 +Ref: Case-sensitivity-Footnote-1172462 +Ref: Case-sensitivity-Footnote-2172697 +Node: Leftmost Longest172805 +Node: Computed Regexps174006 +Node: Reading Files177355 +Node: Records179357 +Ref: Records-Footnote-1188880 +Node: Fields188917 +Ref: Fields-Footnote-1191873 +Node: Nonconstant Fields191959 +Node: Changing Fields194165 +Node: Field Separators200124 +Node: Default Field Splitting202826 +Node: Regexp Field Splitting203943 +Node: Single Character Fields207284 +Node: Command Line Field 
Separator208343 +Node: Full Line Fields211685 +Ref: Full Line Fields-Footnote-1212193 +Node: Field Splitting Summary212239 +Ref: Field Splitting Summary-Footnote-1215338 +Node: Constant Size215439 +Node: Splitting By Content220046 +Ref: Splitting By Content-Footnote-1223795 +Node: Multiple Line223835 +Ref: Multiple Line-Footnote-1229682 +Node: Getline229861 +Node: Plain Getline232077 +Node: Getline/Variable234172 +Node: Getline/File235319 +Node: Getline/Variable/File236660 +Ref: Getline/Variable/File-Footnote-1238259 +Node: Getline/Pipe238346 +Node: Getline/Variable/Pipe241045 +Node: Getline/Coprocess242152 +Node: Getline/Variable/Coprocess243404 +Node: Getline Notes244141 +Node: Getline Summary246928 +Ref: table-getline-variants247336 +Node: Read Timeout248248 +Ref: Read Timeout-Footnote-1251987 +Node: Command line directories252045 +Node: Printing252675 +Node: Print254306 +Node: Print Examples255643 +Node: Output Separators258427 +Node: OFMT260443 +Node: Printf261801 +Node: Basic Printf262707 +Node: Control Letters264246 +Node: Format Modifiers268066 +Node: Printf Examples274075 +Node: Redirection276787 +Node: Special Files283761 +Node: Special FD284294 +Ref: Special FD-Footnote-1287919 +Node: Special Network287993 +Node: Special Caveats288843 +Node: Close Files And Pipes289639 +Ref: Close Files And Pipes-Footnote-1296622 +Ref: Close Files And Pipes-Footnote-2296770 +Node: Expressions296920 +Node: Values298052 +Node: Constants298728 +Node: Scalar Constants299408 +Ref: Scalar Constants-Footnote-1300267 +Node: Nondecimal-numbers300449 +Node: Regexp Constants303449 +Node: Using Constant Regexps303924 +Node: Variables306979 +Node: Using Variables307634 +Node: Assignment Options309358 +Node: Conversion311233 +Ref: table-locale-affects316733 +Ref: Conversion-Footnote-1317357 +Node: All Operators317466 +Node: Arithmetic Ops318096 +Node: Concatenation320601 +Ref: Concatenation-Footnote-1323389 +Node: Assignment Ops323509 +Ref: table-assign-ops328497 +Node: Increment 
Ops329828 +Node: Truth Values and Conditions333262 +Node: Truth Values334345 +Node: Typing and Comparison335394 +Node: Variable Typing336187 +Ref: Variable Typing-Footnote-1340084 +Node: Comparison Operators340206 +Ref: table-relational-ops340616 +Node: POSIX String Comparison344164 +Ref: POSIX String Comparison-Footnote-1345120 +Node: Boolean Ops345258 +Ref: Boolean Ops-Footnote-1349328 +Node: Conditional Exp349419 +Node: Function Calls351151 +Node: Precedence354745 +Node: Locales358414 +Node: Patterns and Actions359503 +Node: Pattern Overview360557 +Node: Regexp Patterns362226 +Node: Expression Patterns362769 +Node: Ranges366550 +Node: BEGIN/END369654 +Node: Using BEGIN/END370416 +Ref: Using BEGIN/END-Footnote-1373152 +Node: I/O And BEGIN/END373258 +Node: BEGINFILE/ENDFILE375540 +Node: Empty378454 +Node: Using Shell Variables378771 +Node: Action Overview381056 +Node: Statements383413 +Node: If Statement385267 +Node: While Statement386766 +Node: Do Statement388810 +Node: For Statement389966 +Node: Switch Statement393118 +Node: Break Statement395272 +Node: Continue Statement397262 +Node: Next Statement399055 +Node: Nextfile Statement401445 +Node: Exit Statement404100 +Node: Built-in Variables406516 +Node: User-modified407611 +Ref: User-modified-Footnote-1415969 +Node: Auto-set416031 +Ref: Auto-set-Footnote-1429098 +Ref: Auto-set-Footnote-2429303 +Node: ARGC and ARGV429359 +Node: Arrays433213 +Node: Array Basics434718 +Node: Array Intro435544 +Node: Reference to Elements439861 +Node: Assigning Elements442131 +Node: Array Example442622 +Node: Scanning an Array444354 +Node: Controlling Scanning446668 +Ref: Controlling Scanning-Footnote-1451755 +Node: Delete452071 +Ref: Delete-Footnote-1454836 +Node: Numeric Array Subscripts454893 +Node: Uninitialized Subscripts457076 +Node: Multidimensional458703 +Node: Multiscanning461796 +Node: Arrays of Arrays463385 +Node: Functions468025 +Node: Built-in468844 +Node: Calling Built-in469922 +Node: Numeric Functions471910 +Ref: 
Numeric Functions-Footnote-1475744 +Ref: Numeric Functions-Footnote-2476101 +Ref: Numeric Functions-Footnote-3476149 +Node: String Functions476418 +Ref: String Functions-Footnote-1499421 +Ref: String Functions-Footnote-2499550 +Ref: String Functions-Footnote-3499798 +Node: Gory Details499885 +Ref: table-sub-escapes501564 +Ref: table-sub-posix-92502918 +Ref: table-sub-proposed504269 +Ref: table-posix-sub505623 +Ref: table-gensub-escapes507168 +Ref: Gory Details-Footnote-1508344 +Ref: Gory Details-Footnote-2508395 +Node: I/O Functions508546 +Ref: I/O Functions-Footnote-1515542 +Node: Time Functions515689 +Ref: Time Functions-Footnote-1526682 +Ref: Time Functions-Footnote-2526750 +Ref: Time Functions-Footnote-3526908 +Ref: Time Functions-Footnote-4527019 +Ref: Time Functions-Footnote-5527131 +Ref: Time Functions-Footnote-6527358 +Node: Bitwise Functions527624 +Ref: table-bitwise-ops528186 +Ref: Bitwise Functions-Footnote-1532431 +Node: Type Functions532615 +Node: I18N Functions533766 +Node: User-defined535418 +Node: Definition Syntax536222 +Ref: Definition Syntax-Footnote-1541136 +Node: Function Example541205 +Ref: Function Example-Footnote-1543854 +Node: Function Caveats543876 +Node: Calling A Function544394 +Node: Variable Scope545349 +Node: Pass By Value/Reference548312 +Node: Return Statement551820 +Node: Dynamic Typing554801 +Node: Indirect Calls555732 +Node: Library Functions565419 +Ref: Library Functions-Footnote-1568932 +Ref: Library Functions-Footnote-2569075 +Node: Library Names569246 +Ref: Library Names-Footnote-1572719 +Ref: Library Names-Footnote-2572939 +Node: General Functions573025 +Node: Strtonum Function574053 +Node: Assert Function576983 +Node: Round Function580309 +Node: Cliff Random Function581850 +Node: Ordinal Functions582866 +Ref: Ordinal Functions-Footnote-1585943 +Ref: Ordinal Functions-Footnote-2586195 +Node: Join Function586406 +Ref: Join Function-Footnote-1588177 +Node: Getlocaltime Function588377 +Node: Readfile Function592118 +Node: Data 
File Management593957 +Node: Filetrans Function594589 +Node: Rewind Function598658 +Node: File Checking600045 +Node: Empty Files601139 +Node: Ignoring Assigns603369 +Node: Getopt Function604923 +Ref: Getopt Function-Footnote-1616226 +Node: Passwd Functions616429 +Ref: Passwd Functions-Footnote-1625407 +Node: Group Functions625495 +Node: Walking Arrays633579 +Node: Sample Programs635715 +Node: Running Examples636389 +Node: Clones637117 +Node: Cut Program638341 +Node: Egrep Program648192 +Ref: Egrep Program-Footnote-1655965 +Node: Id Program656075 +Node: Split Program659724 +Ref: Split Program-Footnote-1663243 +Node: Tee Program663371 +Node: Uniq Program666174 +Node: Wc Program673603 +Ref: Wc Program-Footnote-1677869 +Ref: Wc Program-Footnote-2678069 +Node: Miscellaneous Programs678161 +Node: Dupword Program679349 +Node: Alarm Program681380 +Node: Translate Program686187 +Ref: Translate Program-Footnote-1690574 +Ref: Translate Program-Footnote-2690822 +Node: Labels Program690956 +Ref: Labels Program-Footnote-1694327 +Node: Word Sorting694411 +Node: History Sorting698295 +Node: Extract Program700134 +Ref: Extract Program-Footnote-1707637 +Node: Simple Sed707765 +Node: Igawk Program710827 +Ref: Igawk Program-Footnote-1725998 +Ref: Igawk Program-Footnote-2726199 +Node: Anagram Program726337 +Node: Signature Program729405 +Node: Advanced Features730505 +Node: Nondecimal Data732391 +Node: Array Sorting733974 +Node: Controlling Array Traversal734671 +Node: Array Sorting Functions742955 +Ref: Array Sorting Functions-Footnote-1746824 +Node: Two-way I/O747018 +Ref: Two-way I/O-Footnote-1752450 +Node: TCP/IP Networking752532 +Node: Profiling755376 +Node: Internationalization762879 +Node: I18N and L10N764304 +Node: Explaining gettext764990 +Ref: Explaining gettext-Footnote-1770058 +Ref: Explaining gettext-Footnote-2770242 +Node: Programmer i18n770407 +Node: Translator i18n774634 +Node: String Extraction775428 +Ref: String Extraction-Footnote-1776389 +Node: Printf Ordering776475 
+Ref: Printf Ordering-Footnote-1779257 +Node: I18N Portability779321 +Ref: I18N Portability-Footnote-1781770 +Node: I18N Example781833 +Ref: I18N Example-Footnote-1784471 +Node: Gawk I18N784543 +Node: Debugger785164 +Node: Debugging786135 +Node: Debugging Concepts786568 +Node: Debugging Terms788424 +Node: Awk Debugging791021 +Node: Sample Debugging Session791913 +Node: Debugger Invocation792433 +Node: Finding The Bug793766 +Node: List of Debugger Commands800253 +Node: Breakpoint Control801587 +Node: Debugger Execution Control805251 +Node: Viewing And Changing Data808611 +Node: Execution Stack811967 +Node: Debugger Info813434 +Node: Miscellaneous Debugger Commands817428 +Node: Readline Support822606 +Node: Limitations823437 +Node: Arbitrary Precision Arithmetic825689 +Ref: Arbitrary Precision Arithmetic-Footnote-1827338 +Node: General Arithmetic827486 +Node: Floating Point Issues829206 +Node: String Conversion Precision830087 +Ref: String Conversion Precision-Footnote-1831792 +Node: Unexpected Results831901 +Node: POSIX Floating Point Problems834054 +Ref: POSIX Floating Point Problems-Footnote-1837879 +Node: Integer Programming837917 +Node: Floating-point Programming839656 +Ref: Floating-point Programming-Footnote-1845987 +Ref: Floating-point Programming-Footnote-2846257 +Node: Floating-point Representation846521 +Node: Floating-point Context847686 +Ref: table-ieee-formats848525 +Node: Rounding Mode849909 +Ref: table-rounding-modes850388 +Ref: Rounding Mode-Footnote-1853403 +Node: Gawk and MPFR853582 +Node: Arbitrary Precision Floats854991 +Ref: Arbitrary Precision Floats-Footnote-1857434 +Node: Setting Precision857750 +Ref: table-predefined-precision-strings858436 +Node: Setting Rounding Mode860581 +Ref: table-gawk-rounding-modes860985 +Node: Floating-point Constants862172 +Node: Changing Precision863601 +Ref: Changing Precision-Footnote-1864998 +Node: Exact Arithmetic865172 +Node: Arbitrary Precision Integers868310 +Ref: Arbitrary Precision 
Integers-Footnote-1871325 +Node: Dynamic Extensions871472 +Node: Extension Intro872930 +Node: Plugin License874195 +Node: Extension Mechanism Outline874880 +Ref: load-extension875297 +Ref: load-new-function876775 +Ref: call-new-function877770 +Node: Extension API Description879785 +Node: Extension API Functions Introduction881072 +Node: General Data Types885999 +Ref: General Data Types-Footnote-1891694 +Node: Requesting Values891993 +Ref: table-value-types-returned892730 +Node: Memory Allocation Functions893684 +Ref: Memory Allocation Functions-Footnote-1896430 +Node: Constructor Functions896526 +Node: Registration Functions898284 +Node: Extension Functions898969 +Node: Exit Callback Functions901271 +Node: Extension Version String902520 +Node: Input Parsers903170 +Node: Output Wrappers912927 +Node: Two-way processors917437 +Node: Printing Messages919645 +Ref: Printing Messages-Footnote-1920722 +Node: Updating `ERRNO'920874 +Node: Accessing Parameters921613 +Node: Symbol Table Access922843 +Node: Symbol table by name923357 +Node: Symbol table by cookie925333 +Ref: Symbol table by cookie-Footnote-1929465 +Node: Cached values929528 +Ref: Cached values-Footnote-1933018 +Node: Array Manipulation933109 +Ref: Array Manipulation-Footnote-1934207 +Node: Array Data Types934246 +Ref: Array Data Types-Footnote-1936949 +Node: Array Functions937041 +Node: Flattening Arrays940877 +Node: Creating Arrays947729 +Node: Extension API Variables952454 +Node: Extension Versioning953090 +Node: Extension API Informational Variables954991 +Node: Extension API Boilerplate956077 +Node: Finding Extensions959881 +Node: Extension Example960441 +Node: Internal File Description961171 +Node: Internal File Ops965262 +Ref: Internal File Ops-Footnote-1976771 +Node: Using Internal File Ops976911 +Ref: Using Internal File Ops-Footnote-1979258 +Node: Extension Samples979524 +Node: Extension Sample File Functions981048 +Node: Extension Sample Fnmatch989535 +Node: Extension Sample Fork991304 +Node: 
Extension Sample Inplace992517 +Node: Extension Sample Ord994295 +Node: Extension Sample Readdir995131 +Node: Extension Sample Revout996663 +Node: Extension Sample Rev2way997256 +Node: Extension Sample Read write array997946 +Node: Extension Sample Readfile999829 +Node: Extension Sample API Tests1000929 +Node: Extension Sample Time1001454 +Node: gawkextlib1002818 +Node: Language History1005599 +Node: V7/SVR3.11007192 +Node: SVR41009512 +Node: POSIX1010954 +Node: BTL1012340 +Node: POSIX/GNU1013074 +Node: Feature History1018673 +Node: Common Extensions1031649 +Node: Ranges and Locales1032961 +Ref: Ranges and Locales-Footnote-11037578 +Ref: Ranges and Locales-Footnote-21037605 +Ref: Ranges and Locales-Footnote-31037839 +Node: Contributors1038060 +Node: Installation1043441 +Node: Gawk Distribution1044335 +Node: Getting1044819 +Node: Extracting1045645 +Node: Distribution contents1047337 +Node: Unix Installation1053058 +Node: Quick Installation1053675 +Node: Additional Configuration Options1056121 +Node: Configuration Philosophy1057857 +Node: Non-Unix Installation1060211 +Node: PC Installation1060669 +Node: PC Binary Installation1061968 +Node: PC Compiling1063816 +Node: PC Testing1066760 +Node: PC Using1067936 +Node: Cygwin1072104 +Node: MSYS1072913 +Node: VMS Installation1073427 +Node: VMS Compilation1074223 +Ref: VMS Compilation-Footnote-11075475 +Node: VMS Dynamic Extensions1075533 +Node: VMS Installation Details1076906 +Node: VMS Running1079157 +Node: VMS GNV1081991 +Node: VMS Old Gawk1082714 +Node: Bugs1083184 +Node: Other Versions1087102 +Node: Notes1093186 +Node: Compatibility Mode1093986 +Node: Additions1094769 +Node: Accessing The Source1095696 +Node: Adding Code1097136 +Node: New Ports1103181 +Node: Derived Files1107316 +Ref: Derived Files-Footnote-11112637 +Ref: Derived Files-Footnote-21112671 +Ref: Derived Files-Footnote-31113271 +Node: Future Extensions1113369 +Node: Implementation Limitations1113952 +Node: Extension Design1115204 +Node: Old Extension 
Problems1116358 +Ref: Old Extension Problems-Footnote-11117866 +Node: Extension New Mechanism Goals1117923 +Ref: Extension New Mechanism Goals-Footnote-11121288 +Node: Extension Other Design Decisions1121474 +Node: Extension Future Growth1123580 +Node: Old Extension Mechanism1124416 +Node: Basic Concepts1126156 +Node: Basic High Level1126837 +Ref: figure-general-flow1127109 +Ref: figure-process-flow1127708 +Ref: Basic High Level-Footnote-11130937 +Node: Basic Data Typing1131122 +Node: Glossary1134477 +Node: Copying1159708 +Node: GNU Free Documentation License1197264 +Node: Index1222400 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 353adb7c..872263d4 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -1987,8 +1987,9 @@ May, 2014 @part Part I:@* The @command{awk} Language @end iftex -@ignore @ifdocbook +@part The @command{awk} Language + Part I describes the @command{awk} language and @command{gawk} program in detail. It starts with the basics, and continues through all of the features of @command{awk}. Included also are many, but not all, @@ -2024,7 +2025,6 @@ following chapters: @ref{Functions}. @end itemize @end ifdocbook -@end ignore @node Getting Started @chapter Getting Started with @command{awk} @@ -2501,6 +2501,27 @@ knowledge of shell quoting rules. The following rules apply only to POSIX-compliant, Bourne-style shells (such as Bash, the GNU Bourne-Again Shell). If you use the C shell, you're on your own. +Before diving into the rules, we introduce a concept that appears +throughout this @value{DOCUMENT}, which is that of the @dfn{null}, +or empty, string. + +The null string is character data that has no value. +In other words, it is empty. It is written in @command{awk} programs +like this: @code{""}. In the shell, it can be written using single +or double quotes: @code{""} or @code{''}. While the null string has +no characters in it, it does exist. 
Consider this command: + +@example +$ @kbd{echo ""} +@end example + +@noindent +Here, the @command{echo} utility receives a single argument, even +though that argument has no characters in it. In the rest of this +@value{DOCUMENT}, we use the terms @dfn{null string} and @dfn{empty string} +interchangeably. Now, on to the quoting rules. + + @itemize @bullet @item Quoted items can be concatenated with nonquoted items as well as with other @@ -2676,6 +2697,7 @@ Although this @value{DOCUMENT} generally only worries about POSIX systems and th POSIX shell, the following issue arises often enough for many users that it is worth addressing. +@cindex Brink, Jeroen The ``shells'' on Microsoft Windows systems use the double-quote character for quoting, and make it difficult or impossible to include an escaped double-quote character in a command-line script. @@ -2689,7 +2711,6 @@ gawk "@{ print \"\042\" $0 \"\042\" @}" @var{file} @node Sample Data Files @section Data Files for the Examples -@c For gawk >= 4.0, update these data files. No-one has such slow modems! @cindex input files, examples @cindex @code{mail-list} file @@ -2846,7 +2867,7 @@ awk 'length($0) > 80' data @end example The sole rule has a relational expression as its pattern and it has no -action---so the default action, printing the record, is used. +action---so it uses the default action, printing the record. @cindex @command{expand} utility @item @@ -2929,9 +2950,9 @@ the program would print the odd-numbered lines. The @command{awk} utility reads the input files one line at a time. For each line, @command{awk} tries the patterns of each of the rules. -If several patterns match, then several actions are run in the order in +If several patterns match, then several actions execture in the order in which they appear in the @command{awk} program. If no patterns match, then -no actions are run. +no actions run. 
After processing all the rules that match the line (and perhaps there are none), @command{awk} reads the next line. (However, @@ -3023,8 +3044,8 @@ needed to produce this traditional-style output from @command{ls}.} The @samp{$6 == "Nov"} in our @command{awk} program is an expression that tests whether the sixth field of the output from @w{@samp{ls -l}} matches the string @samp{Nov}. Each time a line has the string -@samp{Nov} for its sixth field, the action @samp{sum += $5} is -performed. This adds the fifth field (the file's size) to the variable +@samp{Nov} for its sixth field, @command{awk} performs the action +@samp{sum += $5}. This adds the fifth field (the file's size) to the variable @code{sum}. As a result, when @command{awk} has finished reading all the input lines, @code{sum} is the total of the sizes of the files whose lines matched the pattern. (This works because @command{awk} variables @@ -3091,7 +3112,7 @@ We have generally not used backslash continuation in our sample programs. @command{gawk} places no limit on the length of a line, so backslash continuation is never strictly necessary; it just makes programs more readable. For this same reason, as well as -for clarity, we have kept most statements short in the sample programs +for clarity, we have kept most statements short in the programs presented throughout the @value{DOCUMENT}. Backslash continuation is most useful when your @command{awk} program is in a separate source file instead of entered from the command line. You should also note that @@ -3240,12 +3261,15 @@ that it has are much larger than they used to be. @cindex @command{awk} programs, complex If you find yourself writing @command{awk} scripts of more than, say, a few hundred lines, you might consider using a different programming -language. Emacs Lisp is a good choice if you need sophisticated string -or pattern matching capabilities. The shell is also good at string and +language. 
+The shell is good at string and pattern matching; in addition, it allows powerful use of the system utilities. More conventional languages, such as C, C++, and Java, offer better facilities for system programming and for managing the complexity -of large programs. Programs in these languages may require more lines +of large programs. +Python offers a nice balance between high-level ease of programming and +access to system facilities. +Programs in these languages may require more lines of source code than the equivalent @command{awk} programs, but they are easier to maintain and usually run more efficiently. @@ -3429,9 +3453,10 @@ program; see @ref{Getopt Function}. The following list describes @command{gawk}-specific options: -@table @code -@item -b -@itemx --characters-as-bytes +@c Have to use @asis here to get docbook to come out right. +@table @asis +@item @option{-b} +@itemx @option{--characters-as-bytes} @cindex @option{-b} option @cindex @option{--characters-as-bytes} option Cause @command{gawk} to treat all input data as single-byte characters. @@ -3439,14 +3464,14 @@ In addition, all output written with @code{print} or @code{printf} are treated as single-byte characters. Normally, @command{gawk} follows the POSIX standard and attempts to process -its input data according to the current locale. This can often involve +its input data according to the current locale (@pxref{Locales}). This can often involve converting multibyte characters into wide characters (internally), and can lead to problems or confusion if the input data does not contain valid multibyte characters. This option is an easy way to tell @command{gawk}: ``hands off my data!''. -@item -c -@itemx --traditional +@item @option{-c} +@itemx @option{--traditional} @cindex @option{-c} option @cindex @option{--traditional} option @cindex compatibility mode (@command{gawk}), specifying @@ -3457,15 +3482,15 @@ like Brian Kernighan's version @command{awk}. which summarizes the extensions. 
Also see @ref{Compatibility Mode}. -@item -C -@itemx --copyright +@item @option{-C} +@itemx @option{--copyright} @cindex @option{-C} option @cindex @option{--copyright} option @cindex GPL (General Public License), printing Print the short version of the General Public License and then exit. -@item -d@r{[}@var{file}@r{]} -@itemx --dump-variables@r{[}=@var{file}@r{]} +@item @option{-d}[@var{file}] +@itemx @option{--dump-variables}[@code{=}@var{file}] @cindex @option{-d} option @cindex @option{--dump-variables} option @cindex dump all variables of a program @@ -3487,8 +3512,8 @@ inadvertently use global variables that you meant to be local. (This is a particularly easy mistake to make with simple variable names like @code{i}, @code{j}, etc.) -@item -D@r{[}@var{file}@r{]} -@itemx --debug=@r{[}@var{file}@r{]} +@item @option{-D}[@var{file}] +@itemx @option{--debug}[@code{=}@var{file}] @cindex @option{-D} option @cindex @option{--debug} option @cindex @command{awk} debugging, enabling @@ -3500,8 +3525,8 @@ of commands for the debugger to execute non-interactively. No space is allowed between the @option{-D} and @var{file}, if @var{file} is supplied. -@item -e @var{program-text} -@itemx --source @var{program-text} +@item @option{-e} @var{program-text} +@itemx @option{--source} @var{program-text} @cindex @option{-e} option @cindex @option{--source} option @cindex source code, mixing @@ -3512,8 +3537,8 @@ This is particularly useful when you have library functions that you want to use from your command-line programs (@pxref{AWKPATH Variable}). 
-@item -E @var{file} -@itemx --exec @var{file} +@item @option{-E} @var{file} +@itemx @option{--exec} @var{file} @cindex @option{-E} option @cindex @option{--exec} option @cindex @command{awk} programs, location of @@ -3543,8 +3568,8 @@ with @samp{#!} scripts (@pxref{Executable Scripts}), like so: @var{awk program here @dots{}} @end example -@item -g -@itemx --gen-pot +@item @option{-g} +@itemx @option{--gen-pot} @cindex @option{-g} option @cindex @option{--gen-pot} option @cindex portable object files, generating @@ -3555,8 +3580,8 @@ output for all string constants that have been marked for translation. @xref{Internationalization}, for information about this option. -@item -h -@itemx --help +@item @option{-h} +@itemx @option{--help} @cindex @option{-h} option @cindex @option{--help} option @cindex GNU long options, printing list of @@ -3565,42 +3590,47 @@ for information about this option. Print a ``usage'' message summarizing the short and long style options that @command{gawk} accepts and then exit. -@item -i @var{source-file} -@itemx --include @var{source-file} +@item @option{-i} @var{source-file} +@itemx @option{--include} @var{source-file} @cindex @option{-i} option @cindex @option{--include} option @cindex @command{awk} programs, location of -Read @command{awk} source library from @var{source-file}. This option is -completely equivalent to using the @samp{@@include} directive inside -your program. This option is very -similar to the @option{-f} option, but there are two important differences. -First, when @option{-i} is used, the program source will not be loaded if it has -been previously loaded, whereas the @option{-f} will always load the file. +Read @command{awk} source library from @var{source-file}. This option +is completely equivalent to using the @code{@@include} directive inside +your program. This option is very similar to the @option{-f} option, +but there are two important differences. 
First, when @option{-i} is +used, the program source is not loaded if it has been previously +loaded, whereas with @option{-f}, @command{gawk} always loads the file. Second, because this option is intended to be used with code libraries, @command{gawk} does not recognize such files as constituting main program -input. Thus, after processing an @option{-i} argument, @command{gawk} still expects to -find the main source code via the @option{-f} option or on the command-line. +input. Thus, after processing an @option{-i} argument, @command{gawk} +still expects to find the main source code via the @option{-f} option +or on the command-line. -@item -l @var{lib} -@itemx --load @var{lib} +@item @option{-l} @var{ext} +@itemx @option{--load} @var{ext} @cindex @option{-l} option @cindex @option{--load} option -@cindex loading, library -Load a shared library @var{lib}. This searches for the library using the @env{AWKLIBPATH} +@cindex loading, extensions +Load a dynamic extension named @var{ext}. Extensions +are stored as system shared libraries. +This option searches for the library using the @env{AWKLIBPATH} environment variable. The correct library suffix for your platform will be -supplied by default, so it need not be specified in the library name. -The library initialization routine should be named @code{dl_load()}. -An alternative is to use the @samp{@@load} keyword inside the program to load -a shared library. +supplied by default, so it need not be specified in the extension name. +The extension initialization routine should be named @code{dl_load()}. +An alternative is to use the @code{@@load} keyword inside the program to load +a shared library. This feature is described in detail in @ref{Dynamic Extensions}. 
-@item -L @r{[}value@r{]} -@itemx --lint@r{[}=value@r{]} +@item @option{-L}[@var{value}] +@itemx @option{--lint}[@code{=}@var{value}] @cindex @option{-l} option @cindex @option{--lint} option @cindex lint checking, issuing warnings @cindex warnings, issuing Warn about constructs that are dubious or nonportable to other @command{awk} implementations. +No space is allowed between the @option{-D} and @var{value}, if +@var{value} is supplied. Some warnings are issued when @command{gawk} first reads your program. Others are issued at runtime, as your program executes. With an optional argument of @samp{fatal}, @@ -3616,16 +3646,16 @@ when eliminating problems pointed out by @option{--lint}, you should take care to search for all occurrences of each inappropriate construct. As @command{awk} programs are usually short, doing so is not burdensome. -@item -M -@itemx --bignum +@item @option{-M} +@itemx @option{--bignum} @cindex @option{-M} option @cindex @option{--bignum} option Force arbitrary precision arithmetic on numbers. This option has no effect if @command{gawk} is not compiled to use the GNU MPFR and MP libraries (@pxref{Gawk and MPFR}). -@item -n -@itemx --non-decimal-data +@item @option{-n} +@itemx @option{--non-decimal-data} @cindex @option{-n} option @cindex @option{--non-decimal-data} option @cindex hexadecimal values@comma{} enabling interpretation of @@ -3640,34 +3670,41 @@ This option can severely break old programs. Use with care. @end quotation -@item -N -@itemx --use-lc-numeric +@item @option{-N} +@itemx @option{--use-lc-numeric} @cindex @option{-N} option @cindex @option{--use-lc-numeric} option Force the use of the locale's decimal point character when parsing numeric input data (@pxref{Locales}). 
-@item -o@r{[}@var{file}@r{]} -@itemx --pretty-print@r{[}=@var{file}@r{]} +@item @option{-o}[@var{file}] +@itemx @option{--pretty-print}[@code{=}@var{file}] @cindex @option{-o} option @cindex @option{--pretty-print} option Enable pretty-printing of @command{awk} programs. -By default, output program is created in a file named @file{awkprof.out}. +By default, output program is created in a file named @file{awkprof.out} +(@pxref{Profiling}). The optional @var{file} argument allows you to specify a different file name for the output. No space is allowed between the @option{-o} and @var{file}, if @var{file} is supplied. -@item -O -@itemx --optimize +@quotation NOTE +Due to the way @command{gawk} has evolved, with this option +your program is still executed. This will change in the +next major release such that @command{gawk} will only +pretty-print the program and not run it. +@end quotation + +@item @option{-O} +@itemx @option{--optimize} @cindex @option{--optimize} option @cindex @option{-O} option Enable some optimizations on the internal representation of the program. -At the moment this includes just simple constant folding. The @command{gawk} -maintainer hopes to add more optimizations over time. +At the moment this includes just simple constant folding. -@item -p@r{[}@var{file}@r{]} -@itemx --profile@r{[}=@var{file}@r{]} +@item @option{-p}[@var{file}] +@itemx @option{--profile}[@code{=}@var{file}] @cindex @option{-p} option @cindex @option{--profile} option @cindex @command{awk} profiling, enabling @@ -3682,8 +3719,8 @@ No space is allowed between the @option{-p} and @var{file}, if The profile contains execution counts for each statement in the program in the left margin, and function call counts for each function. -@item -P -@itemx --posix +@item @option{-P} +@itemx @option{--posix} @cindex @option{-P} option @cindex @option{--posix} option @cindex POSIX mode @@ -3730,10 +3767,10 @@ data (@pxref{Locales}). 
@cindex @option{--posix} option, @code{--traditional} option and If you supply both @option{--traditional} and @option{--posix} on the command line, @option{--posix} takes precedence. @command{gawk} -also issues a warning if both options are supplied. +issues a warning if both options are supplied. -@item -r -@itemx --re-interval +@item @option{-r} +@itemx @option{--re-interval} @cindex @option{-r} option @cindex @option{--re-interval} option @cindex regular expressions, interval expressions and @@ -3742,10 +3779,10 @@ Allow interval expressions in regexps. This is now @command{gawk}'s default behavior. Nevertheless, this option remains both for backward compatibility, -and for use in combination with the @option{--traditional} option. +and for use in combination with @option{--traditional}. -@item -S -@itemx --sandbox +@item @option{-S} +@itemx @option{--sandbox} @cindex @option{-S} option @cindex @option{--sandbox} option @cindex sandbox mode @@ -3757,16 +3794,16 @@ This is particularly useful when you want to run @command{awk} scripts from questionable sources and need to make sure the scripts can't access your system (other than the specified input data file). -@item -t -@itemx --lint-old +@item @option{-t} +@itemx @option{--lint-old} @cindex @option{-L} option @cindex @option{--lint-old} option Warn about constructs that are not available in the original version of @command{awk} from Version 7 Unix (@pxref{V7/SVR3.1}). -@item -V -@itemx --version +@item @option{-V} +@itemx @option{--version} @cindex @option{-V} option @cindex @option{--version} option @cindex @command{gawk}, versions of, information about@comma{} printing @@ -3808,13 +3845,13 @@ type @kbd{Ctrl-d} (the end-of-file character) to terminate it. input but then you will not be able to also use the standard input as a source of data.) 
-Because it is clumsy using the standard @command{awk} mechanisms to mix source -file and command-line @command{awk} programs, @command{gawk} provides the -@option{--source} option. This does not require you to pre-empt the standard -input for your source code; it allows you to easily mix command-line -and library source code -(@pxref{AWKPATH Variable}). -The @option{--source} option may also be used multiple times on the command line. +Because it is clumsy using the standard @command{awk} mechanisms to mix +source file and command-line @command{awk} programs, @command{gawk} +provides the @option{--source} option. This does not require you to +pre-empt the standard input for your source code; it allows you to easily +mix command-line and library source code (@pxref{AWKPATH Variable}). +As with @option{-f}, the @option{--source} and @option{--include} +options may also be used multiple times on the command line. @cindex @option{--source} option If no @option{-f} or @option{--source} option is specified, then @command{gawk} @@ -3826,7 +3863,7 @@ program source code. @cindex POSIX mode If the environment variable @env{POSIXLY_CORRECT} exists, then @command{gawk} behaves in strict POSIX mode, exactly as if -you had supplied the @option{--posix} command-line option. +you had supplied @option{--posix}. Many GNU programs look for this environment variable to suppress extensions that conflict with POSIX, but @command{gawk} behaves differently: it suppresses all extensions, even those that do not @@ -3905,7 +3942,7 @@ The variable values given on the command line are processed for escape sequences (@pxref{Escape Sequences}). @value{DARKCORNER} -In some earlier implementations of @command{awk}, when a variable assignment +In some very early implementations of @command{awk}, when a variable assignment occurred before any file names, the assignment would happen @emph{before} the @code{BEGIN} rule was executed. 
@command{awk}'s behavior was thus inconsistent; some command-line assignments were available inside the @@ -3917,7 +3954,7 @@ upon the old behavior. The variable assignment feature is most useful for assigning to variables such as @code{RS}, @code{OFS}, and @code{ORS}, which control input and -output formats before scanning the data files. It is also useful for +output formats, before scanning the data files. It is also useful for controlling state if multiple passes are needed over a data file. For example: @@ -3959,7 +3996,7 @@ with @code{getline}. Some other versions of @command{awk} also support this, but it is not standard. (Some operating systems provide a @file{/dev/stdin} file -in the file system, however, @command{gawk} always processes +in the file system; however, @command{gawk} always processes this file name itself.) @node Environment Variables @@ -4007,7 +4044,7 @@ directory is the value of @samp{$(datadir)} generated when @command{gawk} was configured. You probably don't need to worry about this, though.} -The search path feature is particularly useful for building libraries +The search path feature is particularly helpful for building libraries of useful @command{awk} functions. The library files can be placed in a standard directory in the default path and then specified on the command line with a short file name. Otherwise, the full file name @@ -4024,11 +4061,13 @@ If the source code is not found after the initial search, the path is searched again after adding the default @samp{.awk} suffix to the filename. @quotation NOTE +@c 4/2014: +@c using @samp{.} to get quotes, since @file{} no longer supplies them. To include the current directory in the path, either place -@file{.} explicitly in the path or write a null entry in the +@samp{.} explicitly in the path or write a null entry in the path. (A null entry is indicated by starting or ending the path with a -colon or by placing two colons next to each other (@samp{::}).) 
+colon or by placing two colons next to each other [@samp{::}].) This path search mechanism is similar to the shell's. @c someday, @cite{The Bourne Again Shell}.... @@ -4043,7 +4082,7 @@ the current directory in the search path. If @env{AWKPATH} is not defined in the environment, @command{gawk} places its default search path into @code{ENVIRON["AWKPATH"]}. This makes it easy to determine -the actual search path that @command{gawk} will use +the actual search path that @command{gawk} used from within an @command{awk} program. While you can change @code{ENVIRON["AWKPATH"]} within your @command{awk} @@ -4055,18 +4094,18 @@ found, and @command{gawk} no longer needs to use @env{AWKPATH}. @node AWKLIBPATH Variable @subsection The @env{AWKLIBPATH} Environment Variable @cindex @env{AWKLIBPATH} environment variable -@cindex directories, searching for shared libraries -@cindex search paths, for shared libraries +@cindex directories, searching for loadable extensions +@cindex search paths, for loadable extensions @cindex differences in @command{awk} and @command{gawk}, @code{AWKLIBPATH} environment variable The @env{AWKLIBPATH} environment variable is similar to the @env{AWKPATH} -variable, but it is used to search for shared libraries specified -with the @option{-l} option rather than for source files. If the library -is not found, the path is searched again after adding the appropriate -shared library suffix for the platform. For example, on GNU/Linux systems, -the suffix @samp{.so} is used. -The search path specified is also used for libraries loaded via the -@samp{@@load} keyword (@pxref{Loading Shared Libraries}). +variable, but it is used to search for loadable extensions (stored as +system shared libraries) specified with the @option{-l} option rather +than for source files. If the extension is not found, the path is +searched again after adding the appropriate shared library suffix for +the platform. For example, on GNU/Linux systems, the suffix @samp{.so} +is used. 
The search path specified is also used for extensions loaded +via the @code{@@load} keyword (@pxref{Loading Shared Libraries}). @node Other Environment Variables @subsection Other Environment Variables @@ -4082,7 +4121,7 @@ mode, disabling all traditional and GNU extensions. @xref{Options}. @item GAWK_SOCK_RETRIES -Controls the number of time @command{gawk} will attempt to +Controls the number of times @command{gawk} attempts to retry a two-way TCP/IP (socket) connection before giving up. @xref{TCP/IP Networking}. @@ -4130,6 +4169,11 @@ two regexp matchers that @command{gawk} uses internally. (There aren't supposed to be differences, but occasionally theory and practice don't coordinate with each other.) +@item GAWK_NO_PP_RUN +If this variable exists, then when invoked with the @option{--pretty-print} +option, @command{gawk} skips running the program. This variable will +not survive into the next major release. + @item GAWK_STACKSIZE This specifies the amount by which @command{gawk} should grow its internal evaluation stack, when needed. @@ -4174,13 +4218,13 @@ to @code{EXIT_FAILURE}. This @value{SECTION} describes a feature that is specific to @command{gawk}. -The @samp{@@include} keyword can be used to read external @command{awk} source +The @code{@@include} keyword can be used to read external @command{awk} source files. This gives you the ability to split large @command{awk} source files into smaller, more manageable pieces, and also lets you reuse common @command{awk} code from various @command{awk} scripts. In other words, you can group together @command{awk} functions, used to carry out specific tasks, into external files. These files can be used just like function libraries, -using the @samp{@@include} keyword in conjunction with the @env{AWKPATH} +using the @code{@@include} keyword in conjunction with the @env{AWKPATH} environment variable. Note that source files may also be included using the @option{-i} option. 
@@ -4214,14 +4258,14 @@ $ @kbd{gawk -f test2} @end example @code{gawk} runs the @file{test2} script which includes @file{test1} -using the @samp{@@include} +using the @code{@@include} keyword. So, to include external @command{awk} source files you just -use @samp{@@include} followed by the name of the file to be included, +use @code{@@include} followed by the name of the file to be included, enclosed in double quotes. @quotation NOTE Keep in mind that this is a language construct and the file name cannot -be a string variable, but rather just a literal string in double quotes. +be a string variable, but rather just a literal string constant in double quotes. @end quotation The files to be included may be nested; e.g., given a third @@ -4260,47 +4304,48 @@ or: @noindent are valid. The @code{AWKPATH} environment variable can be of great -value when using @samp{@@include}. The same rules for the use +value when using @code{@@include}. The same rules for the use of the @code{AWKPATH} variable in command-line file searches (@pxref{AWKPATH Variable}) apply to -@samp{@@include} also. +@code{@@include} also. This is very helpful in constructing @command{gawk} function libraries. If you have a large script with useful, general purpose @command{awk} functions, you can break it down into library files and put those files in a special directory. You can then include those ``libraries,'' using either the full pathnames of the files, or by setting the @code{AWKPATH} -environment variable accordingly and then using @samp{@@include} with +environment variable accordingly and then using @code{@@include} with just the file part of the full pathname. Of course you can have more than one directory to keep library files; the more complex the working environment is, the more directories you may need to organize the files to be included. Given the ability to specify multiple @option{-f} options, the -@samp{@@include} mechanism is not strictly necessary. 
-However, the @samp{@@include} keyword +@code{@@include} mechanism is not strictly necessary. +However, the @code{@@include} keyword can help you in constructing self-contained @command{gawk} programs, thus reducing the need for writing complex and tedious command lines. -In particular, @samp{@@include} is very useful for writing CGI scripts +In particular, @code{@@include} is very useful for writing CGI scripts to be run from web pages. As mentioned in @ref{AWKPATH Variable}, the current directory is always searched first for source files, before searching in @env{AWKPATH}, -and this also applies to files named with @samp{@@include}. +and this also applies to files named with @code{@@include}. @node Loading Shared Libraries -@section Loading Shared Libraries Into Your Program +@section Loading Dynamic Extensions Into Your Program This @value{SECTION} describes a feature that is specific to @command{gawk}. -The @samp{@@load} keyword can be used to read external @command{awk} shared -libraries. This allows you to link in compiled code that may offer superior +The @code{@@load} keyword can be used to read external @command{awk} extensions +(stored as system shared libraries). +This allows you to link in compiled code that may offer superior performance and/or give you access to extended capabilities not supported by the @command{awk} language. The @env{AWKLIBPATH} variable is used to -search for the shared library. Using @samp{@@load} is completely equivalent +search for the extension. Using @code{@@load} is completely equivalent to using the @option{-l} command-line option. -If the shared library is not initially found in @env{AWKLIBPATH}, another +If the extension is not initially found in @env{AWKLIBPATH}, another search is conducted after appending the platform's default shared library suffix to the filename. For example, on GNU/Linux systems, the suffix @samp{.so} is used. 
@@ -4320,11 +4365,11 @@ $ @kbd{gawk -lordchr 'BEGIN @{print chr(65)@}'} @noindent For command-line usage, the @option{-l} option is more convenient, -but @samp{@@load} is useful for embedding inside an @command{awk} source file -that requires access to a shared library. +but @code{@@load} is useful for embedding inside an @command{awk} source file +that requires access to an extension. @ref{Dynamic Extensions}, describes how to write extensions (in C or C++) -that can be loaded with either @samp{@@load} or the @option{-l} option. +that can be loaded with either @code{@@load} or the @option{-l} option. @node Obsolete @section Obsolete Options and/or Features @@ -4470,8 +4515,8 @@ A regular expression can be used as a pattern by enclosing it in slashes. Then the regular expression is tested against the entire text of each record. (Normally, it only needs to match some part of the text in order to succeed.) For example, the -following prints the second field of each record that contains the string -@samp{li} anywhere in it: +following prints the second field of each record where the string +@samp{li} appears anywhere in the record: @example $ @kbd{awk '/li/ @{ print $2 @}' mail-list} @@ -4601,7 +4646,7 @@ A literal backslash, @samp{\}. @cindex backslash (@code{\}), @code{\a} escape sequence @item \a The ``alert'' character, @kbd{Ctrl-g}, ASCII code 7 (BEL). -(This usually makes some sort of audible noise.) +(This often makes some sort of audible noise.) @cindex @code{\} (backslash), @code{\b} escape sequence @cindex backslash (@code{\}), @code{\b} escape sequence @@ -4872,10 +4917,11 @@ the very first step in processing regexps. Here is a list of metacharacters. All characters that are not escape sequences and that are not listed in the table stand for themselves: -@table @code +@c Use @asis so the docbook comes out ok. Sigh. 
+@table @asis @cindex backslash (@code{\}), regexp operator @cindex @code{\} (backslash), regexp operator -@item \ +@item @code{\} This is used to suppress the special meaning of a character when matching. For example, @samp{\$} matches the character @samp{$}. @@ -4884,7 +4930,7 @@ matches the character @samp{$}. @cindex Texinfo, chapter beginnings in files @cindex @code{^} (caret), regexp operator @cindex caret (@code{^}), regexp operator -@item ^ +@item @code{^} This matches the beginning of a string. For example, @samp{^@@chapter} matches @samp{@@chapter} at the beginning of a string and can be used to identify chapter beginnings in Texinfo source files. @@ -4892,7 +4938,7 @@ The @samp{^} is known as an @dfn{anchor}, because it anchors the pattern to match only at the beginning of the string. It is important to realize that @samp{^} does not match the beginning of -a line embedded in a string. +a line (the point right after a @samp{\n} newline character) embedded in a string. The condition is not true in the following example: @example @@ -4901,11 +4947,13 @@ if ("line1\nLINE 2" ~ /^L/) @dots{} @cindex @code{$} (dollar sign), regexp operator @cindex dollar sign (@code{$}), regexp operator -@item $ +@item @code{$} This is similar to @samp{^}, but it matches only at the end of a string. For example, @samp{p$} matches a record that ends with a @samp{p}. The @samp{$} is an anchor -and does not match the end of a line embedded in a string. +and does not match the end of a line +(the point right before a @samp{\n} newline character) +embedded in a string. The condition in the following example is not true: @example @@ -4914,7 +4962,7 @@ if ("line1\nLINE 2" ~ /1$/) @dots{} @cindex @code{.} (period), regexp operator @cindex period (@code{.}), regexp operator -@item . @r{(period)} +@item @code{.} (period) This matches any single character, @emph{including} the newline character. For example, @samp{.P} matches any single character followed by a @samp{P} in a string. 
Using @@ -4935,7 +4983,7 @@ may not be able to match the @sc{nul} character. @cindex character sets, See Also bracket expressions @cindex character lists, See bracket expressions @cindex character classes, See bracket expressions -@item [@dots{}] +@item @code{[}@dots{}@code{]} This is called a @dfn{bracket expression}.@footnote{In other literature, you may see a bracket expression referred to as either a @dfn{character set}, a @dfn{character class}, or a @dfn{character list}.} @@ -4947,7 +4995,7 @@ is given in @ref{Bracket Expressions}. @cindex bracket expressions, complemented -@item [^ @dots{}] +@item @code{[^}@dots{}@code{]} This is a @dfn{complemented bracket expression}. The first character after the @samp{[} @emph{must} be a @samp{^}. It matches any characters @emph{except} those in the square brackets. For example, @samp{[^awk]} @@ -4956,7 +5004,7 @@ or @samp{k}. @cindex @code{|} (vertical bar) @cindex vertical bar (@code{|}) -@item | +@item @code{|} This is the @dfn{alternation operator} and it is used to specify alternatives. The @samp{|} has the lowest precedence of all the regular @@ -4969,7 +5017,7 @@ The alternation applies to the largest possible regexps on either side. @cindex @code{()} (parentheses), regexp operator @cindex parentheses @code{()}, regexp operator -@item (@dots{}) +@item @code{(}@dots{}@code{)} Parentheses are used for grouping in regular expressions, as in arithmetic. They can be used to concatenate regular expressions containing the alternation operator, @samp{|}. For example, @@ -4980,7 +5028,7 @@ explained further on in this list.) @cindex @code{*} (asterisk), @code{*} operator, as regexp operator @cindex asterisk (@code{*}), @code{*} operator, as regexp operator -@item * +@item @code{*} This symbol means that the preceding regular expression should be repeated as many times as necessary to find a match. 
For example, @samp{ph*} applies the @samp{*} symbol to the preceding @samp{h} and looks for matches @@ -4998,11 +5046,11 @@ with backslashes. @cindex @code{+} (plus sign), regexp operator @cindex plus sign (@code{+}), regexp operator -@item + +@item @code{+} This symbol is similar to @samp{*}, except that the preceding expression must be matched at least once. This means that @samp{wh+y} would match @samp{why} and @samp{whhy}, but not @samp{wy}, whereas -@samp{wh*y} would match all three of these strings. +@samp{wh*y} would match all three. The following is a simpler way of writing the last @samp{*} example: @@ -5012,15 +5060,15 @@ awk '/\(c[ad]+r x\)/ @{ print @}' sample @cindex @code{?} (question mark), regexp operator @cindex question mark (@code{?}), regexp operator -@item ? +@item @code{?} This symbol is similar to @samp{*}, except that the preceding expression can be matched either once or not at all. For example, @samp{fe?d} matches @samp{fed} and @samp{fd}, but nothing else. @cindex interval expressions, regexp operator -@item @{@var{n}@} -@itemx @{@var{n},@} -@itemx @{@var{n},@var{m}@} +@item @code{@{}@var{n}@code{@}} +@itemx @code{@{}@var{n}@code{,@}} +@itemx @code{@{}@var{n}@code{,}@var{m}@code{@}} One or two numbers inside braces denote an @dfn{interval expression}. If there is one number in the braces, the preceding regexp is repeated @var{n} times. @@ -5443,10 +5491,12 @@ This works in any POSIX-compliant @command{awk}. Another method, specific to @command{gawk}, is to set the variable @code{IGNORECASE} to a nonzero value (@pxref{Built-in Variables}). When @code{IGNORECASE} is not zero, @emph{all} regexp and string -operations ignore case. Changing the value of -@code{IGNORECASE} dynamically controls the case-sensitivity of the -program as it runs. Case is significant by default because -@code{IGNORECASE} (like most variables) is initialized to zero: +operations ignore case. 
+ +Changing the value of @code{IGNORECASE} dynamically controls the +case-sensitivity of the program as it runs. Case is significant by +default because @code{IGNORECASE} (like most variables) is initialized +to zero: @example x = "aB" @@ -5476,9 +5526,6 @@ case-sensitivity on or off for all the rules at once. Setting @code{IGNORECASE} from the command line is a way to make a program case-insensitive without having to edit it. -Both regexp and string comparison -operations are affected by @code{IGNORECASE}. - @c @cindex ISO 8859-1 @c @cindex ISO Latin-1 In multibyte locales, @@ -5556,7 +5603,7 @@ regexp constant (i.e., a string of characters between slashes). It may be any expression. The expression is evaluated and converted to a string if necessary; the contents of the string are then used as the regexp. A regexp computed in this way is called a @dfn{dynamic -regexp}: +regexp} or a @dfn{computed regexp}: @example BEGIN @{ digits_regexp = "[[:digit:]]+" @} @@ -5630,7 +5677,7 @@ intend a regexp match. @cindex regular expressions, dynamic, with embedded newlines @cindex newlines, in dynamic regexps -Some commercial versions of @command{awk} do not allow the newline +Some versions of @command{awk} do not allow the newline character to be used inside a bracket expression for a dynamic regexp: @example @@ -5668,7 +5715,7 @@ occur often in practice, but it's worth noting for future reference. @cindex regular expressions, dynamic, with embedded newlines @cindex newlines, in dynamic regexps -Some commercial versions of @command{awk} do not allow the newline +Some versions of @command{awk} do not allow the newline character to be used inside a bracket expression for a dynamic regexp: @example @@ -6652,7 +6699,7 @@ $ @kbd{echo ' a b c d ' | awk 'BEGIN @{ FS = "[ \t\n]+" @}} @cindex null strings @cindex strings, null @cindex empty strings, See null strings -In this case, the first field is @dfn{null} or empty. +In this case, the first field is null, or empty. 
The stripping of leading and trailing whitespace also comes into play whenever @code{$0} is recomputed. For instance, study this pipeline: @@ -7738,19 +7785,19 @@ Such a record is replaced by the contents of the file Note here how the name of the extra input file is not built into the program; it is taken directly from the data, specifically from the second field on -the @samp{@@include} line. +the @code{@@include} line. The @code{close()} function is called to ensure that if two identical -@samp{@@include} lines appear in the input, the entire specified file is +@code{@@include} lines appear in the input, the entire specified file is included twice. @xref{Close Files And Pipes}. One deficiency of this program is that it does not process nested -@samp{@@include} statements -(i.e., @samp{@@include} statements in included files) +@code{@@include} statements +(i.e., @code{@@include} statements in included files) the way a true macro preprocessor would. @xref{Igawk Program}, for a program -that does handle nested @samp{@@include} statements. +that does handle nested @code{@@include} statements. @node Getline/Pipe @subsection Using @code{getline} from a Pipe @@ -8566,8 +8613,9 @@ of value to print. The rest of the format specifier is made up of optional @dfn{modifiers} that control @emph{how} to print the value, such as the field width. Here is a list of the format-control letters: -@table @code -@item %c +@c @asis for docbook to come out right +@table @asis +@item @code{%c} Print a number as an ASCII character; thus, @samp{printf "%c", 65} outputs the letter @samp{A}. The output for a string value is the first character of the string. @@ -8599,12 +8647,12 @@ a single byte (0--255). @end quotation -@item %d@r{,} %i +@item @code{%d}, @code{%i} Print a decimal integer. The two control letters are equivalent. (The @samp{%i} specification is for compatibility with ISO C.) 
-@item %e@r{,} %E +@item @code{%e}, @code{%E} Print a number in scientific (exponential) notation; for example: @@ -8619,7 +8667,7 @@ which follow the decimal point. discussed in the next @value{SUBSECTION}.) @samp{%E} uses @samp{E} instead of @samp{e} in the output. -@item %f +@item @code{%f} Print a number in floating-point notation. For example: @@ -8641,37 +8689,37 @@ and positive infinity as @samp{inf} and @samp{infinity}. The special ``not a number'' value formats as @samp{-nan} or @samp{nan}. -@item %F +@item @code{%F} Like @samp{%f} but the infinity and ``not a number'' values are spelled using uppercase letters. The @samp{%F} format is a POSIX extension to ISO C; not all systems support it. On those that don't, @command{gawk} uses @samp{%f} instead. -@item %g@r{,} %G +@item @code{%g}, @code{%G} Print a number in either scientific notation or in floating-point notation, whichever uses fewer characters; if the result is printed in scientific notation, @samp{%G} uses @samp{E} instead of @samp{e}. -@item %o +@item @code{%o} Print an unsigned octal integer (@pxref{Nondecimal-numbers}). -@item %s +@item @code{%s} Print a string. -@item %u +@item @code{%u} Print an unsigned decimal integer. (This format is of marginal use, because all numbers in @command{awk} are floating-point; it is provided primarily for compatibility with C.) -@item %x@r{,} %X +@item @code{%x}, @code{%X} Print an unsigned hexadecimal integer; @samp{%X} uses the letters @samp{A} through @samp{F} instead of @samp{a} through @samp{f} (@pxref{Nondecimal-numbers}). -@item %% +@item @code{%%} Print a single @samp{%}. This does not consume an argument and it ignores any modifiers. 
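The format-control letters touched by the hunk above can be compared side by side with a short sketch. This is an illustration only, not part of the patch: the function name `printf_letters_demo` is hypothetical, the sample values are chosen for the demonstration, and only letters that behave the same in any POSIX-style @command{awk} are used (@samp{%F}, @samp{%g}, @samp{%i}, and @samp{%u} are left out).

```shell
# Hypothetical helper (not from the manual or the patch): prints one line
# per format-control letter so the conversions can be compared directly.
printf_letters_demo() {
    awk 'BEGIN {
        printf "%c\n", 65            # a number printed as a character
        printf "%d\n", 17.9          # decimal integer; the fraction is dropped
        printf "%e\n", 1950          # scientific (exponential) notation
        printf "%o\n", 8             # unsigned octal integer
        printf "%x %X\n", 255, 255   # hexadecimal, lowercase and uppercase digits
        printf "%s\n", "awk"         # a string
        printf "%%\n"                # a literal percent sign; consumes no argument
    }'
}

printf_letters_demo
```

Wrapping the program in a shell function keeps the whole @command{awk} program inside single quotes, following the shell quoting rules described earlier in the manual, so the shell leaves the @samp{%} and @samp{"} characters alone.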
@@ -11908,28 +11956,28 @@ expression because the first @samp{$} has higher precedence than the This table presents @command{awk}'s operators, in order of highest to lowest precedence: -@c use @code in the items, looks better in TeX w/o all the quotes -@table @code -@item (@dots{}) +@c @asis for docbook to come out right +@table @asis +@item @code{(}@dots{}@code{)} Grouping. @cindex @code{$} (dollar sign), @code{$} field operator @cindex dollar sign (@code{$}), @code{$} field operator -@item $ +@item @code{$} Field reference. @cindex @code{+} (plus sign), @code{++} operator @cindex plus sign (@code{+}), @code{++} operator @cindex @code{-} (hyphen), @code{--} operator @cindex hyphen (@code{-}), @code{--} operator -@item ++ -- +@item @code{++ --} Increment, decrement. @cindex @code{^} (caret), @code{^} operator @cindex caret (@code{^}), @code{^} operator @cindex @code{*} (asterisk), @code{**} operator @cindex asterisk (@code{*}), @code{**} operator -@item ^ ** +@item @code{^ **} Exponentiation. These operators group right-to-left. @cindex @code{+} (plus sign), @code{+} operator @@ -11938,7 +11986,7 @@ Exponentiation. These operators group right-to-left. @cindex hyphen (@code{-}), @code{-} operator @cindex @code{!} (exclamation point), @code{!} operator @cindex exclamation point (@code{!}), @code{!} operator -@item + - ! +@item @code{+ - !} Unary plus, minus, logical ``not.'' @cindex @code{*} (asterisk), @code{*} operator, as multiplication operator @@ -11947,17 +11995,17 @@ Unary plus, minus, logical ``not.'' @cindex forward slash (@code{/}), @code{/} operator @cindex @code{%} (percent sign), @code{%} operator @cindex percent sign (@code{%}), @code{%} operator -@item * / % +@item @code{* / %} Multiplication, division, remainder. @cindex @code{+} (plus sign), @code{+} operator @cindex plus sign (@code{+}), @code{+} operator @cindex @code{-} (hyphen), @code{-} operator @cindex hyphen (@code{-}), @code{-} operator -@item + - +@item @code{+ -} Addition, subtraction. 
-@item @r{String Concatenation} +@item String Concatenation There is no special symbol for concatenation. The operands are simply written side by side (@pxref{Concatenation}). @@ -11983,7 +12031,7 @@ The operands are simply written side by side @cindex @code{|} (vertical bar), @code{|&} operator (I/O) @cindex vertical bar (@code{|}), @code{|&} operator (I/O) @cindex operators, input/output -@item < <= == != > >= >> | |& +@item @code{< <= == != > >= >> | |&} Relational and redirection. The relational operators and the redirections have the same precedence level. Characters such as @samp{>} serve both as relationals and as @@ -12004,26 +12052,26 @@ The correct way to write this statement is @samp{print foo > (a ? b : c)}. @cindex tilde (@code{~}), @code{~} operator @cindex @code{!} (exclamation point), @code{!~} operator @cindex exclamation point (@code{!}), @code{!~} operator -@item ~ !~ +@item @code{~ !~} Matching, nonmatching. @cindex @code{in} operator -@item in +@item @code{in} Array membership. @cindex @code{&} (ampersand), @code{&&} operator @cindex ampersand (@code{&}), @code{&&} operator -@item && +@item @code{&&} Logical ``and''. @cindex @code{|} (vertical bar), @code{||} operator @cindex vertical bar (@code{|}), @code{||} operator -@item || +@item @code{||} Logical ``or''. @cindex @code{?} (question mark), @code{?:} operator @cindex question mark (@code{?}), @code{?:} operator -@item ?: +@item @code{?:} Conditional. This operator groups right-to-left. @cindex @code{+} (plus sign), @code{+=} operator @@ -12040,7 +12088,7 @@ Conditional. This operator groups right-to-left. @cindex percent sign (@code{%}), @code{%=} operator @cindex @code{^} (caret), @code{^=} operator @cindex caret (@code{^}), @code{^=} operator -@item = += -= *= /= %= ^= **= +@item @code{= += -= *= /= %= ^= **=} Assignment. These operators group right-to-left. @end table @@ -13804,11 +13852,12 @@ sets automatically on certain occasions in order to provide information to your program. 
The variables that are specific to @command{gawk} are marked with a pound sign@w{ (@samp{#}).} -@table @code +@c @asis for docbook +@table @asis @cindex @code{ARGC}/@code{ARGV} variables @cindex arguments, command-line @cindex command line, arguments -@item ARGC@r{,} ARGV +@item @code{ARGC}, @code{ARGV} The command-line arguments available to @command{awk} programs are stored in an array called @code{ARGV}. @code{ARGC} is the number of command-line arguments present. @xref{Other Arguments}. @@ -13848,7 +13897,7 @@ about how @command{awk} uses these variables. @cindex @code{ARGIND} variable @cindex differences in @command{awk} and @command{gawk}, @code{ARGIND} variable -@item ARGIND # +@item @code{ARGIND} # The index in @code{ARGV} of the current file being processed. Every time @command{gawk} opens a new data file for processing, it sets @code{ARGIND} to the index in @code{ARGV} of the file name. @@ -13873,7 +13922,7 @@ it is not special. @cindex @code{ENVIRON} array @cindex environment variables, in @code{ENVIRON} array -@item ENVIRON +@item @code{ENVIRON} An associative array containing the values of the environment. The array indices are the environment variable names; the elements are the values of the particular environment variables. For example, @@ -13893,7 +13942,7 @@ On such systems, the @code{ENVIRON} array is empty (except for @cindex @code{ERRNO} variable @cindex differences in @command{awk} and @command{gawk}, @code{ERRNO} variable @cindex error handling, @code{ERRNO} variable and -@item ERRNO # +@item @code{ERRNO} # If a system error occurs during a redirection for @code{getline}, during a read for @code{getline}, or during a @code{close()} operation, then @code{ERRNO} contains a string describing the error. @@ -13920,7 +13969,7 @@ it is not special. @cindex @code{FILENAME} variable @cindex dark corner, @code{FILENAME} variable -@item FILENAME +@item @code{FILENAME} The name of the file that @command{awk} is currently reading. 
When no data files are listed on the command line, @command{awk} reads from the standard input and @code{FILENAME} is set to @code{"-"}. @@ -13939,14 +13988,14 @@ inside a @code{BEGIN} rule can give @code{FILENAME} a value. @cindex @code{FNR} variable -@item FNR +@item @code{FNR} The current record number in the current file. @code{FNR} is incremented each time a new record is read (@pxref{Records}). It is reinitialized to zero each time a new input file is started. @cindex @code{NF} variable -@item NF +@item @code{NF} The number of fields in the current input record. @code{NF} is set each time a new record is read, when a new field is created or when @code{$0} changes (@pxref{Fields}). @@ -13960,7 +14009,7 @@ current record. @xref{Changing Fields}. @cindex @code{FUNCTAB} array @cindex @command{gawk}, @code{FUNCTAB} array in @cindex differences in @command{awk} and @command{gawk}, @code{FUNCTAB} variable -@item FUNCTAB # +@item @code{FUNCTAB} # An array whose indices and corresponding values are the names of all the user-defined or extension functions in the program. @@ -13971,7 +14020,7 @@ the @code{FUNCTAB} array will also cause a fatal error. @end quotation @cindex @code{NR} variable -@item NR +@item @code{NR} The number of input records @command{awk} has processed since the beginning of the program's execution (@pxref{Records}). @@ -13980,7 +14029,7 @@ the beginning of the program's execution @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array @cindex differences in @command{awk} and @command{gawk}, @code{PROCINFO} array -@item PROCINFO # +@item @code{PROCINFO} # The elements of this array provide access to information about the running @command{awk} program. The following elements (listed alphabetically) @@ -14137,7 +14186,7 @@ or if @command{gawk} is in compatibility mode it is not special. 
@cindex @code{RLENGTH} variable -@item RLENGTH +@item @code{RLENGTH} The length of the substring matched by the @code{match()} function (@pxref{String Functions}). @@ -14145,7 +14194,7 @@ The length of the substring matched by the is the length of the matched string, or @minus{}1 if no match is found. @cindex @code{RSTART} variable -@item RSTART +@item @code{RSTART} The start-index in characters of the substring that is matched by the @code{match()} function (@pxref{String Functions}). @@ -14156,7 +14205,7 @@ if no match was found. @cindex @command{gawk}, @code{RT} variable in @cindex @code{RT} variable @cindex differences in @command{awk} and @command{gawk}, @code{RT} variable -@item RT # +@item @code{RT} # This is set each time a record is read. It contains the input text that matched the text denoted by @code{RS}, the record separator. @@ -14169,7 +14218,7 @@ it is not special. @cindex @command{gawk}, @code{SYMTAB} array in @cindex @code{SYMTAB} array @cindex differences in @command{awk} and @command{gawk}, @code{SYMTAB} variable -@item SYMTAB # +@item @code{SYMTAB} # An array whose indices are the names of all currently defined global variables and arrays in the program. The array may be used for indirect access to read or write the value of a variable: @@ -15692,26 +15741,27 @@ The following list describes all of the built-in functions that work with numbers. Optional parameters are enclosed in square brackets@w{ ([ ]):} -@table @code -@item atan2(@var{y}, @var{x}) +@c @asis for docbook +@table @asis +@item @code{atan2(@var{y}, @var{x})} @cindexawkfunc{atan2} @cindex arctangent Return the arctangent of @code{@var{y} / @var{x}} in radians. You can use @samp{pi = atan2(0, -1)} to retrieve the value of @value{PI}. -@item cos(@var{x}) +@item @code{cos(@var{x})} @cindexawkfunc{cos} @cindex cosine Return the cosine of @var{x}, with @var{x} in radians. 
-@item exp(@var{x}) +@item @code{exp(@var{x})} @cindexawkfunc{exp} @cindex exponent Return the exponential of @var{x} (@code{e ^ @var{x}}) or report an error if @var{x} is out of range. The range of values @var{x} can have depends on your machine's floating-point representation. -@item int(@var{x}) +@item @code{int(@var{x})} @cindexawkfunc{int} @cindex round to nearest integer Return the nearest integer to @var{x}, located between @var{x} and zero and @@ -15720,13 +15770,13 @@ truncated toward zero. For example, @code{int(3)} is 3, @code{int(3.9)} is 3, @code{int(-3.9)} is @minus{}3, and @code{int(-3)} is @minus{}3 as well. -@item log(@var{x}) +@item @code{log(@var{x})} @cindexawkfunc{log} @cindex logarithm Return the natural logarithm of @var{x}, if @var{x} is positive; otherwise, report an error. -@item rand() +@item @code{rand()} @cindexawkfunc{rand} @cindex random numbers, @code{rand()}/@code{srand()} functions Return a random number. The values of @code{rand()} are @@ -15784,19 +15834,19 @@ the seed to a value that is different in each run. To do this, use @code{srand()}. @end quotation -@item sin(@var{x}) +@item @code{sin(@var{x})} @cindexawkfunc{sin} @cindex sine Return the sine of @var{x}, with @var{x} in radians. -@item sqrt(@var{x}) +@item @code{sqrt(@var{x})} @cindexawkfunc{sqrt} @cindex square root Return the positive square root of @var{x}. @command{gawk} prints a warning message if @var{x} is negative. Thus, @code{sqrt(4)} is 2. -@item srand(@r{[}@var{x}@r{]}) +@item @code{srand(}[@var{x}]@code{)} @cindexawkfunc{srand} Set the starting point, or seed, for generating random numbers to the value @var{x}. @@ -15853,9 +15903,10 @@ pound sign@w{ (@samp{#}):} @code{gensub()}. 
@end menu -@table @code -@item asort(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) # -@itemx asorti(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) # +@c @asis for docbook +@table @asis +@item @code{asort(}@var{source} [@code{,} @var{dest} [@code{,} @var{how} ] ]@code{)} # +@itemx @code{asorti(}@var{source} [@code{,} @var{dest} [@code{,} @var{how} ] ]@code{)} # @cindexgawkfunc{asorti} @cindex sort array @cindex arrays, elements, retrieving number of @@ -15922,7 +15973,7 @@ a[3] = "middle" @code{asort()} and @code{asorti()} are @command{gawk} extensions; they are not available in compatibility mode (@pxref{Options}). -@item gensub(@var{regexp}, @var{replacement}, @var{how} @r{[}, @var{target}@r{]}) # +@item @code{gensub(@var{regexp}, @var{replacement}, @var{how}} [@code{, @var{target}}]@code{)} # @cindexgawkfunc{gensub} @cindex search and replace in strings @cindex substitute in string @@ -15987,7 +16038,7 @@ is the original unchanged value of @var{target}. @code{gensub()} is a @command{gawk} extension; it is not available in compatibility mode (@pxref{Options}). -@item gsub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]}) +@item @code{gsub(@var{regexp}, @var{replacement}} [@code{, @var{target}}]@code{)} @cindexawkfunc{gsub} Search @var{target} for @emph{all} of the longest, leftmost, @emph{nonoverlapping} matching @@ -16009,7 +16060,7 @@ omitted, then the entire input record (@code{$0}) is used. As in @code{sub()}, the characters @samp{&} and @samp{\} are special, and the third argument must be assignable. -@item index(@var{in}, @var{find}) +@item @code{index(@var{in}, @var{find})} @cindexawkfunc{index} @cindex search in string @cindex find substring in string @@ -16028,7 +16079,7 @@ If @var{find} is not found, @code{index()} returns zero. It is a fatal error to use a regexp constant for @var{find}. 
-@item length(@r{[}@var{string}@r{]}) +@item @code{length(}[@var{string}]@code{)} @cindexawkfunc{length} @cindex string length @cindex length of string @@ -16093,7 +16144,7 @@ If @option{--lint} is provided on the command line If @option{--posix} is supplied, using an array argument is a fatal error (@pxref{Arrays}). -@item match(@var{string}, @var{regexp} @r{[}, @var{array}@r{]}) +@item @code{match(@var{string}, @var{regexp}} [@code{, @var{array}}]@code{)} @cindexawkfunc{match} @cindex string, regular expression match @cindex match regexp in string @@ -16210,7 +16261,7 @@ The @var{array} argument to @code{match()} is a (@pxref{Options}), using a third argument is a fatal error. -@item patsplit(@var{string}, @var{array} @r{[}, @var{fieldpat} @r{[}, @var{seps} @r{]} @r{]}) # +@item @code{patsplit(@var{string}, @var{array}} [@code{, @var{fieldpat}} [@code{, @var{seps}} ] ]@code{)} # @cindexgawkfunc{patsplit} @cindex split string into array Divide @@ -16242,7 +16293,7 @@ The @code{patsplit()} function is a (@pxref{Options}), it is not available. -@item split(@var{string}, @var{array} @r{[}, @var{fieldsep} @r{[}, @var{seps} @r{]} @r{]}) +@item @code{split(@var{string}, @var{array}} [@code{, @var{fieldsep}} [@code{, @var{seps}} ] ]@code{)} @cindexawkfunc{split} Divide @var{string} into pieces separated by @var{fieldsep} and store the pieces in @var{array} and the separator strings in the @@ -16327,7 +16378,7 @@ If @var{string} does not match @var{fieldsep} at all (but is not null), @var{array} has one element only. The value of that element is the original @var{string}. -@item sprintf(@var{format}, @var{expression1}, @dots{}) +@item @code{sprintf(@var{format}, @var{expression1}, @dots{})} @cindexawkfunc{sprintf} @cindex formatting strings Return (without printing) the string that @code{printf} would @@ -16344,7 +16395,7 @@ assigns the string @w{@samp{pi = 3.14 (approx.)}} to the variable @code{pival}. 
@cindexgawkfunc{strtonum} @cindex convert string to number -@item strtonum(@var{str}) # +@item @code{strtonum(@var{str})} # Examine @var{str} and return its numeric value. If @var{str} begins with a leading @samp{0}, @code{strtonum()} assumes that @var{str} is an octal number. If @var{str} begins with a leading @samp{0x} or @@ -16369,7 +16420,7 @@ for recognizing numbers (@pxref{Locales}). @code{strtonum()} is a @command{gawk} extension; it is not available in compatibility mode (@pxref{Options}). -@item sub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]}) +@item @code{sub(@var{regexp}, @var{replacement}} [@code{, @var{target}}]@code{)} @cindexawkfunc{sub} @cindex replace in string Search @var{target}, which is treated as a string, for the @@ -16470,7 +16521,7 @@ will not run. Finally, if the @var{regexp} is not a regexp constant, it is converted into a string, and then the value of that string is treated as the regexp to match. -@item substr(@var{string}, @var{start} @r{[}, @var{length}@r{]}) +@item @code{substr(@var{string}, @var{start}} [@code{, @var{length}} ]@code{)} @cindexawkfunc{substr} @cindex substring Return a @var{length}-character-long substring of @var{string}, @@ -16530,7 +16581,7 @@ string = substr(string, 1, 2) "CDE" substr(string, 6) @cindex case sensitivity, converting case @cindex strings, converting letter case -@item tolower(@var{string}) +@item @code{tolower(@var{string})} @cindexawkfunc{tolower} @cindex convert string to lower case Return a copy of @var{string}, with each uppercase character @@ -16538,7 +16589,7 @@ in the string replaced with its corresponding lowercase character. Nonalphabetic characters are left unchanged. For example, @code{tolower("MiXeD cAsE 123")} returns @code{"mixed case 123"}. 
-@item toupper(@var{string}) +@item @code{toupper(@var{string})} @cindexawkfunc{toupper} @cindex convert string to upper case Return a copy of @var{string}, with each lowercase character @@ -16971,8 +17022,8 @@ Although this makes a certain amount of sense, it can be surprising. The following functions relate to input/output (I/O). Optional parameters are enclosed in square brackets ([ ]): -@table @code -@item close(@var{filename} @r{[}, @var{how}@r{]}) +@table @asis +@item @code{close(}@var{filename} [@code{,} @var{how}]@code{)} @cindexawkfunc{close} @cindex files, closing @cindex close file or coprocess @@ -16991,7 +17042,7 @@ not matter. @xref{Two-way I/O}, which discusses this feature in more detail and gives an example. -@item fflush(@r{[}@var{filename}@r{]}) +@item @code{fflush(}[@var{filename}]@code{)} @cindexawkfunc{fflush} @cindex flush buffered output Flush any buffered output associated with @var{filename}, which is either a @@ -17051,7 +17102,7 @@ a file or pipe that was opened for reading (such as with @code{getline}), or if @var{filename} is not an open file, pipe, or coprocess. In such a case, @code{fflush()} returns @minus{}1, as well. -@item system(@var{command}) +@item @code{system(@var{command})} @cindexawkfunc{system} @cindex invoke shell command @cindex interacting with other programs @@ -17407,7 +17458,7 @@ is out of range, @code{mktime()} returns @minus{}1. @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array -@item strftime(@r{[}@var{format} @r{[}, @var{timestamp} @r{[}, @var{utc-flag}@r{]]]}) +@item @code{strftime(} [@var{format} [@code{,} @var{timestamp} [@code{,} @var{utc-flag} ]]]@code{)} @c STARTOFRANGE strf @cindexgawkfunc{strftime} @cindex format time string @@ -17891,32 +17942,32 @@ bitwise operations just described. 
They are: @table @code @cindexgawkfunc{and} @cindex bitwise AND -@item and(@var{v1}, @var{v2} @r{[}, @r{@dots{}]}) +@item @code{and(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} Return the bitwise AND of the arguments. There must be at least two. @cindexgawkfunc{compl} @cindex bitwise complement -@item compl(@var{val}) +@item @code{compl(@var{val})} Return the bitwise complement of @var{val}. @cindexgawkfunc{lshift} @cindex left shift -@item lshift(@var{val}, @var{count}) +@item @code{lshift(@var{val}, @var{count})} Return the value of @var{val}, shifted left by @var{count} bits. @cindexgawkfunc{or} @cindex bitwise OR -@item or(@var{v1}, @var{v2} @r{[}, @r{@dots{}]}) +@item @code{or(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} Return the bitwise OR of the arguments. There must be at least two. @cindexgawkfunc{rshift} @cindex right shift -@item rshift(@var{val}, @var{count}) +@item @code{rshift(@var{val}, @var{count})} Return the value of @var{val}, shifted right by @var{count} bits. @cindexgawkfunc{xor} @cindex bitwise XOR -@item xor(@var{v1}, @var{v2} @r{[}, @r{@dots{}]}) +@item @code{xor(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} Return the bitwise XOR of the arguments. There must be at least two. @end table @@ -18080,7 +18131,7 @@ Optional parameters are enclosed in square brackets ([ ]): @table @code @cindexgawkfunc{bindtextdomain} @cindex set directory of message catalogs -@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]}) +@item @code{bindtextdomain(@var{directory}} [@code{,} @var{domain} ]@code{)} Set the directory in which @command{gawk} will look for message translation files, in case they will not or cannot be placed in the ``standard'' locations @@ -18094,14 +18145,14 @@ given @var{domain}. 
@cindexgawkfunc{dcgettext} @cindex translate string -@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcgettext(@var{string}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the translation of @var{string} in text domain @var{domain} for locale category @var{category}. The default value for @var{domain} is the current value of @code{TEXTDOMAIN}. The default value for @var{category} is @code{"LC_MESSAGES"}. @cindexgawkfunc{dcngettext} -@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcngettext(@var{string1}, @var{string2}, @var{number}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the plural form used for @var{number} of the translation of @var{string1} and @var{string2} in text domain @var{domain} for locale category @var{category}. @var{string1} is the @@ -19158,9 +19209,8 @@ for (i = 1; i <= n; i++) @part Part II:@* Problem Solving With @command{awk} @end iftex -@ignore @ifdocbook -@part Part II:@* Problem Solving With @command{awk} +@part Problem Solving With @command{awk} Part II shows how to use @command{awk} and @command{gawk} for problem solving. There is lots of code here for you to read and learn from. @@ -19174,7 +19224,6 @@ It contains the following chapters: @ref{Sample Programs}. @end itemize @end ifdocbook -@end ignore @node Library Functions @chapter A Library of @command{awk} Functions @@ -24534,7 +24583,7 @@ BEGIN @{ The following program, @file{igawk.sh}, provides this service. It simulates @command{gawk}'s searching of the @env{AWKPATH} variable and also allows @dfn{nested} includes; i.e., a file that is included -with @samp{@@include} can contain further @samp{@@include} statements. +with @code{@@include} can contain further @code{@@include} statements. @command{igawk} makes an effort to only include files once, so that nested includes don't accidentally include a library function twice. 
@@ -24572,7 +24621,7 @@ of the file included into the program at the correct point. @item Run an @command{awk} program (naturally) over the shell variable's contents to expand -@samp{@@include} statements. The expanded program is placed in a second +@code{@@include} statements. The expanded program is placed in a second shell variable. @item @@ -24592,24 +24641,25 @@ argument is @samp{debug}. The next part loops through all the command-line arguments. There are several cases of interest: -@table @code -@item -- +@c @asis for docbook +@table @asis +@item @option{--} This ends the arguments to @command{igawk}. Anything else should be passed on to the user's @command{awk} program without being evaluated. -@item -W +@item @option{-W} This indicates that the next option is specific to @command{gawk}. To make argument processing easier, the @option{-W} is appended to the front of the remaining arguments and the loop continues. (This is an @command{sh} programming trick. Don't worry about it if you are not familiar with @command{sh}.) -@item -v@r{,} -F +@item @option{-v}, @option{-F} These are saved and passed on to @command{gawk}. -@item -f@r{,} --file@r{,} --file=@r{,} -Wfile= +@item @option{-f}, @option{--file}, @option{--file=}, @option{-Wfile=} The file name is appended to the shell variable @code{program} with an -@samp{@@include} statement. +@code{@@include} statement. The @command{expr} utility is used to remove the leading option part of the argument (e.g., @samp{--file=}). (Typical @command{sh} usage would be to use the @command{echo} and @command{sed} @@ -24617,10 +24667,10 @@ utilities to do this work. Unfortunately, some versions of @command{echo} evalu escape sequences in their arguments, possibly mangling the program text. Using @command{expr} avoids this problem.) -@item --source@r{,} --source=@r{,} -Wsource= +@item @option{--source}, @option{--source=}, @option{-Wsource=} The source text is appended to @code{program}. 
-@item --version@r{,} -Wversion +@item @option{--version}, @option{-Wversion} @command{igawk} prints its version number, runs @samp{gawk --version} to get the @command{gawk} version information, and then exits. @end table @@ -24728,14 +24778,14 @@ fi @c endfile @end example -The @command{awk} program to process @samp{@@include} directives +The @command{awk} program to process @code{@@include} directives is stored in the shell variable @code{expand_prog}. Doing this keeps the shell script readable. The @command{awk} program reads through the user's program, one line at a time, using @code{getline} (@pxref{Getline}). The input -file names and @samp{@@include} statements are managed using a stack. -As each @samp{@@include} is encountered, the current file name is -``pushed'' onto the stack and the file named in the @samp{@@include} +file names and @code{@@include} statements are managed using a stack. +As each @code{@@include} is encountered, the current file name is +``pushed'' onto the stack and the file named in the @code{@@include} directive becomes the current file name. As each file is finished, the stack is ``popped,'' and the previous input file becomes the current input file again. The process is started by making the original file @@ -24811,8 +24861,8 @@ BEGIN @{ The stack is initialized with @code{ARGV[1]}, which will be @samp{/dev/stdin}. The main loop comes next. Input lines are read in succession. Lines that -do not start with @samp{@@include} are printed verbatim. -If the line does start with @samp{@@include}, the file name is in @code{$2}. +do not start with @code{@@include} are printed verbatim. +If the line does start with @code{@@include}, the file name is in @code{$2}. @code{pathto()} is called to generate the full path. If it cannot, then the program prints an error message and continues. 
@@ -24880,7 +24930,7 @@ It's done in these steps: @enumerate @item -Run @command{gawk} with the @samp{@@include}-processing program (the +Run @command{gawk} with the @code{@@include}-processing program (the value of the @code{expand_prog} shell variable) on standard input. @item @@ -24924,9 +24974,9 @@ There are four key simplifications that make the program work better: @itemize @bullet @item -Using @samp{@@include} even for the files named with @option{-f} makes building +Using @code{@@include} even for the files named with @option{-f} makes building the initial collected @command{awk} program much simpler; all the -@samp{@@include} processing can be done once. +@code{@@include} processing can be done once. @item Not trying to save the line read with @code{getline} @@ -24939,7 +24989,7 @@ considerably. @item Using a @code{getline} loop in the @code{BEGIN} rule does it all in one place. It is not necessary to call out to a separate loop for processing -nested @samp{@@include} statements. +nested @code{@@include} statements. @item Instead of saving the expanded program in a temporary file, putting it in a shell variable @@ -24959,7 +25009,7 @@ Finally, @command{igawk} shows that it is not always necessary to add new features to a program; they can often be layered on top. @ignore With @command{igawk}, -there is no real reason to build @samp{@@include} processing into +there is no real reason to build @code{@@include} processing into @command{gawk} itself. @end ignore @@ -24988,8 +25038,8 @@ One user @c Karl Berry, karl@ileaf.com, 10/95 suggested that @command{gawk} be modified to automatically read these files upon startup. Instead, it would be very simple to modify @command{igawk} -to do this. Since @command{igawk} can process nested @samp{@@include} -directives, @file{default.awk} could simply contain @samp{@@include} +to do this. 
Since @command{igawk} can process nested @code{@@include} +directives, @file{default.awk} could simply contain @code{@@include} statements for the desired library functions. @c Exercise: make this change @@ -25244,10 +25294,8 @@ BEGIN { @part Part III:@* Moving Beyond Standard @command{awk} With @command{gawk} @end iftex -@ignore @ifdocbook - -@part Part III:@* Moving Beyond Standard @command{awk} With @command{gawk} +@part Moving Beyond Standard @command{awk} With @command{gawk} Part III focuses on features specific to @command{gawk}. It contains the following chapters: @@ -25269,7 +25317,6 @@ It contains the following chapters: @ref{Dynamic Extensions}. @end itemize @end ifdocbook -@end ignore @node Advanced Features @chapter Advanced Features of @command{gawk} @@ -26665,7 +26712,7 @@ are candidates for translation at runtime. String constants without a leading underscore are not translated. @cindexgawkfunc{dcgettext} -@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcgettext(@var{string}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the translation of @var{string} in text domain @var{domain} for locale category @var{category}. The default value for @var{domain} is the current value of @code{TEXTDOMAIN}. @@ -26691,7 +26738,7 @@ default arguments. @end quotation @cindexgawkfunc{dcngettext} -@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcngettext(@var{string1}, @var{string2}, @var{number}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the plural form used for @var{number} of the translation of @var{string1} and @var{string2} in text domain @var{domain} for locale category @var{category}. 
@var{string1} is the @@ -26707,7 +26754,7 @@ The same remarks about argument order as for the @code{dcgettext()} function app @cindex message object files, specifying directory of @cindex files, message object, specifying directory of @cindexgawkfunc{bindtextdomain} -@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]}) +@item @code{bindtextdomain(@var{directory}} [@code{,} @var{domain} ]@code{)} Change the directory in which @code{gettext} looks for @file{.gmo} files, in case they will not or cannot be placed in the standard locations @@ -28211,38 +28258,39 @@ a new value to the named option. The available options are: @c nested table -@table @code -@item history_size +@c asis for docbook +@table @asis +@item @code{history_size} @cindex debugger history size The maximum number of lines to keep in the history file @file{./.gawk_history}. The default is 100. -@item listsize +@item @code{listsize} @cindex debugger default list amount The number of lines that @code{list} prints. The default is 15. -@item outfile +@item @code{outfile} @cindex redirect @command{gawk} output, in debugger Send @command{gawk} output to a file; debugger output still goes to standard output. An empty string (@code{""}) resets output to standard output. -@item prompt +@item @code{prompt} @cindex debugger prompt The debugger prompt. The default is @samp{@w{gawk> }}. -@item save_history @r{[}on @r{|} off@r{]} +@item @code{save_history} [@code{on} | @code{off}] @cindex debugger history file Save command history to file @file{./.gawk_history}. The default is @code{on}. -@item save_options @r{[}on @r{|} off@r{]} +@item @code{save_options} [@code{on} | @code{off}] @cindex save debugger options Save current options to file @file{./.gawkrc} upon exit. The default is @code{on}. Options are read back in to the next session upon startup. -@item trace @r{[}on @r{|} off@r{]} +@item @code{trace} [@code{on} | @code{off}] @cindex instruction tracing, in debugger Turn instruction tracing on or off. 
The default is @code{off}. @end table @@ -28396,7 +28444,7 @@ running a program, the debugger warns you if you accidentally type @cindex debugger commands, @code{trace} @cindex @code{trace} debugger command -@item @code{trace} @code{on} @r{|} @code{off} +@item @code{trace} [@code{on} | @code{off}] Turn on or off a continuous printing of instructions which are about to be executed, along with printing the @command{awk} line which they implement. The default is @code{off}. @@ -32198,8 +32246,9 @@ as described earlier (@pxref{Extension Functions}). It can then be looped over for multiple calls to @code{add_ext_func()}. +@c Use @var{OR} for docbook @item static awk_bool_t (*init_func)(void) = NULL; -@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @r{OR} +@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @var{OR} @itemx static awk_bool_t init_my_module(void) @{ @dots{} @} @itemx static awk_bool_t (*init_func)(void) = init_my_module; If you need to do some initialization work, you should define a @@ -32869,7 +32918,7 @@ BEGIN @{ @end example The @env{AWKLIBPATH} environment variable tells -@command{gawk} where to find shared libraries (@pxref{Finding Extensions}). +@command{gawk} where to find extensions (@pxref{Finding Extensions}). We set it to the current directory and run the program: @example @@ -32932,19 +32981,19 @@ Others mainly provide example code that shows how to use the extension API. The @code{filefuncs} extension provides three different functions, as follows: The usage is: -@table @code +@table @asis @item @@load "filefuncs" This is how you load the extension. @cindex @code{chdir()} extension function -@item result = chdir("/some/directory") +@item @code{result = chdir("/some/directory")} The @code{chdir()} function is a direct hook to the @code{chdir()} system call to change the current directory. It returns zero upon success or less than zero upon error. In the latter case it updates @code{ERRNO}. 
@cindex @code{stat()} extension function -@item result = stat("/some/path", statdata @r{[}, follow@r{]}) +@item @code{result = stat("/some/path", statdata} [@code{, follow}]@code{)} The @code{stat()} function provides a hook into the @code{stat()} system call. It returns zero upon success or less than zero upon error. @@ -33034,8 +33083,8 @@ Not all systems support all file types. @end multitable @cindex @code{fts()} extension function -@item flags = or(FTS_PHYSICAL, ...) -@itemx result = fts(pathlist, flags, filedata) +@item @code{flags = or(FTS_PHYSICAL, ...)} +@itemx @code{result = fts(pathlist, flags, filedata)} Walk the file trees provided in @code{pathlist} and fill in the @code{filedata} array as described below. @code{flags} is the bitwise OR of several predefined constant values, also described below. @@ -33659,14 +33708,19 @@ See the project's web site for more information. @part Part IV:@* Appendices @end iftex -@ignore @ifdocbook -@part Part IV:@* Appendices +@part Appendices -Part IV provides the appendices, the Glossary, and two licenses that cover -the @command{gawk} source code and this @value{DOCUMENT}, respectively. -It contains the following appendices: +@ifclear FOR_PRINT +Part IV contains the appendixes (including the two licenses that cover +the @command{gawk} source code and this @value{DOCUMENT}, respectively) +and the Glossary: +@end ifclear + +@ifset FOR_PRINT +Part IV contains two appendixes: +@end ifset @itemize @bullet @item @@ -33675,6 +33729,7 @@ It contains the following appendices: @item @ref{Installation}. +@ifclear FOR_PRINT @item @ref{Notes}. @@ -33689,24 +33744,23 @@ It contains the following appendices: @item @ref{GNU Free Documentation License}. +@end ifclear @end itemize @end ifdocbook -@end ignore @node Language History @appendix The Evolution of the @command{awk} Language -This @value{DOCUMENT} describes the GNU implementation of @command{awk}, which follows -the POSIX specification. 
-Many long-time @command{awk} users learned @command{awk} programming -with the original @command{awk} implementation in Version 7 Unix. -(This implementation was the basis for @command{awk} in Berkeley Unix, -through 4.3-Reno. Subsequent versions of Berkeley Unix, and some systems -derived from 4.4BSD-Lite, use various versions of @command{gawk} -for their @command{awk}.) -This @value{CHAPTER} briefly describes the -evolution of the @command{awk} language, with cross-references to other parts -of the @value{DOCUMENT} where you can find more information. +This @value{DOCUMENT} describes the GNU implementation of @command{awk}, +which follows the POSIX specification. Many long-time @command{awk} +users learned @command{awk} programming with the original @command{awk} +implementation in Version 7 Unix. (This implementation was the basis for +@command{awk} in Berkeley Unix, through 4.3-Reno. Subsequent versions +of Berkeley Unix, and some systems derived from 4.4BSD-Lite, use various +versions of @command{gawk} for their @command{awk}.) This @value{CHAPTER} +briefly describes the evolution of the @command{awk} language, with +cross-references to other parts of the @value{DOCUMENT} where you can +find more information. @menu * V7/SVR3.1:: The major changes between V7 and System V @@ -40144,4 +40198,4 @@ which sorta sucks. TODO: ----- -1. Empty string vs. null string. 30 occurrences vs. 77, respectively. +2. Add back in docbook fixes for @r{}. diff --git a/doc/gawktexi.in b/doc/gawktexi.in index 8e966ccd..440d641b 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -1954,8 +1954,9 @@ May, 2014 @part Part I:@* The @command{awk} Language @end iftex -@ignore @ifdocbook +@part The @command{awk} Language + Part I describes the @command{awk} language and @command{gawk} program in detail. It starts with the basics, and continues through all of the features of @command{awk}. Included also are many, but not all, @@ -1991,7 +1992,6 @@ following chapters: @ref{Functions}. 
@end itemize @end ifdocbook -@end ignore @node Getting Started @chapter Getting Started with @command{awk} @@ -2429,6 +2429,27 @@ knowledge of shell quoting rules. The following rules apply only to POSIX-compliant, Bourne-style shells (such as Bash, the GNU Bourne-Again Shell). If you use the C shell, you're on your own. +Before diving into the rules, we introduce a concept that appears +throughout this @value{DOCUMENT}, which is that of the @dfn{null}, +or empty, string. + +The null string is character data that has no value. +In other words, it is empty. It is written in @command{awk} programs +like this: @code{""}. In the shell, it can be written using single +or double quotes: @code{""} or @code{''}. While the null string has +no characters in it, it does exist. Consider this command: + +@example +$ @kbd{echo ""} +@end example + +@noindent +Here, the @command{echo} utility receives a single argument, even +though that argument has no characters in it. In the rest of this +@value{DOCUMENT}, we use the terms @dfn{null string} and @dfn{empty string} +interchangeably. Now, on to the quoting rules. + + @itemize @bullet @item Quoted items can be concatenated with nonquoted items as well as with other @@ -2604,6 +2625,7 @@ Although this @value{DOCUMENT} generally only worries about POSIX systems and th POSIX shell, the following issue arises often enough for many users that it is worth addressing. +@cindex Brink, Jeroen The ``shells'' on Microsoft Windows systems use the double-quote character for quoting, and make it difficult or impossible to include an escaped double-quote character in a command-line script. @@ -2617,7 +2639,6 @@ gawk "@{ print \"\042\" $0 \"\042\" @}" @var{file} @node Sample Data Files @section Data Files for the Examples -@c For gawk >= 4.0, update these data files. No-one has such slow modems! 
@cindex input files, examples @cindex @code{mail-list} file @@ -2774,7 +2795,7 @@ awk 'length($0) > 80' data @end example The sole rule has a relational expression as its pattern and it has no -action---so the default action, printing the record, is used. +action---so it uses the default action, printing the record. @cindex @command{expand} utility @item @@ -2857,9 +2878,9 @@ the program would print the odd-numbered lines. The @command{awk} utility reads the input files one line at a time. For each line, @command{awk} tries the patterns of each of the rules. -If several patterns match, then several actions are run in the order in +If several patterns match, then several actions execute in the order in which they appear in the @command{awk} program. If no patterns match, then -no actions are run. +no actions run. After processing all the rules that match the line (and perhaps there are none), @command{awk} reads the next line. (However, @@ -2951,8 +2972,8 @@ needed to produce this traditional-style output from @command{ls}.} The @samp{$6 == "Nov"} in our @command{awk} program is an expression that tests whether the sixth field of the output from @w{@samp{ls -l}} matches the string @samp{Nov}. Each time a line has the string -@samp{Nov} for its sixth field, the action @samp{sum += $5} is -performed. This adds the fifth field (the file's size) to the variable +@samp{Nov} for its sixth field, @command{awk} performs the action +@samp{sum += $5}. This adds the fifth field (the file's size) to the variable @code{sum}. As a result, when @command{awk} has finished reading all the input lines, @code{sum} is the total of the sizes of the files whose lines matched the pattern. (This works because @command{awk} variables @@ -3019,7 +3040,7 @@ We have generally not used backslash continuation in our sample programs. @command{gawk} places no limit on the length of a line, so backslash continuation is never strictly necessary; it just makes programs more readable. 
For this same reason, as well as -for clarity, we have kept most statements short in the sample programs +for clarity, we have kept most statements short in the programs presented throughout the @value{DOCUMENT}. Backslash continuation is most useful when your @command{awk} program is in a separate source file instead of entered from the command line. You should also note that @@ -3168,12 +3189,15 @@ that it has are much larger than they used to be. @cindex @command{awk} programs, complex If you find yourself writing @command{awk} scripts of more than, say, a few hundred lines, you might consider using a different programming -language. Emacs Lisp is a good choice if you need sophisticated string -or pattern matching capabilities. The shell is also good at string and +language. +The shell is good at string and pattern matching; in addition, it allows powerful use of the system utilities. More conventional languages, such as C, C++, and Java, offer better facilities for system programming and for managing the complexity -of large programs. Programs in these languages may require more lines +of large programs. +Python offers a nice balance between high-level ease of programming and +access to system facilities. +Programs in these languages may require more lines of source code than the equivalent @command{awk} programs, but they are easier to maintain and usually run more efficiently. @@ -3357,9 +3381,10 @@ program; see @ref{Getopt Function}. The following list describes @command{gawk}-specific options: -@table @code -@item -b -@itemx --characters-as-bytes +@c Have to use @asis here to get docbook to come out right. +@table @asis +@item @option{-b} +@itemx @option{--characters-as-bytes} @cindex @option{-b} option @cindex @option{--characters-as-bytes} option Cause @command{gawk} to treat all input data as single-byte characters. @@ -3367,14 +3392,14 @@ In addition, all output written with @code{print} or @code{printf} are treated as single-byte characters. 
Normally, @command{gawk} follows the POSIX standard and attempts to process -its input data according to the current locale. This can often involve +its input data according to the current locale (@pxref{Locales}). This can often involve converting multibyte characters into wide characters (internally), and can lead to problems or confusion if the input data does not contain valid multibyte characters. This option is an easy way to tell @command{gawk}: ``hands off my data!''. -@item -c -@itemx --traditional +@item @option{-c} +@itemx @option{--traditional} @cindex @option{-c} option @cindex @option{--traditional} option @cindex compatibility mode (@command{gawk}), specifying @@ -3385,15 +3410,15 @@ like Brian Kernighan's version @command{awk}. which summarizes the extensions. Also see @ref{Compatibility Mode}. -@item -C -@itemx --copyright +@item @option{-C} +@itemx @option{--copyright} @cindex @option{-C} option @cindex @option{--copyright} option @cindex GPL (General Public License), printing Print the short version of the General Public License and then exit. -@item -d@r{[}@var{file}@r{]} -@itemx --dump-variables@r{[}=@var{file}@r{]} +@item @option{-d}[@var{file}] +@itemx @option{--dump-variables}[@code{=}@var{file}] @cindex @option{-d} option @cindex @option{--dump-variables} option @cindex dump all variables of a program @@ -3415,8 +3440,8 @@ inadvertently use global variables that you meant to be local. (This is a particularly easy mistake to make with simple variable names like @code{i}, @code{j}, etc.) -@item -D@r{[}@var{file}@r{]} -@itemx --debug=@r{[}@var{file}@r{]} +@item @option{-D}[@var{file}] +@itemx @option{--debug}[@code{=}@var{file}] @cindex @option{-D} option @cindex @option{--debug} option @cindex @command{awk} debugging, enabling @@ -3428,8 +3453,8 @@ of commands for the debugger to execute non-interactively. No space is allowed between the @option{-D} and @var{file}, if @var{file} is supplied. 
-@item -e @var{program-text} -@itemx --source @var{program-text} +@item @option{-e} @var{program-text} +@itemx @option{--source} @var{program-text} @cindex @option{-e} option @cindex @option{--source} option @cindex source code, mixing @@ -3440,8 +3465,8 @@ This is particularly useful when you have library functions that you want to use from your command-line programs (@pxref{AWKPATH Variable}). -@item -E @var{file} -@itemx --exec @var{file} +@item @option{-E} @var{file} +@itemx @option{--exec} @var{file} @cindex @option{-E} option @cindex @option{--exec} option @cindex @command{awk} programs, location of @@ -3471,8 +3496,8 @@ with @samp{#!} scripts (@pxref{Executable Scripts}), like so: @var{awk program here @dots{}} @end example -@item -g -@itemx --gen-pot +@item @option{-g} +@itemx @option{--gen-pot} @cindex @option{-g} option @cindex @option{--gen-pot} option @cindex portable object files, generating @@ -3483,8 +3508,8 @@ output for all string constants that have been marked for translation. @xref{Internationalization}, for information about this option. -@item -h -@itemx --help +@item @option{-h} +@itemx @option{--help} @cindex @option{-h} option @cindex @option{--help} option @cindex GNU long options, printing list of @@ -3493,42 +3518,47 @@ for information about this option. Print a ``usage'' message summarizing the short and long style options that @command{gawk} accepts and then exit. -@item -i @var{source-file} -@itemx --include @var{source-file} +@item @option{-i} @var{source-file} +@itemx @option{--include} @var{source-file} @cindex @option{-i} option @cindex @option{--include} option @cindex @command{awk} programs, location of -Read @command{awk} source library from @var{source-file}. This option is -completely equivalent to using the @samp{@@include} directive inside -your program. This option is very -similar to the @option{-f} option, but there are two important differences. 
-First, when @option{-i} is used, the program source will not be loaded if it has -been previously loaded, whereas the @option{-f} will always load the file. +Read @command{awk} source library from @var{source-file}. This option +is completely equivalent to using the @code{@@include} directive inside +your program. This option is very similar to the @option{-f} option, +but there are two important differences. First, when @option{-i} is +used, the program source is not loaded if it has been previously +loaded, whereas with @option{-f}, @command{gawk} always loads the file. Second, because this option is intended to be used with code libraries, @command{gawk} does not recognize such files as constituting main program -input. Thus, after processing an @option{-i} argument, @command{gawk} still expects to -find the main source code via the @option{-f} option or on the command-line. +input. Thus, after processing an @option{-i} argument, @command{gawk} +still expects to find the main source code via the @option{-f} option +or on the command-line. -@item -l @var{lib} -@itemx --load @var{lib} +@item @option{-l} @var{ext} +@itemx @option{--load} @var{ext} @cindex @option{-l} option @cindex @option{--load} option -@cindex loading, library -Load a shared library @var{lib}. This searches for the library using the @env{AWKLIBPATH} +@cindex loading, extensions +Load a dynamic extension named @var{ext}. Extensions +are stored as system shared libraries. +This option searches for the library using the @env{AWKLIBPATH} environment variable. The correct library suffix for your platform will be -supplied by default, so it need not be specified in the library name. -The library initialization routine should be named @code{dl_load()}. -An alternative is to use the @samp{@@load} keyword inside the program to load -a shared library. +supplied by default, so it need not be specified in the extension name. +The extension initialization routine should be named @code{dl_load()}. 
+An alternative is to use the @code{@@load} keyword inside the program to load +a shared library. This feature is described in detail in @ref{Dynamic Extensions}. -@item -L @r{[}value@r{]} -@itemx --lint@r{[}=value@r{]} +@item @option{-L}[@var{value}] +@itemx @option{--lint}[@code{=}@var{value}] @cindex @option{-l} option @cindex @option{--lint} option @cindex lint checking, issuing warnings @cindex warnings, issuing Warn about constructs that are dubious or nonportable to other @command{awk} implementations. +No space is allowed between the @option{-D} and @var{value}, if +@var{value} is supplied. Some warnings are issued when @command{gawk} first reads your program. Others are issued at runtime, as your program executes. With an optional argument of @samp{fatal}, @@ -3544,16 +3574,16 @@ when eliminating problems pointed out by @option{--lint}, you should take care to search for all occurrences of each inappropriate construct. As @command{awk} programs are usually short, doing so is not burdensome. -@item -M -@itemx --bignum +@item @option{-M} +@itemx @option{--bignum} @cindex @option{-M} option @cindex @option{--bignum} option Force arbitrary precision arithmetic on numbers. This option has no effect if @command{gawk} is not compiled to use the GNU MPFR and MP libraries (@pxref{Gawk and MPFR}). -@item -n -@itemx --non-decimal-data +@item @option{-n} +@itemx @option{--non-decimal-data} @cindex @option{-n} option @cindex @option{--non-decimal-data} option @cindex hexadecimal values@comma{} enabling interpretation of @@ -3568,34 +3598,41 @@ This option can severely break old programs. Use with care. @end quotation -@item -N -@itemx --use-lc-numeric +@item @option{-N} +@itemx @option{--use-lc-numeric} @cindex @option{-N} option @cindex @option{--use-lc-numeric} option Force the use of the locale's decimal point character when parsing numeric input data (@pxref{Locales}). 
-@item -o@r{[}@var{file}@r{]} -@itemx --pretty-print@r{[}=@var{file}@r{]} +@item @option{-o}[@var{file}] +@itemx @option{--pretty-print}[@code{=}@var{file}] @cindex @option{-o} option @cindex @option{--pretty-print} option Enable pretty-printing of @command{awk} programs. -By default, output program is created in a file named @file{awkprof.out}. +By default, output program is created in a file named @file{awkprof.out} +(@pxref{Profiling}). The optional @var{file} argument allows you to specify a different file name for the output. No space is allowed between the @option{-o} and @var{file}, if @var{file} is supplied. -@item -O -@itemx --optimize +@quotation NOTE +Due to the way @command{gawk} has evolved, with this option +your program is still executed. This will change in the +next major release such that @command{gawk} will only +pretty-print the program and not run it. +@end quotation + +@item @option{-O} +@itemx @option{--optimize} @cindex @option{--optimize} option @cindex @option{-O} option Enable some optimizations on the internal representation of the program. -At the moment this includes just simple constant folding. The @command{gawk} -maintainer hopes to add more optimizations over time. +At the moment this includes just simple constant folding. -@item -p@r{[}@var{file}@r{]} -@itemx --profile@r{[}=@var{file}@r{]} +@item @option{-p}[@var{file}] +@itemx @option{--profile}[@code{=}@var{file}] @cindex @option{-p} option @cindex @option{--profile} option @cindex @command{awk} profiling, enabling @@ -3610,8 +3647,8 @@ No space is allowed between the @option{-p} and @var{file}, if The profile contains execution counts for each statement in the program in the left margin, and function call counts for each function. -@item -P -@itemx --posix +@item @option{-P} +@itemx @option{--posix} @cindex @option{-P} option @cindex @option{--posix} option @cindex POSIX mode @@ -3658,10 +3695,10 @@ data (@pxref{Locales}). 
@cindex @option{--posix} option, @code{--traditional} option and If you supply both @option{--traditional} and @option{--posix} on the command line, @option{--posix} takes precedence. @command{gawk} -also issues a warning if both options are supplied. +issues a warning if both options are supplied. -@item -r -@itemx --re-interval +@item @option{-r} +@itemx @option{--re-interval} @cindex @option{-r} option @cindex @option{--re-interval} option @cindex regular expressions, interval expressions and @@ -3670,10 +3707,10 @@ Allow interval expressions in regexps. This is now @command{gawk}'s default behavior. Nevertheless, this option remains both for backward compatibility, -and for use in combination with the @option{--traditional} option. +and for use in combination with @option{--traditional}. -@item -S -@itemx --sandbox +@item @option{-S} +@itemx @option{--sandbox} @cindex @option{-S} option @cindex @option{--sandbox} option @cindex sandbox mode @@ -3685,16 +3722,16 @@ This is particularly useful when you want to run @command{awk} scripts from questionable sources and need to make sure the scripts can't access your system (other than the specified input data file). -@item -t -@itemx --lint-old +@item @option{-t} +@itemx @option{--lint-old} @cindex @option{-L} option @cindex @option{--lint-old} option Warn about constructs that are not available in the original version of @command{awk} from Version 7 Unix (@pxref{V7/SVR3.1}). -@item -V -@itemx --version +@item @option{-V} +@itemx @option{--version} @cindex @option{-V} option @cindex @option{--version} option @cindex @command{gawk}, versions of, information about@comma{} printing @@ -3736,13 +3773,13 @@ type @kbd{Ctrl-d} (the end-of-file character) to terminate it. input but then you will not be able to also use the standard input as a source of data.) 
-Because it is clumsy using the standard @command{awk} mechanisms to mix source -file and command-line @command{awk} programs, @command{gawk} provides the -@option{--source} option. This does not require you to pre-empt the standard -input for your source code; it allows you to easily mix command-line -and library source code -(@pxref{AWKPATH Variable}). -The @option{--source} option may also be used multiple times on the command line. +Because it is clumsy using the standard @command{awk} mechanisms to mix +source file and command-line @command{awk} programs, @command{gawk} +provides the @option{--source} option. This does not require you to +pre-empt the standard input for your source code; it allows you to easily +mix command-line and library source code (@pxref{AWKPATH Variable}). +As with @option{-f}, the @option{--source} and @option{--include} +options may also be used multiple times on the command line. @cindex @option{--source} option If no @option{-f} or @option{--source} option is specified, then @command{gawk} @@ -3754,7 +3791,7 @@ program source code. @cindex POSIX mode If the environment variable @env{POSIXLY_CORRECT} exists, then @command{gawk} behaves in strict POSIX mode, exactly as if -you had supplied the @option{--posix} command-line option. +you had supplied @option{--posix}. Many GNU programs look for this environment variable to suppress extensions that conflict with POSIX, but @command{gawk} behaves differently: it suppresses all extensions, even those that do not @@ -3833,7 +3870,7 @@ The variable values given on the command line are processed for escape sequences (@pxref{Escape Sequences}). @value{DARKCORNER} -In some earlier implementations of @command{awk}, when a variable assignment +In some very early implementations of @command{awk}, when a variable assignment occurred before any file names, the assignment would happen @emph{before} the @code{BEGIN} rule was executed. 
@command{awk}'s behavior was thus inconsistent; some command-line assignments were available inside the @@ -3845,7 +3882,7 @@ upon the old behavior. The variable assignment feature is most useful for assigning to variables such as @code{RS}, @code{OFS}, and @code{ORS}, which control input and -output formats before scanning the data files. It is also useful for +output formats, before scanning the data files. It is also useful for controlling state if multiple passes are needed over a data file. For example: @@ -3887,7 +3924,7 @@ with @code{getline}. Some other versions of @command{awk} also support this, but it is not standard. (Some operating systems provide a @file{/dev/stdin} file -in the file system, however, @command{gawk} always processes +in the file system; however, @command{gawk} always processes this file name itself.) @node Environment Variables @@ -3935,7 +3972,7 @@ directory is the value of @samp{$(datadir)} generated when @command{gawk} was configured. You probably don't need to worry about this, though.} -The search path feature is particularly useful for building libraries +The search path feature is particularly helpful for building libraries of useful @command{awk} functions. The library files can be placed in a standard directory in the default path and then specified on the command line with a short file name. Otherwise, the full file name @@ -3952,11 +3989,13 @@ If the source code is not found after the initial search, the path is searched again after adding the default @samp{.awk} suffix to the filename. @quotation NOTE +@c 4/2014: +@c using @samp{.} to get quotes, since @file{} no longer supplies them. To include the current directory in the path, either place -@file{.} explicitly in the path or write a null entry in the +@samp{.} explicitly in the path or write a null entry in the path. (A null entry is indicated by starting or ending the path with a -colon or by placing two colons next to each other (@samp{::}).) 
+colon or by placing two colons next to each other [@samp{::}].) This path search mechanism is similar to the shell's. @c someday, @cite{The Bourne Again Shell}.... @@ -3971,7 +4010,7 @@ the current directory in the search path. If @env{AWKPATH} is not defined in the environment, @command{gawk} places its default search path into @code{ENVIRON["AWKPATH"]}. This makes it easy to determine -the actual search path that @command{gawk} will use +the actual search path that @command{gawk} used from within an @command{awk} program. While you can change @code{ENVIRON["AWKPATH"]} within your @command{awk} @@ -3983,18 +4022,18 @@ found, and @command{gawk} no longer needs to use @env{AWKPATH}. @node AWKLIBPATH Variable @subsection The @env{AWKLIBPATH} Environment Variable @cindex @env{AWKLIBPATH} environment variable -@cindex directories, searching for shared libraries -@cindex search paths, for shared libraries +@cindex directories, searching for loadable extensions +@cindex search paths, for loadable extensions @cindex differences in @command{awk} and @command{gawk}, @code{AWKLIBPATH} environment variable The @env{AWKLIBPATH} environment variable is similar to the @env{AWKPATH} -variable, but it is used to search for shared libraries specified -with the @option{-l} option rather than for source files. If the library -is not found, the path is searched again after adding the appropriate -shared library suffix for the platform. For example, on GNU/Linux systems, -the suffix @samp{.so} is used. -The search path specified is also used for libraries loaded via the -@samp{@@load} keyword (@pxref{Loading Shared Libraries}). +variable, but it is used to search for loadable extensions (stored as +system shared libraries) specified with the @option{-l} option rather +than for source files. If the extension is not found, the path is +searched again after adding the appropriate shared library suffix for +the platform. For example, on GNU/Linux systems, the suffix @samp{.so} +is used. 
The search path specified is also used for extensions loaded +via the @code{@@load} keyword (@pxref{Loading Shared Libraries}). @node Other Environment Variables @subsection Other Environment Variables @@ -4010,7 +4049,7 @@ mode, disabling all traditional and GNU extensions. @xref{Options}. @item GAWK_SOCK_RETRIES -Controls the number of time @command{gawk} will attempt to +Controls the number of times @command{gawk} attempts to retry a two-way TCP/IP (socket) connection before giving up. @xref{TCP/IP Networking}. @@ -4058,6 +4097,11 @@ two regexp matchers that @command{gawk} uses internally. (There aren't supposed to be differences, but occasionally theory and practice don't coordinate with each other.) +@item GAWK_NO_PP_RUN +If this variable exists, then when invoked with the @option{--pretty-print} +option, @command{gawk} skips running the program. This variable will +not survive into the next major release. + @item GAWK_STACKSIZE This specifies the amount by which @command{gawk} should grow its internal evaluation stack, when needed. @@ -4102,13 +4146,13 @@ to @code{EXIT_FAILURE}. This @value{SECTION} describes a feature that is specific to @command{gawk}. -The @samp{@@include} keyword can be used to read external @command{awk} source +The @code{@@include} keyword can be used to read external @command{awk} source files. This gives you the ability to split large @command{awk} source files into smaller, more manageable pieces, and also lets you reuse common @command{awk} code from various @command{awk} scripts. In other words, you can group together @command{awk} functions, used to carry out specific tasks, into external files. These files can be used just like function libraries, -using the @samp{@@include} keyword in conjunction with the @env{AWKPATH} +using the @code{@@include} keyword in conjunction with the @env{AWKPATH} environment variable. Note that source files may also be included using the @option{-i} option. 
@@ -4142,14 +4186,14 @@ $ @kbd{gawk -f test2} @end example @code{gawk} runs the @file{test2} script which includes @file{test1} -using the @samp{@@include} +using the @code{@@include} keyword. So, to include external @command{awk} source files you just -use @samp{@@include} followed by the name of the file to be included, +use @code{@@include} followed by the name of the file to be included, enclosed in double quotes. @quotation NOTE Keep in mind that this is a language construct and the file name cannot -be a string variable, but rather just a literal string in double quotes. +be a string variable, but rather just a literal string constant in double quotes. @end quotation The files to be included may be nested; e.g., given a third @@ -4188,47 +4232,48 @@ or: @noindent are valid. The @code{AWKPATH} environment variable can be of great -value when using @samp{@@include}. The same rules for the use +value when using @code{@@include}. The same rules for the use of the @code{AWKPATH} variable in command-line file searches (@pxref{AWKPATH Variable}) apply to -@samp{@@include} also. +@code{@@include} also. This is very helpful in constructing @command{gawk} function libraries. If you have a large script with useful, general purpose @command{awk} functions, you can break it down into library files and put those files in a special directory. You can then include those ``libraries,'' using either the full pathnames of the files, or by setting the @code{AWKPATH} -environment variable accordingly and then using @samp{@@include} with +environment variable accordingly and then using @code{@@include} with just the file part of the full pathname. Of course you can have more than one directory to keep library files; the more complex the working environment is, the more directories you may need to organize the files to be included. Given the ability to specify multiple @option{-f} options, the -@samp{@@include} mechanism is not strictly necessary. 
-However, the @samp{@@include} keyword +@code{@@include} mechanism is not strictly necessary. +However, the @code{@@include} keyword can help you in constructing self-contained @command{gawk} programs, thus reducing the need for writing complex and tedious command lines. -In particular, @samp{@@include} is very useful for writing CGI scripts +In particular, @code{@@include} is very useful for writing CGI scripts to be run from web pages. As mentioned in @ref{AWKPATH Variable}, the current directory is always searched first for source files, before searching in @env{AWKPATH}, -and this also applies to files named with @samp{@@include}. +and this also applies to files named with @code{@@include}. @node Loading Shared Libraries -@section Loading Shared Libraries Into Your Program +@section Loading Dynamic Extensions Into Your Program This @value{SECTION} describes a feature that is specific to @command{gawk}. -The @samp{@@load} keyword can be used to read external @command{awk} shared -libraries. This allows you to link in compiled code that may offer superior +The @code{@@load} keyword can be used to read external @command{awk} extensions +(stored as system shared libraries). +This allows you to link in compiled code that may offer superior performance and/or give you access to extended capabilities not supported by the @command{awk} language. The @env{AWKLIBPATH} variable is used to -search for the shared library. Using @samp{@@load} is completely equivalent +search for the extension. Using @code{@@load} is completely equivalent to using the @option{-l} command-line option. -If the shared library is not initially found in @env{AWKLIBPATH}, another +If the extension is not initially found in @env{AWKLIBPATH}, another search is conducted after appending the platform's default shared library suffix to the filename. For example, on GNU/Linux systems, the suffix @samp{.so} is used. 
@@ -4248,11 +4293,11 @@ $ @kbd{gawk -lordchr 'BEGIN @{print chr(65)@}'} @noindent For command-line usage, the @option{-l} option is more convenient, -but @samp{@@load} is useful for embedding inside an @command{awk} source file -that requires access to a shared library. +but @code{@@load} is useful for embedding inside an @command{awk} source file +that requires access to an extension. @ref{Dynamic Extensions}, describes how to write extensions (in C or C++) -that can be loaded with either @samp{@@load} or the @option{-l} option. +that can be loaded with either @code{@@load} or the @option{-l} option. @node Obsolete @section Obsolete Options and/or Features @@ -4398,8 +4443,8 @@ A regular expression can be used as a pattern by enclosing it in slashes. Then the regular expression is tested against the entire text of each record. (Normally, it only needs to match some part of the text in order to succeed.) For example, the -following prints the second field of each record that contains the string -@samp{li} anywhere in it: +following prints the second field of each record where the string +@samp{li} appears anywhere in the record: @example $ @kbd{awk '/li/ @{ print $2 @}' mail-list} @@ -4529,7 +4574,7 @@ A literal backslash, @samp{\}. @cindex backslash (@code{\}), @code{\a} escape sequence @item \a The ``alert'' character, @kbd{Ctrl-g}, ASCII code 7 (BEL). -(This usually makes some sort of audible noise.) +(This often makes some sort of audible noise.) @cindex @code{\} (backslash), @code{\b} escape sequence @cindex backslash (@code{\}), @code{\b} escape sequence @@ -4717,10 +4762,11 @@ the very first step in processing regexps. Here is a list of metacharacters. All characters that are not escape sequences and that are not listed in the table stand for themselves: -@table @code +@c Use @asis so the docbook comes out ok. Sigh. 
+@table @asis @cindex backslash (@code{\}), regexp operator @cindex @code{\} (backslash), regexp operator -@item \ +@item @code{\} This is used to suppress the special meaning of a character when matching. For example, @samp{\$} matches the character @samp{$}. @@ -4729,7 +4775,7 @@ matches the character @samp{$}. @cindex Texinfo, chapter beginnings in files @cindex @code{^} (caret), regexp operator @cindex caret (@code{^}), regexp operator -@item ^ +@item @code{^} This matches the beginning of a string. For example, @samp{^@@chapter} matches @samp{@@chapter} at the beginning of a string and can be used to identify chapter beginnings in Texinfo source files. @@ -4737,7 +4783,7 @@ The @samp{^} is known as an @dfn{anchor}, because it anchors the pattern to match only at the beginning of the string. It is important to realize that @samp{^} does not match the beginning of -a line embedded in a string. +a line (the point right after a @samp{\n} newline character) embedded in a string. The condition is not true in the following example: @example @@ -4746,11 +4792,13 @@ if ("line1\nLINE 2" ~ /^L/) @dots{} @cindex @code{$} (dollar sign), regexp operator @cindex dollar sign (@code{$}), regexp operator -@item $ +@item @code{$} This is similar to @samp{^}, but it matches only at the end of a string. For example, @samp{p$} matches a record that ends with a @samp{p}. The @samp{$} is an anchor -and does not match the end of a line embedded in a string. +and does not match the end of a line +(the point right before a @samp{\n} newline character) +embedded in a string. The condition in the following example is not true: @example @@ -4759,7 +4807,7 @@ if ("line1\nLINE 2" ~ /1$/) @dots{} @cindex @code{.} (period), regexp operator @cindex period (@code{.}), regexp operator -@item . @r{(period)} +@item @code{.} (period) This matches any single character, @emph{including} the newline character. For example, @samp{.P} matches any single character followed by a @samp{P} in a string. 
Using @@ -4780,7 +4828,7 @@ may not be able to match the @sc{nul} character. @cindex character sets, See Also bracket expressions @cindex character lists, See bracket expressions @cindex character classes, See bracket expressions -@item [@dots{}] +@item @code{[}@dots{}@code{]} This is called a @dfn{bracket expression}.@footnote{In other literature, you may see a bracket expression referred to as either a @dfn{character set}, a @dfn{character class}, or a @dfn{character list}.} @@ -4792,7 +4840,7 @@ is given in @ref{Bracket Expressions}. @cindex bracket expressions, complemented -@item [^ @dots{}] +@item @code{[^}@dots{}@code{]} This is a @dfn{complemented bracket expression}. The first character after the @samp{[} @emph{must} be a @samp{^}. It matches any characters @emph{except} those in the square brackets. For example, @samp{[^awk]} @@ -4801,7 +4849,7 @@ or @samp{k}. @cindex @code{|} (vertical bar) @cindex vertical bar (@code{|}) -@item | +@item @code{|} This is the @dfn{alternation operator} and it is used to specify alternatives. The @samp{|} has the lowest precedence of all the regular @@ -4814,7 +4862,7 @@ The alternation applies to the largest possible regexps on either side. @cindex @code{()} (parentheses), regexp operator @cindex parentheses @code{()}, regexp operator -@item (@dots{}) +@item @code{(}@dots{}@code{)} Parentheses are used for grouping in regular expressions, as in arithmetic. They can be used to concatenate regular expressions containing the alternation operator, @samp{|}. For example, @@ -4825,7 +4873,7 @@ explained further on in this list.) @cindex @code{*} (asterisk), @code{*} operator, as regexp operator @cindex asterisk (@code{*}), @code{*} operator, as regexp operator -@item * +@item @code{*} This symbol means that the preceding regular expression should be repeated as many times as necessary to find a match. 
For example, @samp{ph*} applies the @samp{*} symbol to the preceding @samp{h} and looks for matches @@ -4843,11 +4891,11 @@ with backslashes. @cindex @code{+} (plus sign), regexp operator @cindex plus sign (@code{+}), regexp operator -@item + +@item @code{+} This symbol is similar to @samp{*}, except that the preceding expression must be matched at least once. This means that @samp{wh+y} would match @samp{why} and @samp{whhy}, but not @samp{wy}, whereas -@samp{wh*y} would match all three of these strings. +@samp{wh*y} would match all three. The following is a simpler way of writing the last @samp{*} example: @@ -4857,15 +4905,15 @@ awk '/\(c[ad]+r x\)/ @{ print @}' sample @cindex @code{?} (question mark), regexp operator @cindex question mark (@code{?}), regexp operator -@item ? +@item @code{?} This symbol is similar to @samp{*}, except that the preceding expression can be matched either once or not at all. For example, @samp{fe?d} matches @samp{fed} and @samp{fd}, but nothing else. @cindex interval expressions, regexp operator -@item @{@var{n}@} -@itemx @{@var{n},@} -@itemx @{@var{n},@var{m}@} +@item @code{@{}@var{n}@code{@}} +@itemx @code{@{}@var{n}@code{,@}} +@itemx @code{@{}@var{n}@code{,}@var{m}@code{@}} One or two numbers inside braces denote an @dfn{interval expression}. If there is one number in the braces, the preceding regexp is repeated @var{n} times. @@ -5288,10 +5336,12 @@ This works in any POSIX-compliant @command{awk}. Another method, specific to @command{gawk}, is to set the variable @code{IGNORECASE} to a nonzero value (@pxref{Built-in Variables}). When @code{IGNORECASE} is not zero, @emph{all} regexp and string -operations ignore case. Changing the value of -@code{IGNORECASE} dynamically controls the case-sensitivity of the -program as it runs. Case is significant by default because -@code{IGNORECASE} (like most variables) is initialized to zero: +operations ignore case. 
+ +Changing the value of @code{IGNORECASE} dynamically controls the +case-sensitivity of the program as it runs. Case is significant by +default because @code{IGNORECASE} (like most variables) is initialized +to zero: @example x = "aB" @@ -5321,9 +5371,6 @@ case-sensitivity on or off for all the rules at once. Setting @code{IGNORECASE} from the command line is a way to make a program case-insensitive without having to edit it. -Both regexp and string comparison -operations are affected by @code{IGNORECASE}. - @c @cindex ISO 8859-1 @c @cindex ISO Latin-1 In multibyte locales, @@ -5401,7 +5448,7 @@ regexp constant (i.e., a string of characters between slashes). It may be any expression. The expression is evaluated and converted to a string if necessary; the contents of the string are then used as the regexp. A regexp computed in this way is called a @dfn{dynamic -regexp}: +regexp} or a @dfn{computed regexp}: @example BEGIN @{ digits_regexp = "[[:digit:]]+" @} @@ -5470,7 +5517,7 @@ intend a regexp match. @cindex regular expressions, dynamic, with embedded newlines @cindex newlines, in dynamic regexps -Some commercial versions of @command{awk} do not allow the newline +Some versions of @command{awk} do not allow the newline character to be used inside a bracket expression for a dynamic regexp: @example @@ -6365,7 +6412,7 @@ $ @kbd{echo ' a b c d ' | awk 'BEGIN @{ FS = "[ \t\n]+" @}} @cindex null strings @cindex strings, null @cindex empty strings, See null strings -In this case, the first field is @dfn{null} or empty. +In this case, the first field is null, or empty. The stripping of leading and trailing whitespace also comes into play whenever @code{$0} is recomputed. For instance, study this pipeline: @@ -7356,19 +7403,19 @@ Such a record is replaced by the contents of the file Note here how the name of the extra input file is not built into the program; it is taken directly from the data, specifically from the second field on -the @samp{@@include} line. 
+the @code{@@include} line. The @code{close()} function is called to ensure that if two identical -@samp{@@include} lines appear in the input, the entire specified file is +@code{@@include} lines appear in the input, the entire specified file is included twice. @xref{Close Files And Pipes}. One deficiency of this program is that it does not process nested -@samp{@@include} statements -(i.e., @samp{@@include} statements in included files) +@code{@@include} statements +(i.e., @code{@@include} statements in included files) the way a true macro preprocessor would. @xref{Igawk Program}, for a program -that does handle nested @samp{@@include} statements. +that does handle nested @code{@@include} statements. @node Getline/Pipe @subsection Using @code{getline} from a Pipe @@ -8184,8 +8231,9 @@ of value to print. The rest of the format specifier is made up of optional @dfn{modifiers} that control @emph{how} to print the value, such as the field width. Here is a list of the format-control letters: -@table @code -@item %c +@c @asis for docbook to come out right +@table @asis +@item @code{%c} Print a number as an ASCII character; thus, @samp{printf "%c", 65} outputs the letter @samp{A}. The output for a string value is the first character of the string. @@ -8217,12 +8265,12 @@ a single byte (0--255). @end quotation -@item %d@r{,} %i +@item @code{%d}, @code{%i} Print a decimal integer. The two control letters are equivalent. (The @samp{%i} specification is for compatibility with ISO C.) -@item %e@r{,} %E +@item @code{%e}, @code{%E} Print a number in scientific (exponential) notation; for example: @@ -8237,7 +8285,7 @@ which follow the decimal point. discussed in the next @value{SUBSECTION}.) @samp{%E} uses @samp{E} instead of @samp{e} in the output. -@item %f +@item @code{%f} Print a number in floating-point notation. For example: @@ -8259,37 +8307,37 @@ and positive infinity as @samp{inf} and @samp{infinity}. 
The special ``not a number'' value formats as @samp{-nan} or @samp{nan}. -@item %F +@item @code{%F} Like @samp{%f} but the infinity and ``not a number'' values are spelled using uppercase letters. The @samp{%F} format is a POSIX extension to ISO C; not all systems support it. On those that don't, @command{gawk} uses @samp{%f} instead. -@item %g@r{,} %G +@item @code{%g}, @code{%G} Print a number in either scientific notation or in floating-point notation, whichever uses fewer characters; if the result is printed in scientific notation, @samp{%G} uses @samp{E} instead of @samp{e}. -@item %o +@item @code{%o} Print an unsigned octal integer (@pxref{Nondecimal-numbers}). -@item %s +@item @code{%s} Print a string. -@item %u +@item @code{%u} Print an unsigned decimal integer. (This format is of marginal use, because all numbers in @command{awk} are floating-point; it is provided primarily for compatibility with C.) -@item %x@r{,} %X +@item @code{%x}, @code{%X} Print an unsigned hexadecimal integer; @samp{%X} uses the letters @samp{A} through @samp{F} instead of @samp{a} through @samp{f} (@pxref{Nondecimal-numbers}). -@item %% +@item @code{%%} Print a single @samp{%}. This does not consume an argument and it ignores any modifiers. @@ -11285,28 +11333,28 @@ expression because the first @samp{$} has higher precedence than the This table presents @command{awk}'s operators, in order of highest to lowest precedence: -@c use @code in the items, looks better in TeX w/o all the quotes -@table @code -@item (@dots{}) +@c @asis for docbook to come out right +@table @asis +@item @code{(}@dots{}@code{)} Grouping. @cindex @code{$} (dollar sign), @code{$} field operator @cindex dollar sign (@code{$}), @code{$} field operator -@item $ +@item @code{$} Field reference. 
@cindex @code{+} (plus sign), @code{++} operator @cindex plus sign (@code{+}), @code{++} operator @cindex @code{-} (hyphen), @code{--} operator @cindex hyphen (@code{-}), @code{--} operator -@item ++ -- +@item @code{++ --} Increment, decrement. @cindex @code{^} (caret), @code{^} operator @cindex caret (@code{^}), @code{^} operator @cindex @code{*} (asterisk), @code{**} operator @cindex asterisk (@code{*}), @code{**} operator -@item ^ ** +@item @code{^ **} Exponentiation. These operators group right-to-left. @cindex @code{+} (plus sign), @code{+} operator @@ -11315,7 +11363,7 @@ Exponentiation. These operators group right-to-left. @cindex hyphen (@code{-}), @code{-} operator @cindex @code{!} (exclamation point), @code{!} operator @cindex exclamation point (@code{!}), @code{!} operator -@item + - ! +@item @code{+ - !} Unary plus, minus, logical ``not.'' @cindex @code{*} (asterisk), @code{*} operator, as multiplication operator @@ -11324,17 +11372,17 @@ Unary plus, minus, logical ``not.'' @cindex forward slash (@code{/}), @code{/} operator @cindex @code{%} (percent sign), @code{%} operator @cindex percent sign (@code{%}), @code{%} operator -@item * / % +@item @code{* / %} Multiplication, division, remainder. @cindex @code{+} (plus sign), @code{+} operator @cindex plus sign (@code{+}), @code{+} operator @cindex @code{-} (hyphen), @code{-} operator @cindex hyphen (@code{-}), @code{-} operator -@item + - +@item @code{+ -} Addition, subtraction. -@item @r{String Concatenation} +@item String Concatenation There is no special symbol for concatenation. The operands are simply written side by side (@pxref{Concatenation}). @@ -11360,7 +11408,7 @@ The operands are simply written side by side @cindex @code{|} (vertical bar), @code{|&} operator (I/O) @cindex vertical bar (@code{|}), @code{|&} operator (I/O) @cindex operators, input/output -@item < <= == != > >= >> | |& +@item @code{< <= == != > >= >> | |&} Relational and redirection. 
The relational operators and the redirections have the same precedence level. Characters such as @samp{>} serve both as relationals and as @@ -11381,26 +11429,26 @@ The correct way to write this statement is @samp{print foo > (a ? b : c)}. @cindex tilde (@code{~}), @code{~} operator @cindex @code{!} (exclamation point), @code{!~} operator @cindex exclamation point (@code{!}), @code{!~} operator -@item ~ !~ +@item @code{~ !~} Matching, nonmatching. @cindex @code{in} operator -@item in +@item @code{in} Array membership. @cindex @code{&} (ampersand), @code{&&} operator @cindex ampersand (@code{&}), @code{&&} operator -@item && +@item @code{&&} Logical ``and''. @cindex @code{|} (vertical bar), @code{||} operator @cindex vertical bar (@code{|}), @code{||} operator -@item || +@item @code{||} Logical ``or''. @cindex @code{?} (question mark), @code{?:} operator @cindex question mark (@code{?}), @code{?:} operator -@item ?: +@item @code{?:} Conditional. This operator groups right-to-left. @cindex @code{+} (plus sign), @code{+=} operator @@ -11417,7 +11465,7 @@ Conditional. This operator groups right-to-left. @cindex percent sign (@code{%}), @code{%=} operator @cindex @code{^} (caret), @code{^=} operator @cindex caret (@code{^}), @code{^=} operator -@item = += -= *= /= %= ^= **= +@item @code{= += -= *= /= %= ^= **=} Assignment. These operators group right-to-left. @end table @@ -13181,11 +13229,12 @@ sets automatically on certain occasions in order to provide information to your program. The variables that are specific to @command{gawk} are marked with a pound sign@w{ (@samp{#}).} -@table @code +@c @asis for docbook +@table @asis @cindex @code{ARGC}/@code{ARGV} variables @cindex arguments, command-line @cindex command line, arguments -@item ARGC@r{,} ARGV +@item @code{ARGC}, @code{ARGV} The command-line arguments available to @command{awk} programs are stored in an array called @code{ARGV}. @code{ARGC} is the number of command-line arguments present. @xref{Other Arguments}. 
@@ -13225,7 +13274,7 @@ about how @command{awk} uses these variables. @cindex @code{ARGIND} variable @cindex differences in @command{awk} and @command{gawk}, @code{ARGIND} variable -@item ARGIND # +@item @code{ARGIND} # The index in @code{ARGV} of the current file being processed. Every time @command{gawk} opens a new data file for processing, it sets @code{ARGIND} to the index in @code{ARGV} of the file name. @@ -13250,7 +13299,7 @@ it is not special. @cindex @code{ENVIRON} array @cindex environment variables, in @code{ENVIRON} array -@item ENVIRON +@item @code{ENVIRON} An associative array containing the values of the environment. The array indices are the environment variable names; the elements are the values of the particular environment variables. For example, @@ -13270,7 +13319,7 @@ On such systems, the @code{ENVIRON} array is empty (except for @cindex @code{ERRNO} variable @cindex differences in @command{awk} and @command{gawk}, @code{ERRNO} variable @cindex error handling, @code{ERRNO} variable and -@item ERRNO # +@item @code{ERRNO} # If a system error occurs during a redirection for @code{getline}, during a read for @code{getline}, or during a @code{close()} operation, then @code{ERRNO} contains a string describing the error. @@ -13297,7 +13346,7 @@ it is not special. @cindex @code{FILENAME} variable @cindex dark corner, @code{FILENAME} variable -@item FILENAME +@item @code{FILENAME} The name of the file that @command{awk} is currently reading. When no data files are listed on the command line, @command{awk} reads from the standard input and @code{FILENAME} is set to @code{"-"}. @@ -13316,14 +13365,14 @@ inside a @code{BEGIN} rule can give @code{FILENAME} a value. @cindex @code{FNR} variable -@item FNR +@item @code{FNR} The current record number in the current file. @code{FNR} is incremented each time a new record is read (@pxref{Records}). It is reinitialized to zero each time a new input file is started. 
@cindex @code{NF} variable -@item NF +@item @code{NF} The number of fields in the current input record. @code{NF} is set each time a new record is read, when a new field is created or when @code{$0} changes (@pxref{Fields}). @@ -13337,7 +13386,7 @@ current record. @xref{Changing Fields}. @cindex @code{FUNCTAB} array @cindex @command{gawk}, @code{FUNCTAB} array in @cindex differences in @command{awk} and @command{gawk}, @code{FUNCTAB} variable -@item FUNCTAB # +@item @code{FUNCTAB} # An array whose indices and corresponding values are the names of all the user-defined or extension functions in the program. @@ -13348,7 +13397,7 @@ the @code{FUNCTAB} array will also cause a fatal error. @end quotation @cindex @code{NR} variable -@item NR +@item @code{NR} The number of input records @command{awk} has processed since the beginning of the program's execution (@pxref{Records}). @@ -13357,7 +13406,7 @@ the beginning of the program's execution @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array @cindex differences in @command{awk} and @command{gawk}, @code{PROCINFO} array -@item PROCINFO # +@item @code{PROCINFO} # The elements of this array provide access to information about the running @command{awk} program. The following elements (listed alphabetically) @@ -13514,7 +13563,7 @@ or if @command{gawk} is in compatibility mode it is not special. @cindex @code{RLENGTH} variable -@item RLENGTH +@item @code{RLENGTH} The length of the substring matched by the @code{match()} function (@pxref{String Functions}). @@ -13522,7 +13571,7 @@ The length of the substring matched by the is the length of the matched string, or @minus{}1 if no match is found. @cindex @code{RSTART} variable -@item RSTART +@item @code{RSTART} The start-index in characters of the substring that is matched by the @code{match()} function (@pxref{String Functions}). @@ -13533,7 +13582,7 @@ if no match was found. 
@cindex @command{gawk}, @code{RT} variable in @cindex @code{RT} variable @cindex differences in @command{awk} and @command{gawk}, @code{RT} variable -@item RT # +@item @code{RT} # This is set each time a record is read. It contains the input text that matched the text denoted by @code{RS}, the record separator. @@ -13546,7 +13595,7 @@ it is not special. @cindex @command{gawk}, @code{SYMTAB} array in @cindex @code{SYMTAB} array @cindex differences in @command{awk} and @command{gawk}, @code{SYMTAB} variable -@item SYMTAB # +@item @code{SYMTAB} # An array whose indices are the names of all currently defined global variables and arrays in the program. The array may be used for indirect access to read or write the value of a variable: @@ -15023,26 +15072,27 @@ The following list describes all of the built-in functions that work with numbers. Optional parameters are enclosed in square brackets@w{ ([ ]):} -@table @code -@item atan2(@var{y}, @var{x}) +@c @asis for docbook +@table @asis +@item @code{atan2(@var{y}, @var{x})} @cindexawkfunc{atan2} @cindex arctangent Return the arctangent of @code{@var{y} / @var{x}} in radians. You can use @samp{pi = atan2(0, -1)} to retrieve the value of @value{PI}. -@item cos(@var{x}) +@item @code{cos(@var{x})} @cindexawkfunc{cos} @cindex cosine Return the cosine of @var{x}, with @var{x} in radians. -@item exp(@var{x}) +@item @code{exp(@var{x})} @cindexawkfunc{exp} @cindex exponent Return the exponential of @var{x} (@code{e ^ @var{x}}) or report an error if @var{x} is out of range. The range of values @var{x} can have depends on your machine's floating-point representation. -@item int(@var{x}) +@item @code{int(@var{x})} @cindexawkfunc{int} @cindex round to nearest integer Return the nearest integer to @var{x}, located between @var{x} and zero and @@ -15051,13 +15101,13 @@ truncated toward zero. For example, @code{int(3)} is 3, @code{int(3.9)} is 3, @code{int(-3.9)} is @minus{}3, and @code{int(-3)} is @minus{}3 as well. 
-@item log(@var{x}) +@item @code{log(@var{x})} @cindexawkfunc{log} @cindex logarithm Return the natural logarithm of @var{x}, if @var{x} is positive; otherwise, report an error. -@item rand() +@item @code{rand()} @cindexawkfunc{rand} @cindex random numbers, @code{rand()}/@code{srand()} functions Return a random number. The values of @code{rand()} are @@ -15115,19 +15165,19 @@ the seed to a value that is different in each run. To do this, use @code{srand()}. @end quotation -@item sin(@var{x}) +@item @code{sin(@var{x})} @cindexawkfunc{sin} @cindex sine Return the sine of @var{x}, with @var{x} in radians. -@item sqrt(@var{x}) +@item @code{sqrt(@var{x})} @cindexawkfunc{sqrt} @cindex square root Return the positive square root of @var{x}. @command{gawk} prints a warning message if @var{x} is negative. Thus, @code{sqrt(4)} is 2. -@item srand(@r{[}@var{x}@r{]}) +@item @code{srand(}[@var{x}]@code{)} @cindexawkfunc{srand} Set the starting point, or seed, for generating random numbers to the value @var{x}. @@ -15184,9 +15234,10 @@ pound sign@w{ (@samp{#}):} @code{gensub()}. @end menu -@table @code -@item asort(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) # -@itemx asorti(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) # +@c @asis for docbook +@table @asis +@item @code{asort(}@var{source} [@code{,} @var{dest} [@code{,} @var{how} ] ]@code{)} # +@itemx @code{asorti(}@var{source} [@code{,} @var{dest} [@code{,} @var{how} ] ]@code{)} # @cindexgawkfunc{asorti} @cindex sort array @cindex arrays, elements, retrieving number of @@ -15253,7 +15304,7 @@ a[3] = "middle" @code{asort()} and @code{asorti()} are @command{gawk} extensions; they are not available in compatibility mode (@pxref{Options}). 
-@item gensub(@var{regexp}, @var{replacement}, @var{how} @r{[}, @var{target}@r{]}) # +@item @code{gensub(@var{regexp}, @var{replacement}, @var{how}} [@code{, @var{target}}]@code{)} # @cindexgawkfunc{gensub} @cindex search and replace in strings @cindex substitute in string @@ -15318,7 +15369,7 @@ is the original unchanged value of @var{target}. @code{gensub()} is a @command{gawk} extension; it is not available in compatibility mode (@pxref{Options}). -@item gsub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]}) +@item @code{gsub(@var{regexp}, @var{replacement}} [@code{, @var{target}}]@code{)} @cindexawkfunc{gsub} Search @var{target} for @emph{all} of the longest, leftmost, @emph{nonoverlapping} matching @@ -15340,7 +15391,7 @@ omitted, then the entire input record (@code{$0}) is used. As in @code{sub()}, the characters @samp{&} and @samp{\} are special, and the third argument must be assignable. -@item index(@var{in}, @var{find}) +@item @code{index(@var{in}, @var{find})} @cindexawkfunc{index} @cindex search in string @cindex find substring in string @@ -15359,7 +15410,7 @@ If @var{find} is not found, @code{index()} returns zero. It is a fatal error to use a regexp constant for @var{find}. -@item length(@r{[}@var{string}@r{]}) +@item @code{length(}[@var{string}]@code{)} @cindexawkfunc{length} @cindex string length @cindex length of string @@ -15424,7 +15475,7 @@ If @option{--lint} is provided on the command line If @option{--posix} is supplied, using an array argument is a fatal error (@pxref{Arrays}). -@item match(@var{string}, @var{regexp} @r{[}, @var{array}@r{]}) +@item @code{match(@var{string}, @var{regexp}} [@code{, @var{array}}]@code{)} @cindexawkfunc{match} @cindex string, regular expression match @cindex match regexp in string @@ -15541,7 +15592,7 @@ The @var{array} argument to @code{match()} is a (@pxref{Options}), using a third argument is a fatal error. 
-@item patsplit(@var{string}, @var{array} @r{[}, @var{fieldpat} @r{[}, @var{seps} @r{]} @r{]}) # +@item @code{patsplit(@var{string}, @var{array}} [@code{, @var{fieldpat}} [@code{, @var{seps}} ] ]@code{)} # @cindexgawkfunc{patsplit} @cindex split string into array Divide @@ -15573,7 +15624,7 @@ The @code{patsplit()} function is a (@pxref{Options}), it is not available. -@item split(@var{string}, @var{array} @r{[}, @var{fieldsep} @r{[}, @var{seps} @r{]} @r{]}) +@item @code{split(@var{string}, @var{array}} [@code{, @var{fieldsep}} [@code{, @var{seps}} ] ]@code{)} @cindexawkfunc{split} Divide @var{string} into pieces separated by @var{fieldsep} and store the pieces in @var{array} and the separator strings in the @@ -15658,7 +15709,7 @@ If @var{string} does not match @var{fieldsep} at all (but is not null), @var{array} has one element only. The value of that element is the original @var{string}. -@item sprintf(@var{format}, @var{expression1}, @dots{}) +@item @code{sprintf(@var{format}, @var{expression1}, @dots{})} @cindexawkfunc{sprintf} @cindex formatting strings Return (without printing) the string that @code{printf} would @@ -15675,7 +15726,7 @@ assigns the string @w{@samp{pi = 3.14 (approx.)}} to the variable @code{pival}. @cindexgawkfunc{strtonum} @cindex convert string to number -@item strtonum(@var{str}) # +@item @code{strtonum(@var{str})} # Examine @var{str} and return its numeric value. If @var{str} begins with a leading @samp{0}, @code{strtonum()} assumes that @var{str} is an octal number. If @var{str} begins with a leading @samp{0x} or @@ -15700,7 +15751,7 @@ for recognizing numbers (@pxref{Locales}). @code{strtonum()} is a @command{gawk} extension; it is not available in compatibility mode (@pxref{Options}). 
-@item sub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]}) +@item @code{sub(@var{regexp}, @var{replacement}} [@code{, @var{target}}]@code{)} @cindexawkfunc{sub} @cindex replace in string Search @var{target}, which is treated as a string, for the @@ -15801,7 +15852,7 @@ will not run. Finally, if the @var{regexp} is not a regexp constant, it is converted into a string, and then the value of that string is treated as the regexp to match. -@item substr(@var{string}, @var{start} @r{[}, @var{length}@r{]}) +@item @code{substr(@var{string}, @var{start}} [@code{, @var{length}} ]@code{)} @cindexawkfunc{substr} @cindex substring Return a @var{length}-character-long substring of @var{string}, @@ -15861,7 +15912,7 @@ string = substr(string, 1, 2) "CDE" substr(string, 6) @cindex case sensitivity, converting case @cindex strings, converting letter case -@item tolower(@var{string}) +@item @code{tolower(@var{string})} @cindexawkfunc{tolower} @cindex convert string to lower case Return a copy of @var{string}, with each uppercase character @@ -15869,7 +15920,7 @@ in the string replaced with its corresponding lowercase character. Nonalphabetic characters are left unchanged. For example, @code{tolower("MiXeD cAsE 123")} returns @code{"mixed case 123"}. -@item toupper(@var{string}) +@item @code{toupper(@var{string})} @cindexawkfunc{toupper} @cindex convert string to upper case Return a copy of @var{string}, with each lowercase character @@ -16269,8 +16320,8 @@ Although this makes a certain amount of sense, it can be surprising. The following functions relate to input/output (I/O). Optional parameters are enclosed in square brackets ([ ]): -@table @code -@item close(@var{filename} @r{[}, @var{how}@r{]}) +@table @asis +@item @code{close(}@var{filename} [@code{,} @var{how}]@code{)} @cindexawkfunc{close} @cindex files, closing @cindex close file or coprocess @@ -16289,7 +16340,7 @@ not matter. @xref{Two-way I/O}, which discusses this feature in more detail and gives an example. 
-@item fflush(@r{[}@var{filename}@r{]}) +@item @code{fflush(}[@var{filename}]@code{)} @cindexawkfunc{fflush} @cindex flush buffered output Flush any buffered output associated with @var{filename}, which is either a @@ -16349,7 +16400,7 @@ a file or pipe that was opened for reading (such as with @code{getline}), or if @var{filename} is not an open file, pipe, or coprocess. In such a case, @code{fflush()} returns @minus{}1, as well. -@item system(@var{command}) +@item @code{system(@var{command})} @cindexawkfunc{system} @cindex invoke shell command @cindex interacting with other programs @@ -16577,7 +16628,7 @@ is out of range, @code{mktime()} returns @minus{}1. @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array -@item strftime(@r{[}@var{format} @r{[}, @var{timestamp} @r{[}, @var{utc-flag}@r{]]]}) +@item @code{strftime(} [@var{format} [@code{,} @var{timestamp} [@code{,} @var{utc-flag} ]]]@code{)} @c STARTOFRANGE strf @cindexgawkfunc{strftime} @cindex format time string @@ -17061,32 +17112,32 @@ bitwise operations just described. They are: @table @code @cindexgawkfunc{and} @cindex bitwise AND -@item and(@var{v1}, @var{v2} @r{[}, @r{@dots{}]}) +@item @code{and(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} Return the bitwise AND of the arguments. There must be at least two. @cindexgawkfunc{compl} @cindex bitwise complement -@item compl(@var{val}) +@item @code{compl(@var{val})} Return the bitwise complement of @var{val}. @cindexgawkfunc{lshift} @cindex left shift -@item lshift(@var{val}, @var{count}) +@item @code{lshift(@var{val}, @var{count})} Return the value of @var{val}, shifted left by @var{count} bits. @cindexgawkfunc{or} @cindex bitwise OR -@item or(@var{v1}, @var{v2} @r{[}, @r{@dots{}]}) +@item @code{or(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} Return the bitwise OR of the arguments. There must be at least two. 
@cindexgawkfunc{rshift} @cindex right shift -@item rshift(@var{val}, @var{count}) +@item @code{rshift(@var{val}, @var{count})} Return the value of @var{val}, shifted right by @var{count} bits. @cindexgawkfunc{xor} @cindex bitwise XOR -@item xor(@var{v1}, @var{v2} @r{[}, @r{@dots{}]}) +@item @code{xor(@var{v1}, @var{v2}} [@code{,} @dots{}]@code{)} Return the bitwise XOR of the arguments. There must be at least two. @end table @@ -17250,7 +17301,7 @@ Optional parameters are enclosed in square brackets ([ ]): @table @code @cindexgawkfunc{bindtextdomain} @cindex set directory of message catalogs -@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]}) +@item @code{bindtextdomain(@var{directory}} [@code{,} @var{domain} ]@code{)} Set the directory in which @command{gawk} will look for message translation files, in case they will not or cannot be placed in the ``standard'' locations @@ -17264,14 +17315,14 @@ given @var{domain}. @cindexgawkfunc{dcgettext} @cindex translate string -@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcgettext(@var{string}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the translation of @var{string} in text domain @var{domain} for locale category @var{category}. The default value for @var{domain} is the current value of @code{TEXTDOMAIN}. The default value for @var{category} is @code{"LC_MESSAGES"}. @cindexgawkfunc{dcngettext} -@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcngettext(@var{string1}, @var{string2}, @var{number}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the plural form used for @var{number} of the translation of @var{string1} and @var{string2} in text domain @var{domain} for locale category @var{category}. 
@var{string1} is the @@ -18328,9 +18379,8 @@ for (i = 1; i <= n; i++) @part Part II:@* Problem Solving With @command{awk} @end iftex -@ignore @ifdocbook -@part Part II:@* Problem Solving With @command{awk} +@part Problem Solving With @command{awk} Part II shows how to use @command{awk} and @command{gawk} for problem solving. There is lots of code here for you to read and learn from. @@ -18344,7 +18394,6 @@ It contains the following chapters: @ref{Sample Programs}. @end itemize @end ifdocbook -@end ignore @node Library Functions @chapter A Library of @command{awk} Functions @@ -23675,7 +23724,7 @@ BEGIN @{ The following program, @file{igawk.sh}, provides this service. It simulates @command{gawk}'s searching of the @env{AWKPATH} variable and also allows @dfn{nested} includes; i.e., a file that is included -with @samp{@@include} can contain further @samp{@@include} statements. +with @code{@@include} can contain further @code{@@include} statements. @command{igawk} makes an effort to only include files once, so that nested includes don't accidentally include a library function twice. @@ -23713,7 +23762,7 @@ of the file included into the program at the correct point. @item Run an @command{awk} program (naturally) over the shell variable's contents to expand -@samp{@@include} statements. The expanded program is placed in a second +@code{@@include} statements. The expanded program is placed in a second shell variable. @item @@ -23733,24 +23782,25 @@ argument is @samp{debug}. The next part loops through all the command-line arguments. There are several cases of interest: -@table @code -@item -- +@c @asis for docbook +@table @asis +@item @option{--} This ends the arguments to @command{igawk}. Anything else should be passed on to the user's @command{awk} program without being evaluated. -@item -W +@item @option{-W} This indicates that the next option is specific to @command{gawk}. 
To make argument processing easier, the @option{-W} is appended to the front of the remaining arguments and the loop continues. (This is an @command{sh} programming trick. Don't worry about it if you are not familiar with @command{sh}.) -@item -v@r{,} -F +@item @option{-v}, @option{-F} These are saved and passed on to @command{gawk}. -@item -f@r{,} --file@r{,} --file=@r{,} -Wfile= +@item @option{-f}, @option{--file}, @option{--file=}, @option{-Wfile=} The file name is appended to the shell variable @code{program} with an -@samp{@@include} statement. +@code{@@include} statement. The @command{expr} utility is used to remove the leading option part of the argument (e.g., @samp{--file=}). (Typical @command{sh} usage would be to use the @command{echo} and @command{sed} @@ -23758,10 +23808,10 @@ utilities to do this work. Unfortunately, some versions of @command{echo} evalu escape sequences in their arguments, possibly mangling the program text. Using @command{expr} avoids this problem.) -@item --source@r{,} --source=@r{,} -Wsource= +@item @option{--source}, @option{--source=}, @option{-Wsource=} The source text is appended to @code{program}. -@item --version@r{,} -Wversion +@item @option{--version}, @option{-Wversion} @command{igawk} prints its version number, runs @samp{gawk --version} to get the @command{gawk} version information, and then exits. @end table @@ -23869,14 +23919,14 @@ fi @c endfile @end example -The @command{awk} program to process @samp{@@include} directives +The @command{awk} program to process @code{@@include} directives is stored in the shell variable @code{expand_prog}. Doing this keeps the shell script readable. The @command{awk} program reads through the user's program, one line at a time, using @code{getline} (@pxref{Getline}). The input -file names and @samp{@@include} statements are managed using a stack. 
-As each @samp{@@include} is encountered, the current file name is -``pushed'' onto the stack and the file named in the @samp{@@include} +file names and @code{@@include} statements are managed using a stack. +As each @code{@@include} is encountered, the current file name is +``pushed'' onto the stack and the file named in the @code{@@include} directive becomes the current file name. As each file is finished, the stack is ``popped,'' and the previous input file becomes the current input file again. The process is started by making the original file @@ -23952,8 +24002,8 @@ BEGIN @{ The stack is initialized with @code{ARGV[1]}, which will be @samp{/dev/stdin}. The main loop comes next. Input lines are read in succession. Lines that -do not start with @samp{@@include} are printed verbatim. -If the line does start with @samp{@@include}, the file name is in @code{$2}. +do not start with @code{@@include} are printed verbatim. +If the line does start with @code{@@include}, the file name is in @code{$2}. @code{pathto()} is called to generate the full path. If it cannot, then the program prints an error message and continues. @@ -24021,7 +24071,7 @@ It's done in these steps: @enumerate @item -Run @command{gawk} with the @samp{@@include}-processing program (the +Run @command{gawk} with the @code{@@include}-processing program (the value of the @code{expand_prog} shell variable) on standard input. @item @@ -24065,9 +24115,9 @@ There are four key simplifications that make the program work better: @itemize @bullet @item -Using @samp{@@include} even for the files named with @option{-f} makes building +Using @code{@@include} even for the files named with @option{-f} makes building the initial collected @command{awk} program much simpler; all the -@samp{@@include} processing can be done once. +@code{@@include} processing can be done once. @item Not trying to save the line read with @code{getline} @@ -24080,7 +24130,7 @@ considerably. 
@item Using a @code{getline} loop in the @code{BEGIN} rule does it all in one place. It is not necessary to call out to a separate loop for processing -nested @samp{@@include} statements. +nested @code{@@include} statements. @item Instead of saving the expanded program in a temporary file, putting it in a shell variable @@ -24100,7 +24150,7 @@ Finally, @command{igawk} shows that it is not always necessary to add new features to a program; they can often be layered on top. @ignore With @command{igawk}, -there is no real reason to build @samp{@@include} processing into +there is no real reason to build @code{@@include} processing into @command{gawk} itself. @end ignore @@ -24129,8 +24179,8 @@ One user @c Karl Berry, karl@ileaf.com, 10/95 suggested that @command{gawk} be modified to automatically read these files upon startup. Instead, it would be very simple to modify @command{igawk} -to do this. Since @command{igawk} can process nested @samp{@@include} -directives, @file{default.awk} could simply contain @samp{@@include} +to do this. Since @command{igawk} can process nested @code{@@include} +directives, @file{default.awk} could simply contain @code{@@include} statements for the desired library functions. @c Exercise: make this change @@ -24385,10 +24435,8 @@ BEGIN { @part Part III:@* Moving Beyond Standard @command{awk} With @command{gawk} @end iftex -@ignore @ifdocbook - -@part Part III:@* Moving Beyond Standard @command{awk} With @command{gawk} +@part Moving Beyond Standard @command{awk} With @command{gawk} Part III focuses on features specific to @command{gawk}. It contains the following chapters: @@ -24410,7 +24458,6 @@ It contains the following chapters: @ref{Dynamic Extensions}. @end itemize @end ifdocbook -@end ignore @node Advanced Features @chapter Advanced Features of @command{gawk} @@ -25806,7 +25853,7 @@ are candidates for translation at runtime. String constants without a leading underscore are not translated. 
@cindexgawkfunc{dcgettext} -@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcgettext(@var{string}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the translation of @var{string} in text domain @var{domain} for locale category @var{category}. The default value for @var{domain} is the current value of @code{TEXTDOMAIN}. @@ -25832,7 +25879,7 @@ default arguments. @end quotation @cindexgawkfunc{dcngettext} -@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]}) +@item @code{dcngettext(@var{string1}, @var{string2}, @var{number}} [@code{,} @var{domain} [@code{,} @var{category} ]]@code{)} Return the plural form used for @var{number} of the translation of @var{string1} and @var{string2} in text domain @var{domain} for locale category @var{category}. @var{string1} is the @@ -25848,7 +25895,7 @@ The same remarks about argument order as for the @code{dcgettext()} function app @cindex message object files, specifying directory of @cindex files, message object, specifying directory of @cindexgawkfunc{bindtextdomain} -@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]}) +@item @code{bindtextdomain(@var{directory}} [@code{,} @var{domain} ]@code{)} Change the directory in which @code{gettext} looks for @file{.gmo} files, in case they will not or cannot be placed in the standard locations @@ -27352,38 +27399,39 @@ a new value to the named option. The available options are: @c nested table -@table @code -@item history_size +@c asis for docbook +@table @asis +@item @code{history_size} @cindex debugger history size The maximum number of lines to keep in the history file @file{./.gawk_history}. The default is 100. -@item listsize +@item @code{listsize} @cindex debugger default list amount The number of lines that @code{list} prints. The default is 15. 
-@item outfile +@item @code{outfile} @cindex redirect @command{gawk} output, in debugger Send @command{gawk} output to a file; debugger output still goes to standard output. An empty string (@code{""}) resets output to standard output. -@item prompt +@item @code{prompt} @cindex debugger prompt The debugger prompt. The default is @samp{@w{gawk> }}. -@item save_history @r{[}on @r{|} off@r{]} +@item @code{save_history} [@code{on} | @code{off}] @cindex debugger history file Save command history to file @file{./.gawk_history}. The default is @code{on}. -@item save_options @r{[}on @r{|} off@r{]} +@item @code{save_options} [@code{on} | @code{off}] @cindex save debugger options Save current options to file @file{./.gawkrc} upon exit. The default is @code{on}. Options are read back in to the next session upon startup. -@item trace @r{[}on @r{|} off@r{]} +@item @code{trace} [@code{on} | @code{off}] @cindex instruction tracing, in debugger Turn instruction tracing on or off. The default is @code{off}. @end table @@ -27537,7 +27585,7 @@ running a program, the debugger warns you if you accidentally type @cindex debugger commands, @code{trace} @cindex @code{trace} debugger command -@item @code{trace} @code{on} @r{|} @code{off} +@item @code{trace} [@code{on} | @code{off}] Turn on or off a continuous printing of instructions which are about to be executed, along with printing the @command{awk} line which they implement. The default is @code{off}. @@ -31339,8 +31387,9 @@ as described earlier (@pxref{Extension Functions}). It can then be looped over for multiple calls to @code{add_ext_func()}. 
+@c Use @var{OR} for docbook @item static awk_bool_t (*init_func)(void) = NULL; -@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @r{OR} +@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @var{OR} @itemx static awk_bool_t init_my_module(void) @{ @dots{} @} @itemx static awk_bool_t (*init_func)(void) = init_my_module; If you need to do some initialization work, you should define a @@ -32010,7 +32059,7 @@ BEGIN @{ @end example The @env{AWKLIBPATH} environment variable tells -@command{gawk} where to find shared libraries (@pxref{Finding Extensions}). +@command{gawk} where to find extensions (@pxref{Finding Extensions}). We set it to the current directory and run the program: @example @@ -32073,19 +32122,19 @@ Others mainly provide example code that shows how to use the extension API. The @code{filefuncs} extension provides three different functions, as follows: The usage is: -@table @code +@table @asis @item @@load "filefuncs" This is how you load the extension. @cindex @code{chdir()} extension function -@item result = chdir("/some/directory") +@item @code{result = chdir("/some/directory")} The @code{chdir()} function is a direct hook to the @code{chdir()} system call to change the current directory. It returns zero upon success or less than zero upon error. In the latter case it updates @code{ERRNO}. @cindex @code{stat()} extension function -@item result = stat("/some/path", statdata @r{[}, follow@r{]}) +@item @code{result = stat("/some/path", statdata} [@code{, follow}]@code{)} The @code{stat()} function provides a hook into the @code{stat()} system call. It returns zero upon success or less than zero upon error. @@ -32175,8 +32224,8 @@ Not all systems support all file types. @end multitable @cindex @code{fts()} extension function -@item flags = or(FTS_PHYSICAL, ...) 
-@itemx result = fts(pathlist, flags, filedata) +@item @code{flags = or(FTS_PHYSICAL, ...)} +@itemx @code{result = fts(pathlist, flags, filedata)} Walk the file trees provided in @code{pathlist} and fill in the @code{filedata} array as described below. @code{flags} is the bitwise OR of several predefined constant values, also described below. @@ -32800,14 +32849,19 @@ See the project's web site for more information. @part Part IV:@* Appendices @end iftex -@ignore @ifdocbook -@part Part IV:@* Appendices +@part Appendices -Part IV provides the appendices, the Glossary, and two licenses that cover -the @command{gawk} source code and this @value{DOCUMENT}, respectively. -It contains the following appendices: +@ifclear FOR_PRINT +Part IV contains the appendixes (including the two licenses that cover +the @command{gawk} source code and this @value{DOCUMENT}, respectively) +and the Glossary: +@end ifclear + +@ifset FOR_PRINT +Part IV contains two appendixes: +@end ifset @itemize @bullet @item @@ -32816,6 +32870,7 @@ It contains the following appendices: @item @ref{Installation}. +@ifclear FOR_PRINT @item @ref{Notes}. @@ -32830,24 +32885,23 @@ It contains the following appendices: @item @ref{GNU Free Documentation License}. +@end ifclear @end itemize @end ifdocbook -@end ignore @node Language History @appendix The Evolution of the @command{awk} Language -This @value{DOCUMENT} describes the GNU implementation of @command{awk}, which follows -the POSIX specification. -Many long-time @command{awk} users learned @command{awk} programming -with the original @command{awk} implementation in Version 7 Unix. -(This implementation was the basis for @command{awk} in Berkeley Unix, -through 4.3-Reno. Subsequent versions of Berkeley Unix, and some systems -derived from 4.4BSD-Lite, use various versions of @command{gawk} -for their @command{awk}.) 
-This @value{CHAPTER} briefly describes the -evolution of the @command{awk} language, with cross-references to other parts -of the @value{DOCUMENT} where you can find more information. +This @value{DOCUMENT} describes the GNU implementation of @command{awk}, +which follows the POSIX specification. Many long-time @command{awk} +users learned @command{awk} programming with the original @command{awk} +implementation in Version 7 Unix. (This implementation was the basis for +@command{awk} in Berkeley Unix, through 4.3-Reno. Subsequent versions +of Berkeley Unix, and some systems derived from 4.4BSD-Lite, use various +versions of @command{gawk} for their @command{awk}.) This @value{CHAPTER} +briefly describes the evolution of the @command{awk} language, with +cross-references to other parts of the @value{DOCUMENT} where you can +find more information. @menu * V7/SVR3.1:: The major changes between V7 and System V @@ -39285,4 +39339,4 @@ which sorta sucks. TODO: ----- -1. Empty string vs. null string. 30 occurrences vs. 77, respectively. +2. Add back in docbook fixes for @r{}.