Diffstat (limited to 'doc/gawk.info')
-rw-r--r-- | doc/gawk.info | 3271 |
1 files changed, 1724 insertions, 1547 deletions
diff --git a/doc/gawk.info b/doc/gawk.info index 08ec4452..d83370e8 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -9,7 +9,7 @@ START-INFO-DIR-ENTRY * awk: (gawk)Invoking gawk. Text scanning and processing. END-INFO-DIR-ENTRY - Copyright (C) 1989, 1991, 1992, 1993, 1996-2005, 2007, 2009-2014 + Copyright (C) 1989, 1991, 1992, 1993, 1996-2005, 2007, 2009-2015 Free Software Foundation, Inc. @@ -37,7 +37,7 @@ General Introduction This file documents `awk', a program that you can use to select particular records in a file and perform operations upon them. - Copyright (C) 1989, 1991, 1992, 1993, 1996-2005, 2007, 2009-2014 + Copyright (C) 1989, 1991, 1992, 1993, 1996-2005, 2007, 2009-2015 Free Software Foundation, Inc. @@ -749,10 +749,10 @@ and associative arrays. Those looking for something new can try out The programs in this book make clear that an AWK program is typically much smaller and faster to develop than a counterpart written -in C. Consequently, there is often a payoff to prototype an algorithm -or design in AWK to get it running quickly and expose problems early. -Often, the interpreted performance is adequate and the AWK prototype -becomes the product. +in C. Consequently, there is often a payoff to prototyping an +algorithm or design in AWK to get it running quickly and expose +problems early. Often, the interpreted performance is adequate and the +AWK prototype becomes the product. The new `pgawk' (profiling `gawk'), produces program execution counts. I recently experimented with an algorithm that for n lines of @@ -776,16 +776,16 @@ Foreword to the Fourth Edition ****************************** Some things don't change. Thirteen years ago I wrote: "If you use AWK -or want to learn how, then read this book." True then and still true +or want to learn how, then read this book." True then, and still true today. - Learning to use a programming language is more than mastering the -syntax. 
One needs to acquire an understanding of how to use the + Learning to use a programming language is about more than mastering +the syntax. One needs to acquire an understanding of how to use the features of the language to solve practical programming problems. A focus of this book is many examples that show how to use AWK. Some things do change. Our computers are much faster and have more -memory. Consequently, speed and storage inefficiencies of a high level +memory. Consequently, speed and storage inefficiencies of a high-level language matter less. Prototyping in AWK and then rewriting in C for performance reasons happens less, because more often the prototype is fast enough. @@ -794,9 +794,9 @@ fast enough. C++. With `gawk' 4.1 and later, you do not have to choose between writing your program in AWK or in C/C++. You can write most of your program in AWK and the aspects that require C/C++ capabilities can be -written in C/C++ and then the pieces glued together when the `gawk' +written in C/C++, and then the pieces glued together when the `gawk' module loads the C/C++ module as a dynamic plug-in. *note Dynamic -Extensions::, has all the details, and as expected, many examples to +Extensions::, has all the details, and, as expected, many examples to help you learn the ins and outs. I enjoy programming in AWK and had fun (re)reading this book. I @@ -835,7 +835,7 @@ So most of the time, we don't distinguish between `gawk' and other * Validate data - * Produce indexes and perform other document preparation tasks + * Produce indexes and perform other document-preparation tasks * Experiment with algorithms that you can adapt later to other computer languages @@ -927,7 +927,7 @@ advice from Richard Stallman. John Woods contributed parts of the code as well. In 1988 and 1989, David Trueman, with help from me, thoroughly reworked `gawk' for compatibility with the newer `awk'. Circa 1994, I became the primary maintainer. 
Current development -focuses on bug fixes, performance improvements, standards compliance +focuses on bug fixes, performance improvements, standards compliance, and, occasionally, new features. In May 1997, Ju"rgen Kahrs felt the need for network access from @@ -939,10 +939,10 @@ the `gawk' distribution). His code finally became part of the main John Haque rewrote the `gawk' internals, in the process providing an `awk'-level debugger. This version became available as `gawk' version -4.0, in 2011. +4.0 in 2011. - *Note Contributors::, for a full list of those who made important -contributions to `gawk'. + *Note Contributors::, for a full list of those who have made +important contributions to `gawk'. File: gawk.info, Node: Names, Next: This Manual, Prev: History, Up: Preface @@ -955,7 +955,7 @@ provided in *note Language History::. The language described in this Info file is often referred to as "new `awk'." By analogy, the original version of `awk' is referred to as "old `awk'." - Today, on most systems, when you run the `awk' utility, you get some + Today, on most systems, when you run the `awk' utility you get some version of new `awk'.(1) If your system's standard `awk' is the old one, you will see something like this if you try the test program: @@ -1015,9 +1015,9 @@ in *note Sample Programs::, should be of interest. This Info file is split into several parts, as follows: - * Part I describes the `awk' language and `gawk' program in detail. - It starts with the basics, and continues through all of the - features of `awk'. It contains the following chapters: + * Part I describes the `awk' language and the `gawk' program in + detail. It starts with the basics, and continues through all of + the features of `awk'. It contains the following chapters: - *note Getting Started::, provides the essentials you need to know to begin using `awk'. @@ -1048,9 +1048,10 @@ in *note Sample Programs::, should be of interest. `gawk' use. 
- *note Arrays::, covers `awk''s one-and-only data structure: - associative arrays. Deleting array elements and whole arrays - is also described, as well as sorting arrays in `gawk'. It - also describes how `gawk' provides arrays of arrays. + the associative array. Deleting array elements and whole + arrays is described, as well as sorting arrays in `gawk'. + The major node also describes how `gawk' provides arrays of + arrays. - *note Functions::, describes the built-in functions `awk' and `gawk' provide, as well as how to define your own functions. @@ -1058,14 +1059,13 @@ in *note Sample Programs::, should be of interest. indirectly. * Part II shows how to use `awk' and `gawk' for problem solving. - There is lots of code here for you to read and learn from. It - contains the following chapters: + There is lots of code here for you to read and learn from. This + part contains the following chapters: - - *note Library Functions::, which provides a number of - functions meant to be used from main `awk' programs. + - *note Library Functions::, provides a number of functions + meant to be used from main `awk' programs. - - *note Sample Programs::, which provides many sample `awk' - programs. + - *note Sample Programs::, provides many sample `awk' programs. Reading these two chapters allows you to see `awk' solving real problems. @@ -1097,7 +1097,7 @@ in *note Sample Programs::, should be of interest. It contains the following appendices: - *note Language History::, describes how the `awk' language - has evolved since its first release to present. It also + has evolved since its first release to the present. It also describes how `gawk' has acquired features over time. - *note Installation::, describes how to get `gawk', how to @@ -1114,7 +1114,7 @@ in *note Sample Programs::, should be of interest. material for those who are completely unfamiliar with computer programming. 
- The *note Glossary::, defines most, if not all of, the + The *note Glossary::, defines most, if not all, of the significant terms used throughout the Info file. If you find terms that you aren't familiar with, try looking them up here. @@ -1143,8 +1143,8 @@ node briefly documents the typographical conventions used in Texinfo. common shell primary and secondary prompts, `$' and `>'. Input that you type is shown `like this'. Output from the command is preceded by the glyph "-|". This typically represents the command's standard -output. Error messages, and other output on the command's standard -error, are preceded by the glyph "error-->". For example: +output. Error messages and other output on the command's standard +error are preceded by the glyph "error-->". For example: $ echo hi on stdout -| hi on stdout @@ -1156,7 +1156,7 @@ particular, there are special characters called "control characters." These are characters that you type by holding down both the `CONTROL' key and another key, at the same time. For example, a `Ctrl-d' is typed by first pressing and holding the `CONTROL' key, next pressing the `d' -key and finally releasing both keys. +key, and finally releasing both keys. For the sake of brevity, throughout this Info file, we refer to Brian Kernighan's version of `awk' as "BWK `awk'." (*Note Other @@ -1172,7 +1172,7 @@ Dark Corners Until the POSIX standard (and `GAWK: Effective AWK Programming'), many features of `awk' were either poorly documented or not documented at all. Descriptions of such features (often called "dark corners") -are noted in this Info file with "(d.c.)". They also appear in the +are noted in this Info file with "(d.c.)." They also appear in the index under the heading "dark corner." But, as noted by the opening quote, any coverage of dark corners is @@ -1195,8 +1195,8 @@ editor. GNU Emacs is the most widely used version of Emacs today. 
The GNU(1) Project is an ongoing effort on the part of the Free Software Foundation to create a complete, freely distributable, -POSIX-compliant computing environment. The FSF uses the "GNU General -Public License" (GPL) to ensure that their software's source code is +POSIX-compliant computing environment. The FSF uses the GNU General +Public License (GPL) to ensure that its software's source code is always available to the end user. A copy of the GPL is included for your reference (*note Copying::). The GPL applies to the C language source code for `gawk'. To find out more about the FSF and the GNU @@ -1224,26 +1224,26 @@ original, "old" version of `awk'. I started working with that version in the fall of 1988. As work on it progressed, the FSF published several preliminary versions (numbered -0.X). In 1996, Edition 1.0 was released with `gawk' 3.0.0. The FSF +0.X). In 1996, edition 1.0 was released with `gawk' 3.0.0. The FSF published the first two editions under the title `The GNU Awk User's Guide'. This edition maintains the basic structure of the previous editions. For FSF edition 4.0, the content was thoroughly reviewed and updated. All references to `gawk' versions prior to 4.0 were removed. Of -significant note for that edition was *note Debugger::. +significant note for that edition was the addition of *note Debugger::. For FSF edition 4.1, the content has been reorganized into parts, and the major new additions are *note Arbitrary Precision Arithmetic::, and *note Dynamic Extensions::. This Info file will undoubtedly continue to evolve. If you find an -error in this Info file, please report it! *Note Bugs::, for +error in the Info file, please report it! *Note Bugs::, for information on submitting problem reports electronically. ---------- Footnotes ---------- - (1) GNU stands for "GNU's not Unix." + (1) GNU stands for "GNU's Not Unix." (2) The terminology "GNU/Linux" is explained in the *note Glossary::. 
@@ -1287,7 +1287,7 @@ acknowledgments: this manual. Jay Fenlason contributed many ideas and sample programs. Richard Mlynarik and Robert Chassell gave helpful comments on drafts of this manual. The paper `A Supplemental - Document for `awk'' by John W. Pierce of the Chemistry Department + Document for AWK' by John W. Pierce of the Chemistry Department at UC San Diego, pinpointed several issues relevant both to `awk' implementation and to this manual, that would otherwise have escaped us. @@ -1300,7 +1300,7 @@ GNU Project. acknowledgements: The following people (in alphabetical order) provided helpful - comments on various versions of this book, Rick Adams, Dr. Nelson + comments on various versions of this book: Rick Adams, Dr. Nelson H.F. Beebe, Karl Berry, Dr. Michael Brennan, Rich Burridge, Claire Cloutier, Diane Close, Scott Deifik, Christopher ("Topher") Eliot, Jeffrey Friedl, Dr. Darrel Hankerson, Michal Jaegermann, Dr. @@ -1309,7 +1309,7 @@ acknowledgements: Robert J. Chassell provided much valuable advice on the use of Texinfo. He also deserves special thanks for convincing me _not_ - to title this Info file `How To Gawk Politely'. Karl Berry helped + to title this Info file `How to Gawk Politely'. Karl Berry helped significantly with the TeX part of Texinfo. I would like to thank Marshall and Elaine Hartholz of Seattle and @@ -1349,18 +1349,18 @@ of people. *Note Contributors::, for the full list. Thanks to Michael Brennan for the Forewords. Thanks to Patrice Dumas for the new `makeinfo' program. Thanks to -Karl Berry who continues to work to keep the Texinfo markup language +Karl Berry, who continues to work to keep the Texinfo markup language sane. Robert P.J. Day, Michael Brennan, and Brian Kernighan kindly acted as reviewers for the 2015 edition of this Info file. Their feedback helped improve the final work. 
- I would like to thank Brian Kernighan for invaluable assistance -during the testing and debugging of `gawk', and for ongoing help and -advice in clarifying numerous points about the language. We could not -have done nearly as good a job on either `gawk' or its documentation -without his help. + I would also like to thank Brian Kernighan for his invaluable +assistance during the testing and debugging of `gawk', and for his +ongoing help and advice in clarifying numerous points about the +language. We could not have done nearly as good a job on either `gawk' +or its documentation without his help. Brian is in a class by himself as a programmer and technical author. I have to thank him (yet again) for his ongoing friendship and for @@ -1403,10 +1403,10 @@ contain "function definitions", an advanced feature that we will ignore for now; *note User-defined::). Each rule specifies one pattern to search for and one action to perform upon finding the pattern. - Syntactically, a rule consists of a pattern followed by an action. -The action is enclosed in braces to separate it from the pattern. -Newlines usually separate rules. Therefore, an `awk' program looks -like this: + Syntactically, a rule consists of a "pattern" followed by an +"action". The action is enclosed in braces to separate it from the +pattern. Newlines usually separate rules. Therefore, an `awk' program +looks like this: PATTERN { ACTION } PATTERN { ACTION } @@ -1474,7 +1474,7 @@ program as the first argument of the `awk' command, like this: awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ... -where PROGRAM consists of a series of PATTERNS and ACTIONS, as +where PROGRAM consists of a series of patterns and actions, as described earlier. This command format instructs the "shell", or command interpreter, @@ -1489,8 +1489,8 @@ programs from shell scripts, because it avoids the need for a separate file for the `awk' program. A self-contained shell script is more reliable because there are no other files to misplace. 
- Later in this chapter, *note Very Simple::, presents several short, -self-contained programs. + Later in this chapter, in *note Very Simple::, we'll see examples of +several short, self-contained programs. File: gawk.info, Node: Read Terminal, Next: Long, Prev: One-shot, Up: Running gawk @@ -1505,7 +1505,7 @@ following command line: `awk' applies the PROGRAM to the "standard input", which usually means whatever you type on the keyboard. This continues until you indicate -end-of-file by typing `Ctrl-d'. (On other operating systems, the +end-of-file by typing `Ctrl-d'. (On non-POSIX operating systems, the end-of-file character may be different. For example, on OS/2, it is `Ctrl-z'.) @@ -1580,10 +1580,9 @@ that are provided on the `awk' command line. (Also, placing the program in a file allows us to use a literal single quote in the program text, instead of the magic `\47'.) - If you want to clearly identify your `awk' program files as such, -you can add the extension `.awk' to the file name. This doesn't affect -the execution of the `awk' program but it does make "housekeeping" -easier. + If you want to clearly identify an `awk' program file as such, you +can add the extension `.awk' to the file name. This doesn't affect the +execution of the `awk' program but it does make "housekeeping" easier. File: gawk.info, Node: Executable Scripts, Next: Comments, Prev: Long, Up: Running gawk @@ -1712,7 +1711,7 @@ at a later time. File: gawk.info, Node: Quoting, Prev: Comments, Up: Running gawk -1.1.6 Shell-Quoting Issues +1.1.6 Shell Quoting Issues -------------------------- * Menu: @@ -1807,7 +1806,7 @@ shell quoting tricks, like this: -| Here is a single quote <'> This program consists of three concatenated quoted strings. The first -and the third are single quoted, the second is double quoted. +and the third are single-quoted, and the second is double-quoted. 
This can be "simplified" to: @@ -1834,8 +1833,7 @@ like so: $ awk 'BEGIN { print "Here is a double quote <\42>" }' -| Here is a double quote <"> -This works nicely, except that you should comment clearly what the -escapes mean. +This works nicely, but you should comment clearly what the escapes mean. A fourth option is to use command-line variable assignment, like this: @@ -1844,11 +1842,11 @@ this: -| Here is a single quote <'> (Here, the two string constants and the value of `sq' are -concatenated into a single string which is printed by `print'.) +concatenated into a single string that is printed by `print'.) If you really need both single and double quotes in your `awk' program, it is probably best to move it into a separate file, where the -shell won't be part of the picture, and you can say what you mean. +shell won't be part of the picture and you can say what you mean. File: gawk.info, Node: DOS Quoting, Up: Quoting @@ -1906,7 +1904,7 @@ of green crates shipped, the number of red boxes shipped, the number of orange bags shipped, and the number of blue packages shipped, respectively. There are 16 entries, covering the 12 months of last year and the first four months of the current year. An empty line separates -the data for the two years. +the data for the two years: Jan 13 25 15 115 Feb 15 32 24 226 @@ -1938,7 +1936,7 @@ File: gawk.info, Node: Very Simple, Next: Two Rules, Prev: Sample Data Files, The following command runs a simple `awk' program that searches the input file `mail-list' for the character string `li' (a grouping of characters is usually called a "string"; the term "string" is based on -similar usage in English, such as "a string of pearls," or "a string of +similar usage in English, such as "a string of pearls" or "a string of cars in a train"): awk '/li/ { print $0 }' mail-list @@ -1974,24 +1972,25 @@ prints all lines matching the pattern `li'. 
By comparison, omitting the `print' statement but retaining the braces makes an empty action that does nothing (i.e., no lines are printed). - Many practical `awk' programs are just a line or two. Following is a -collection of useful, short programs to get you started. Some of these -programs contain constructs that haven't been covered yet. (The -description of the program will give you a good idea of what is going -on, but you'll need to read the rest of the Info file to become an -`awk' expert!) Most of the examples use a data file named `data'. -This is just a placeholder; if you use these programs yourself, -substitute your own file names for `data'. For future reference, note -that there is often more than one way to do things in `awk'. At some -point, you may want to look back at these examples and see if you can -come up with different ways to do the same things shown here: + Many practical `awk' programs are just a line or two long. +Following is a collection of useful, short programs to get you started. +Some of these programs contain constructs that haven't been covered +yet. (The description of the program will give you a good idea of what +is going on, but you'll need to read the rest of the Info file to +become an `awk' expert!) Most of the examples use a data file named +`data'. This is just a placeholder; if you use these programs +yourself, substitute your own file names for `data'. For future +reference, note that there is often more than one way to do things in +`awk'. At some point, you may want to look back at these examples and +see if you can come up with different ways to do the same things shown +here: * Print every line that is longer than 80 characters: awk 'length($0) > 80' data - The sole rule has a relational expression as its pattern and it - has no action--so it uses the default action, printing the record. 
+ The sole rule has a relational expression as its pattern and has no + action--so it uses the default action, printing the record. * Print the length of the longest input line: @@ -2046,8 +2045,8 @@ come up with different ways to do the same things shown here: awk 'NR % 2 == 0' data - If you use the expression `NR % 2 == 1' instead, the program would - print the odd-numbered lines. + If you used the expression `NR % 2 == 1' instead, the program + would print the odd-numbered lines. File: gawk.info, Node: Two Rules, Next: More Complex, Prev: Very Simple, Up: Getting Started @@ -2280,9 +2279,10 @@ built-in functions for working with timestamps, performing bit manipulation, for runtime string translation (internationalization), determining the type of a variable, and array sorting. - As we develop our presentation of the `awk' language, we introduce -most of the variables and many of the functions. They are described -systematically in *note Built-in Variables::, and in *note Built-in::. + As we develop our presentation of the `awk' language, we will +introduce most of the variables and many of the functions. They are +described systematically in *note Built-in Variables::, and in *note +Built-in::. File: gawk.info, Node: When, Next: Intro Summary, Prev: Other Features, Up: Getting Started @@ -2347,7 +2347,7 @@ File: gawk.info, Node: Intro Summary, Prev: When, Up: Getting Started * You may use backslash continuation to continue a source line. Lines are automatically continued after a comma, open brace, - question mark, colon, `||', `&&', `do' and `else'. + question mark, colon, `||', `&&', `do', and `else'. File: gawk.info, Node: Invoking Gawk, Next: Regexp, Prev: Getting Started, Up: Top @@ -2414,8 +2414,8 @@ File: gawk.info, Node: Options, Next: Other Arguments, Prev: Command Line, U Options begin with a dash and consist of a single character. GNU-style long options consist of two dashes and a keyword. 
The keyword can be abbreviated, as long as the abbreviation allows the option to be -uniquely identified. If the option takes an argument, then the keyword -is either immediately followed by an equals sign (`=') and the +uniquely identified. If the option takes an argument, either the +keyword is immediately followed by an equals sign (`=') and the argument's value, or the keyword and the argument's value are separated by whitespace. If a particular option with a value is given more than once, it is the last value that counts. @@ -2430,10 +2430,10 @@ The following list describes options mandated by the POSIX standard: `-f SOURCE-FILE' `--file SOURCE-FILE' - Read `awk' program source from SOURCE-FILE instead of in the first - nonoption argument. This option may be given multiple times; the - `awk' program consists of the concatenation of the contents of - each specified SOURCE-FILE. + Read the `awk' program source from SOURCE-FILE instead of in the + first nonoption argument. This option may be given multiple + times; the `awk' program consists of the concatenation of the + contents of each specified SOURCE-FILE. `-v VAR=VAL' `--assign VAR=VAL' @@ -2474,7 +2474,7 @@ The following list describes options mandated by the POSIX standard: `-b' `--characters-as-bytes' Cause `gawk' to treat all input data as single-byte characters. - In addition, all output written with `print' or `printf' are + In addition, all output written with `print' or `printf' is treated as single-byte characters. Normally, `gawk' follows the POSIX standard and attempts to process @@ -2482,7 +2482,7 @@ The following list describes options mandated by the POSIX standard: This can often involve converting multibyte characters into wide characters (internally), and can lead to problems or confusion if the input data does not contain valid multibyte characters. This - option is an easy way to tell `gawk': "hands off my data!". + option is an easy way to tell `gawk', "Hands off my data!" 
`-c' `--traditional' @@ -2517,7 +2517,7 @@ The following list describes options mandated by the POSIX standard: default, the debugger reads commands interactively from the keyboard (standard input). The optional FILE argument allows you to specify a file with a list of commands for the debugger to - execute non-interactively. No space is allowed between the `-D' + execute noninteractively. No space is allowed between the `-D' and FILE, if FILE is supplied. `-e' PROGRAM-TEXT @@ -2552,23 +2552,23 @@ The following list describes options mandated by the POSIX standard: `-g' `--gen-pot' - Analyze the source program and generate a GNU `gettext' Portable - Object Template file on standard output for all string constants + Analyze the source program and generate a GNU `gettext' portable + object template file on standard output for all string constants that have been marked for translation. *Note Internationalization::, for information about this option. `-h' `--help' - Print a "usage" message summarizing the short and long style + Print a "usage" message summarizing the short- and long-style options that `gawk' accepts and then exit. `-i' SOURCE-FILE `--include' SOURCE-FILE Read an `awk' source library from SOURCE-FILE. This option is completely equivalent to using the `@include' directive inside - your program. This option is very similar to the `-f' option, but - there are two important differences. First, when `-i' is used, - the program source is not loaded if it has been previously loaded, + your program. It is very similar to the `-f' option, but there + are two important differences. First, when `-i' is used, the + program source is not loaded if it has been previously loaded, whereas with `-f', `gawk' always loads the file. Second, because this option is intended to be used with code libraries, `gawk' does not recognize such files as constituting main program input. 
@@ -2630,7 +2630,7 @@ The following list describes options mandated by the POSIX standard: `-o'[FILE] `--pretty-print'[`='FILE] - Enable pretty-printing of `awk' programs. By default, output + Enable pretty-printing of `awk' programs. By default, the output program is created in a file named `awkprof.out' (*note Profiling::). The optional FILE argument allows you to specify a different file name for the output. No space is allowed between @@ -2736,7 +2736,7 @@ input as a source of data.) Because it is clumsy using the standard `awk' mechanisms to mix source file and command-line `awk' programs, `gawk' provides the `-e' -option. This does not require you to pre-empt the standard input for +option. This does not require you to preempt the standard input for your source code; it allows you to easily mix command-line and library source code (*note AWKPATH Variable::). As with `-f', the `-e' and `-i' options may also be used multiple times on the command line. @@ -2895,7 +2895,7 @@ implementations, you must supply a precise pathname for each program file, unless the file is in the current directory. But with `gawk', if the file name supplied to the `-f' or `-i' options does not contain a directory separator `/', then `gawk' searches a list of directories -(called the "search path"), one by one, looking for a file with the +(called the "search path") one by one, looking for a file with the specified name. The search path is a string consisting of directory names separated by @@ -2928,9 +2928,9 @@ or by placing two colons next to each other [`::'].) Different past versions of `gawk' would also look explicitly in the current directory, either before or after the path search. As - of version 4.1.2, this no longer happens, and if you wish to look - in the current directory, you must include `.' either as a separate - entry, or as a null entry in the search path. + of version 4.1.2, this no longer happens; if you wish to look in + the current directory, you must include `.' 
either as a separate + entry or as a null entry in the search path. The default value for `AWKPATH' is `.:/usr/local/share/awk'.(2) Since `.' is included at the beginning, `gawk' searches first in the @@ -3042,7 +3042,7 @@ change. The variables are: If this variable exists, `gawk' includes the file name and line number within the `gawk' source code from which warning and/or fatal messages are generated. Its purpose is to help isolate the - source of a message, as there are multiple places which produce the + source of a message, as there are multiple places that produce the same warning or error message. `GAWK_NO_DFA' @@ -3058,16 +3058,16 @@ change. The variables are: evaluation stack, when needed. `INT_CHAIN_MAX' - The intended maximum number of items `gawk' will maintain on a - hash chain for managing arrays indexed by integers. + This specifies intended maximum number of items `gawk' will + maintain on a hash chain for managing arrays indexed by integers. `STR_CHAIN_MAX' - The intended maximum number of items `gawk' will maintain on a - hash chain for managing arrays indexed by strings. + This specifies intended maximum number of items `gawk' will + maintain on a hash chain for managing arrays indexed by strings. `TIDYMEM' If this variable exists, `gawk' uses the `mtrace()' library calls - from GNU LIBC to help track down possible memory leaks. + from the GNU C library to help track down possible memory leaks. File: gawk.info, Node: Exit Status, Next: Include Files, Prev: Environment Variables, Up: Invoking Gawk @@ -3099,11 +3099,11 @@ This minor node describes a feature that is specific to `gawk'. files. This gives you the ability to split large `awk' source files into smaller, more manageable pieces, and also lets you reuse common `awk' code from various `awk' scripts. In other words, you can group -together `awk' functions, used to carry out specific tasks, into -external files. 
These files can be used just like function libraries, -using the `@include' keyword in conjunction with the `AWKPATH' -environment variable. Note that source files may also be included -using the `-i' option. +together `awk' functions used to carry out specific tasks into external +files. These files can be used just like function libraries, using the +`@include' keyword in conjunction with the `AWKPATH' environment +variable. Note that source files may also be included using the `-i' +option. Let's see an example. We'll start with two (trivial) `awk' scripts, namely `test1' and `test2'. Here is the `test1' script: @@ -3165,11 +3165,11 @@ Variable::) apply to `@include' also. This is very helpful in constructing `gawk' function libraries. If you have a large script with useful, general-purpose `awk' functions, you can break it down into library files and put those files in a -special directory. You can then include those "libraries," using -either the full pathnames of the files, or by setting the `AWKPATH' +special directory. You can then include those "libraries," either by +using the full pathnames of the files, or by setting the `AWKPATH' environment variable accordingly and then using `@include' with just -the file part of the full pathname. Of course, you can have more than -one directory to keep library files; the more complex the working +the file part of the full pathname. Of course, you can keep library +files in more than one directory; the more complex the working environment is, the more directories you may need to organize the files to be included. @@ -3181,8 +3181,8 @@ particular, `@include' is very useful for writing CGI scripts to be run from web pages. As mentioned in *note AWKPATH Variable::, the current directory is -always searched first for source files, before searching in `AWKPATH', -and this also applies to files named with `@include'. 
+always searched first for source files, before searching in `AWKPATH'; +this also applies to files named with `@include'. File: gawk.info, Node: Loading Shared Libraries, Next: Obsolete, Prev: Include Files, Up: Invoking Gawk @@ -3227,8 +3227,8 @@ File: gawk.info, Node: Obsolete, Next: Undocumented, Prev: Loading Shared Lib ==================================== This minor node describes features and/or command-line options from -previous releases of `gawk' that are either not available in the -current version or that are still supported but deprecated (meaning that +previous releases of `gawk' that either are not available in the +current version or are still supported but deprecated (meaning that they will _not_ be in the next release). The process-related special files `/dev/pid', `/dev/ppid', @@ -3256,7 +3256,7 @@ File: gawk.info, Node: Invoking Summary, Prev: Undocumented, Up: Invoking Gaw run `awk'. * The three standard options for all versions of `awk' are `-f', - `-F' and `-v'. `gawk' supplies these and many others, as well as + `-F', and `-v'. `gawk' supplies these and many others, as well as corresponding GNU-style long options. * Nonoption command-line arguments are usually treated as file names, @@ -3286,7 +3286,7 @@ File: gawk.info, Node: Invoking Summary, Prev: Undocumented, Up: Invoking Gaw * `gawk' allows you to load additional functions written in C or C++ using the `@load' statement and/or the `-l' option. (This - advanced feature is described later on in *note Dynamic + advanced feature is described later, in *note Dynamic Extensions::.) @@ -3435,7 +3435,7 @@ sequences apply to both string constants and regexp constants: Horizontal TAB, `Ctrl-i', ASCII code 9 (HT). `\v' - Vertical tab, `Ctrl-k', ASCII code 11 (VT). + Vertical TAB, `Ctrl-k', ASCII code 11 (VT). `\NNN' The octal value NNN, where NNN stands for 1 to 3 digits between @@ -3485,7 +3485,7 @@ normally be a regexp operator. For example, `/a\+b/' matches the three characters `a+b'. 
For complete portability, do not use a backslash before any -character not shown in the previous list and that is not an operator. +character not shown in the previous list or that is not an operator. Backslash Before Regular Characters @@ -3547,7 +3547,7 @@ and converted into corresponding real characters as the very first step in processing regexps. Here is a list of metacharacters. All characters that are not escape -sequences and that are not listed in the following stand for themselves: +sequences and that are not listed here stand for themselves: `\' This suppresses the special meaning of a character when matching. @@ -3630,7 +3630,7 @@ sequences and that are not listed in the following stand for themselves: There are two subtle points to understand about how `*' works. First, the `*' applies only to the single preceding regular expression component (e.g., in `ph*', it applies just to the `h'). - To cause `*' to apply to a larger sub-expression, use parentheses: + To cause `*' to apply to a larger subexpression, use parentheses: `(ph)*' matches `ph', `phph', `phphph', and so on. Second, `*' finds as many repetitions as possible. If the text to @@ -3661,10 +3661,10 @@ sequences and that are not listed in the following stand for themselves: Matches `whhhy', but not `why' or `whhhhy'. `wh{3,5}y' - Matches `whhhy', `whhhhy', or `whhhhhy', only. + Matches `whhhy', `whhhhy', or `whhhhhy' only. `wh{2,}y' - Matches `whhy' or `whhhy', and so on. + Matches `whhy', `whhhy', and so on. Interval expressions were not traditionally available in `awk'. 
They were added as part of the POSIX standard to make `awk' and @@ -3766,7 +3766,7 @@ Class Meaning `[:print:]' Printable characters (characters that are not control characters) `[:punct:]' Punctuation characters (characters that are not letters, - digits control characters, or space characters) + digits, control characters, or space characters) `[:space:]' Space characters (such as space, TAB, and formfeed, to name a few) `[:upper:]' Uppercase alphabetic characters @@ -3804,8 +3804,9 @@ Collating symbols Equivalence classes Locale-specific names for a list of characters that are equal. The name is enclosed between `[=' and `=]'. For example, the name `e' - might be used to represent all of "e," "e`," and "e'." In this - case, `[[=e=]]' is a regexp that matches any of `e', `e'', or `e`'. + might be used to represent all of "e," "e^," "e`," and "e'." In + this case, `[[=e=]]' is a regexp that matches any of `e', `e^', + `e'', or `e`'. These features are very valuable in non-English-speaking locales. @@ -3827,7 +3828,7 @@ Consider the following: This example uses the `sub()' function to make a change to the input record. (`sub()' replaces the first instance of any text matched by the first argument with the string provided as the second argument; -*note String Functions::). Here, the regexp `/a+/' indicates "one or +*note String Functions::.) Here, the regexp `/a+/' indicates "one or more `a' characters," and the replacement text is `<A>'. The input contains four `a' characters. `awk' (and POSIX) regular @@ -3864,15 +3865,16 @@ regexp": This sets `digits_regexp' to a regexp that describes one or more digits, and tests whether the input record matches this regexp. - NOTE: When using the `~' and `!~' operators, there is a difference - between a regexp constant enclosed in slashes and a string - constant enclosed in double quotes. 
If you are going to use a - string constant, you have to understand that the string is, in - essence, scanned _twice_: the first time when `awk' reads your + NOTE: When using the `~' and `!~' operators, be aware that there + is a difference between a regexp constant enclosed in slashes and + a string constant enclosed in double quotes. If you are going to + use a string constant, you have to understand that the string is, + in essence, scanned _twice_: the first time when `awk' reads your program, and the second time when it goes to match the string on the lefthand side of the operator with the pattern on the right. This is true of any string-valued expression (such as - `digits_regexp', shown previously), not just string constants. + `digits_regexp', shown in the previous example), not just string + constants. What difference does it make if the string is scanned twice? The answer has to do with escape sequences, and particularly with @@ -3969,7 +3971,7 @@ letters, digits, or underscores (`_'): `\B' Matches the empty string that occurs between two word-constituent - characters. For example, `/\Brat\B/' matches `crate' but it does + characters. For example, `/\Brat\B/' matches `crate', but it does not match `dirty rat'. `\B' is essentially the opposite of `\y'. There are two other operators that work on buffers. In Emacs, a @@ -3978,10 +3980,10 @@ letters, digits, or underscores (`_'): operators are: `\`' - Matches the empty string at the beginning of a buffer (string). + Matches the empty string at the beginning of a buffer (string) `\'' - Matches the empty string at the end of a buffer (string). + Matches the empty string at the end of a buffer (string) Because `^' and `$' always work in terms of the beginning and end of strings, these operators don't add any new capabilities for `awk'. @@ -4152,7 +4154,7 @@ one line. Each record is automatically split into chunks called parts of a record. On rare occasions, you may need to use the `getline' command. 
The -`getline' command is valuable, both because it can do explicit input +`getline' command is valuable both because it can do explicit input from any number of files, and because the files used with it do not have to be named on the `awk' command line (*note Getline::). @@ -4201,8 +4203,8 @@ File: gawk.info, Node: awk split records, Next: gawk split records, Up: Recor Records are separated by a character called the "record separator". By default, the record separator is the newline character. This is why -records are, by default, single lines. A different character can be -used for the record separator by assigning the character to the +records are, by default, single lines. To use a different character +for the record separator, simply assign that character to the predefined variable `RS'. Like any other variable, the value of `RS' can be changed in the @@ -4217,14 +4219,14 @@ BEGIN/END::). For example: awk 'BEGIN { RS = "u" } { print $0 }' mail-list -changes the value of `RS' to `u', before reading any input. This is a -string whose first character is the letter "u"; as a result, records -are separated by the letter "u." Then the input file is read, and the -second rule in the `awk' program (the action with no pattern) prints -each record. Because each `print' statement adds a newline at the end -of its output, this `awk' program copies the input with each `u' -changed to a newline. Here are the results of running the program on -`mail-list': +changes the value of `RS' to `u', before reading any input. The new +value is a string whose first character is the letter "u"; as a result, +records are separated by the letter "u". Then the input file is read, +and the second rule in the `awk' program (the action with no pattern) +prints each record. Because each `print' statement adds a newline at +the end of its output, this `awk' program copies the input with each +`u' changed to a newline. 
Here are the results of running the program +on `mail-list': $ awk 'BEGIN { RS = "u" } > { print $0 }' mail-list @@ -4272,11 +4274,11 @@ data file (*note Sample Data Files::), the line looks like this: Bill 555-1675 bill.drowning@hotmail.com A -It contains no `u' so there is no reason to split the record, unlike -the others which have one or more occurrences of the `u'. In fact, -this record is treated as part of the previous record; the newline -separating them in the output is the original newline in the data file, -not the one added by `awk' when it printed the record! +It contains no `u', so there is no reason to split the record, unlike +the others, which each have one or more occurrences of the `u'. In +fact, this record is treated as part of the previous record; the +newline separating them in the output is the original newline in the +data file, not the one added by `awk' when it printed the record! Another way to change the record separator is on the command line, using the variable-assignment feature (*note Other Arguments::): @@ -4342,8 +4344,8 @@ part of either record. character. However, when `RS' is a regular expression, `RT' contains the actual input text that matched the regular expression. - If the input file ended without any text that matches `RS', `gawk' -sets `RT' to the null string. + If the input file ends without any text matching `RS', `gawk' sets +`RT' to the null string. The following example illustrates both of these features. It sets `RS' equal to a regular expression that matches either a newline or a @@ -4441,12 +4443,12 @@ to these pieces of the record. You don't have to use them--you can operate on the whole record if you want--but fields are what make simple `awk' programs so powerful. - You use a dollar-sign (`$') to refer to a field in an `awk' program, + You use a dollar sign (`$') to refer to a field in an `awk' program, followed by the number of the field you want. 
Thus, `$1' refers to the -first field, `$2' to the second, and so on. (Unlike the Unix shells, -the field numbers are not limited to single digits. `$127' is the -127th field in the record.) For example, suppose the following is a -line of input: +first field, `$2' to the second, and so on. (Unlike in the Unix +shells, the field numbers are not limited to single digits. `$127' is +the 127th field in the record.) For example, suppose the following is +a line of input: This seems like a pretty nice example. @@ -4463,10 +4465,9 @@ as `$7', which is `example.'. If you try to reference a field beyond the last one (such as `$8' when the record has only seven fields), you get the empty string. (If used in a numeric operation, you get zero.) - The use of `$0', which looks like a reference to the "zero-th" -field, is a special case: it represents the whole input record. Use it -when you are not interested in specific fields. Here are some more -examples: + The use of `$0', which looks like a reference to the "zeroth" field, +is a special case: it represents the whole input record. Use it when +you are not interested in specific fields. Here are some more examples: $ awk '$1 ~ /li/ { print $0 }' mail-list -| Amelia 555-5553 amelia.zodiacusque@gmail.com F @@ -4514,8 +4515,8 @@ is another example of using expressions as field numbers: awk '{ print $(2*2) }' mail-list `awk' evaluates the expression `(2*2)' and uses its value as the -number of the field to print. The `*' sign represents multiplication, -so the expression `2*2' evaluates to four. The parentheses are used so +number of the field to print. The `*' represents multiplication, so +the expression `2*2' evaluates to four. The parentheses are used so that the multiplication is done before the `$' operation; they are necessary whenever there is a binary operator(1) in the field-number expression. This example, then, prints the type of relationship (the @@ -4539,7 +4540,7 @@ field number. 
---------- Footnotes ---------- (1) A "binary operator", such as `*' for multiplication, is one that -takes two operands. The distinction is required, because `awk' also has +takes two operands. The distinction is required because `awk' also has unary (one-operand) and ternary (three-operand) operators. @@ -4661,7 +4662,7 @@ value of `NF' and recomputes `$0'. (d.c.) Here is an example: decremented. Finally, there are times when it is convenient to force `awk' to -rebuild the entire record, using the current value of the fields and +rebuild the entire record, using the current values of the fields and `OFS'. To do this, use the seemingly innocuous assignment: $1 = $1 # force record to be reconstituted @@ -4681,7 +4682,7 @@ built-in function that updates `$0', such as `sub()' and `gsub()' It is important to remember that `$0' is the _full_ record, exactly as it was read from the input. This includes any leading or trailing whitespace, and the exact whitespace (or other characters) that -separate the fields. +separates the fields. It is a common error to try to change the field separators in a record simply by setting `FS' and `OFS', and then expecting a plain @@ -4749,7 +4750,7 @@ attached, such as: John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139 -The same program would extract `*LXIX', instead of `*29*Oak*St.'. If +The same program would extract `*LXIX' instead of `*29*Oak*St.'. If you were expecting the program to print the address, you would be surprised. The moral is to choose your data layout and separator characters carefully to prevent such problems. (If the data is not in @@ -4948,11 +4949,11 @@ your field and record separators. Perhaps the most common use of a single character as the field separator occurs when processing the Unix system password file. On many Unix systems, each user has a separate entry in the system -password file, one line per user. The information in these lines is -separated by colons. 
The first field is the user's login name and the -second is the user's encrypted or shadow password. (A shadow password -is indicated by the presence of a single `x' in the second field.) A -password file entry might look like this: +password file, with one line per user. The information in these lines +is separated by colons. The first field is the user's login name and +the second is the user's encrypted or shadow password. (A shadow +password is indicated by the presence of a single `x' in the second +field.) A password file entry might look like this: arnold:x:2076:10:Arnold Robbins:/home/arnold:/bin/bash @@ -4980,15 +4981,14 @@ When you do this, `$1' is the same as `$0'. According to the POSIX standard, `awk' is supposed to behave as if each record is split into fields at the time it is read. In particular, this means that if you change the value of `FS' after a -record is read, the value of the fields (i.e., how they were split) +record is read, the values of the fields (i.e., how they were split) should reflect the old value of `FS', not the new one. However, many older implementations of `awk' do not work this way. Instead, they defer splitting the fields until a field is actually referenced. The fields are split using the _current_ value of `FS'! (d.c.) This behavior can be difficult to diagnose. The following -example illustrates the difference between the two methods. (The -`sed'(2) command prints just the first line of `/etc/passwd'.) +example illustrates the difference between the two methods: sed 1q /etc/passwd | awk '{ FS = ":" ; print $1 }' @@ -5001,6 +5001,8 @@ first line of the file, something like: root:x:0:0:Root:/: + (The `sed'(2) command prints just the first line of `/etc/passwd'.) + ---------- Footnotes ---------- (1) Thanks to Andrew Schorr for this tip. @@ -5154,7 +5156,7 @@ run on a system with card readers is another story!) splitting again. Use `FS = FS' to make this happen, without having to know the current value of `FS'. 
In order to tell which kind of field splitting is in effect, use `PROCINFO["FS"]' (*note Auto-set::). The -value is `"FS"' if regular field splitting is being used, or it is +value is `"FS"' if regular field splitting is being used, or `"FIELDWIDTHS"' if fixed-width field splitting is being used: if (PROCINFO["FS"] == "FS") @@ -5187,10 +5189,10 @@ what they are, and not by what they are not. The most notorious such case is so-called "comma-separated values" (CSV) data. Many spreadsheet programs, for example, can export their data into text files, where each record is terminated with a newline, -and fields are separated by commas. If only commas separated the data, +and fields are separated by commas. If commas only separated the data, there wouldn't be an issue. The problem comes when one of the fields contains an _embedded_ comma. In such cases, most programs embed the -field in double quotes.(1) So we might have data like this: +field in double quotes.(1) So, we might have data like this: Robbins,Arnold,"1234 A Pretty Street, NE",MyTown,MyState,12345-6789,USA @@ -5257,9 +5259,9 @@ being used. provides an elegant solution for the majority of cases, and the `gawk' developers are satisfied with that. - As written, the regexp used for `FPAT' requires that each field have -a least one character. A straightforward modification (changing -changed the first `+' to `*') allows fields to be empty: + As written, the regexp used for `FPAT' requires that each field +contain at least one character. A straightforward modification +(changing the first `+' to `*') allows fields to be empty: FPAT = "([^,]*)|(\"[^\"]+\")" @@ -5267,9 +5269,8 @@ changed the first `+' to `*') allows fields to be empty: available for splitting regular strings (*note String Functions::). To recap, `gawk' provides three independent methods to split input -records into fields. 
`gawk' uses whichever mechanism was last chosen -based on which of the three variables--`FS', `FIELDWIDTHS', and -`FPAT'--was last assigned to. +records into fields. The mechanism used is based on which of the three +variables--`FS', `FIELDWIDTHS', or `FPAT'--was last assigned to. ---------- Footnotes ---------- @@ -5307,7 +5308,7 @@ empty; lines that contain only whitespace do not count.) `"\n\n+"' to `RS'. This regexp matches the newline at the end of the record and one or more blank lines after the record. In addition, a regular expression always matches the longest possible sequence when -there is a choice (*note Leftmost Longest::). So the next record +there is a choice (*note Leftmost Longest::). So, the next record doesn't start until the first nonblank line that follows--no matter how many blank lines appear in a row, they are considered one record separator. @@ -5319,12 +5320,12 @@ last record, the final newline is removed from the record. In the second case, this special processing is not done. (d.c.) Now that the input is separated into records, the second step is to -separate the fields in the record. One way to do this is to divide each -of the lines into fields in the normal manner. This happens by default -as the result of a special feature. When `RS' is set to the empty -string, _and_ `FS' is set to a single character, the newline character -_always_ acts as a field separator. This is in addition to whatever -field separations result from `FS'.(1) +separate the fields in the records. One way to do this is to divide +each of the lines into fields in the normal manner. This happens by +default as the result of a special feature. When `RS' is set to the +empty string _and_ `FS' is set to a single character, the newline +character _always_ acts as a field separator. 
This is in addition to +whatever field separations result from `FS'.(1) The original motivation for this special exception was probably to provide useful behavior in the default case (i.e., `FS' is equal to @@ -5332,17 +5333,17 @@ provide useful behavior in the default case (i.e., `FS' is equal to newline character to separate fields, because there is no way to prevent it. However, you can work around this by using the `split()' function to break up the record manually (*note String Functions::). -If you have a single character field separator, you can work around the +If you have a single-character field separator, you can work around the special feature in a different way, by making `FS' into a regexp for that single character. For example, if the field separator is a percent character, instead of `FS = "%"', use `FS = "[%]"'. Another way to separate fields is to put each field on a separate line: to do this, just set the variable `FS' to the string `"\n"'. -(This single character separator matches a single newline.) A +(This single-character separator matches a single newline.) A practical example of a data file organized this way might be a mailing -list, where each entry is separated by blank lines. Consider a mailing -list in a file named `addresses', which looks like this: +list, where blank lines separate the entries. Consider a mailing list +in a file named `addresses', which looks like this: Jane Doe 123 Main Street @@ -5425,7 +5426,7 @@ File: gawk.info, Node: Getline, Next: Read Timeout, Prev: Multiple Line, Up: So far we have been getting our input data from `awk''s main input stream--either the standard input (usually your keyboard, sometimes the -output from another program) or from the files specified on the command +output from another program) or the files specified on the command line. The `awk' language has a special built-in command called `getline' that can be used to read input under your explicit control. 
@@ -5563,7 +5564,7 @@ and produces these results: free The `getline' command used in this way sets only the variables `NR', -`FNR', and `RT' (and of course, VAR). The record is not split into +`FNR', and `RT' (and, of course, VAR). The record is not split into fields, so the values of the fields (including `$0') and the value of `NF' do not change. @@ -5573,8 +5574,8 @@ File: gawk.info, Node: Getline/File, Next: Getline/Variable/File, Prev: Getli 4.9.3 Using `getline' from a File --------------------------------- -Use `getline < FILE' to read the next record from FILE. Here FILE is a -string-valued expression that specifies the file name. `< FILE' is +Use `getline < FILE' to read the next record from FILE. Here, FILE is +a string-valued expression that specifies the file name. `< FILE' is called a "redirection" because it directs input to come from a different place. For example, the following program reads its input record from the file `secondary.input' when it encounters a first field @@ -5710,8 +5711,8 @@ all `awk' implementations. treatment of a construct like `"echo " "date" | getline'. Most versions, including the current version, treat it at as `("echo " "date") | getline'. (This is also how BWK `awk' behaves.) Some - versions changed and treated it as `"echo " ("date" | getline)'. - (This is how `mawk' behaves.) In short, _always_ use explicit + versions instead treat it as `"echo " ("date" | getline)'. (This + is how `mawk' behaves.) In short, _always_ use explicit parentheses, and then you won't have to worry. @@ -5747,15 +5748,16 @@ File: gawk.info, Node: Getline/Coprocess, Next: Getline/Variable/Coprocess, P 4.9.7 Using `getline' from a Coprocess -------------------------------------- -Input into `getline' from a pipe is a one-way operation. The command -that is started with `COMMAND | getline' only sends data _to_ your -`awk' program. +Reading input into `getline' from a pipe is a one-way operation. 
The +command that is started with `COMMAND | getline' only sends data _to_ +your `awk' program. On occasion, you might want to send data to another program for processing and then read the results back. `gawk' allows you to start a "coprocess", with which two-way communications are possible. This is done with the `|&' operator. Typically, you write data to the -coprocess first and then read results back, as shown in the following: +coprocess first and then read the results back, as shown in the +following: print "SOME QUERY" |& "db_server" "db_server" |& getline @@ -5817,7 +5819,7 @@ in mind: files. (d.c.) (See *note BEGIN/END::; also *note Auto-set::.) * Using `FILENAME' with `getline' (`getline < FILENAME') is likely - to be a source for confusion. `awk' opens a separate input stream + to be a source of confusion. `awk' opens a separate input stream from the current input file. However, by not using a variable, `$0' and `NF' are still updated. If you're doing this, it's probably by accident, and you should reconsider what it is you're @@ -5825,15 +5827,15 @@ in mind: * *note Getline Summary::, presents a table summarizing the `getline' variants and which variables they can affect. It is - worth noting that those variants which do not use redirection can + worth noting that those variants that do not use redirection can cause `FILENAME' to be updated if they cause `awk' to start reading a new input file. * If the variable being assigned is an expression with side effects, different versions of `awk' behave differently upon encountering end-of-file. Some versions don't evaluate the expression; many - versions (including `gawk') do. Here is an example, due to Duncan - Moore: + versions (including `gawk') do. Here is an example, courtesy of + Duncan Moore: BEGIN { system("echo 1 > f") @@ -5841,8 +5843,8 @@ in mind: print c } - Here, the side effect is the `++c'. Is `c' incremented if end of - file is encountered, before the element in `a' is assigned? 
+ Here, the side effect is the `++c'. Is `c' incremented if + end-of-file is encountered before the element in `a' is assigned? `gawk' treats `getline' like a function call, and evaluates the expression `a[++c]' before attempting to read from `f'. However, @@ -5886,8 +5888,8 @@ This minor node describes a feature that is specific to `gawk'. You may specify a timeout in milliseconds for reading input from the keyboard, a pipe, or two-way communication, including TCP/IP sockets. -This can be done on a per input, command, or connection basis, by -setting a special element in the `PROCINFO' array (*note Auto-set::): +This can be done on a per-input, per-command, or per-connection basis, +by setting a special element in the `PROCINFO' array (*note Auto-set::): PROCINFO["input_name", "READ_TIMEOUT"] = TIMEOUT IN MILLISECONDS @@ -5911,7 +5913,7 @@ for more than five seconds: print $0 `gawk' terminates the read operation if input does not arrive after -waiting for the timeout period, returns failure and sets `ERRNO' to an +waiting for the timeout period, returns failure, and sets `ERRNO' to an appropriate string value. A negative or zero value for the timeout is the same as specifying no timeout at all. @@ -5919,7 +5921,7 @@ the same as specifying no timeout at all. implicit loop that reads input records and matches them against patterns, like so: - $ gawk 'BEGIN { PROCINFO["-", "READ_TIMEOUT"] = 5000 } + $ gawk 'BEGIN { PROCINFO["-", "READ_TIMEOUT"] = 5000 } > { print "You entered: " $0 }' gawk -| You entered: gawk @@ -5951,7 +5953,7 @@ input to arrive: environment variable exists, `gawk' uses its value to initialize the timeout value. The exclusive use of the environment variable to specify timeout has the disadvantage of not being able to control it on -a per command or connection basis. +a per-command or per-connection basis. 
`gawk' considers a timeout event to be an error even though the attempt to read from the underlying device may succeed in a later @@ -6019,7 +6021,7 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li * `gawk' sets `RT' to the text matched by `RS'. * After splitting the input into records, `awk' further splits the - record into individual fields, named `$1', `$2', and so on. `$0' + records into individual fields, named `$1', `$2', and so on. `$0' is the whole record, and `NF' indicates how many fields there are. The default way to split fields is between whitespace characters. @@ -6033,19 +6035,21 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li * Field splitting is more complicated than record splitting: - Field separator value Fields are split ... `awk' / - `gawk' + Field separator value Fields are split ... `awk' / + `gawk' ---------------------------------------------------------------------- - `FS == " "' On runs of whitespace `awk' - `FS == ANY SINGLE On that character `awk' - CHARACTER' - `FS == REGEXP' On text matching the regexp `awk' - `FS == ""' Each individual character is `gawk' - a separate field - `FIELDWIDTHS == LIST OF Based on character position `gawk' - COLUMNS' - `FPAT == REGEXP' On the text surrounding text `gawk' - matching the regexp + `FS == " "' On runs of whitespace `awk' + `FS == ANY SINGLE On that character `awk' + CHARACTER' + `FS == REGEXP' On text matching the `awk' + regexp + `FS == ""' Such that each individual `gawk' + character is a separate + field + `FIELDWIDTHS == LIST OF Based on character `gawk' + COLUMNS' position + `FPAT == REGEXP' On the text surrounding `gawk' + text matching the regexp * Using `FS = "\n"' causes the entire record to be a single field (assuming that newlines separate records). @@ -6055,12 +6059,11 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li * Use `PROCINFO["FS"]' to see how fields are being split. 
- * Use `getline' in its various forms to read additional records, - from the default input stream, from a file, or from a pipe or - coprocess. + * Use `getline' in its various forms to read additional records from + the default input stream, from a file, or from a pipe or coprocess. - * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to timeout for - FILE. + * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out + for FILE. * Directories on the command line are fatal for standard `awk'; `gawk' ignores them if not in POSIX mode. @@ -6155,7 +6158,7 @@ you will probably get an error. Keep in mind that a space is printed between any two items. Note that the `print' statement is a statement and not an -expression--you can't use it in the pattern part of a PATTERN-ACTION +expression--you can't use it in the pattern part of a pattern-action statement, for example. @@ -6303,7 +6306,7 @@ File: gawk.info, Node: OFMT, Next: Printf, Prev: Output Separators, Up: Prin =========================================== When printing numeric values with the `print' statement, `awk' -internally converts the number to a string of characters and prints +internally converts each number to a string of characters and prints that string. `awk' uses the `sprintf()' function to do this conversion (*note String Functions::). For now, it suffices to say that the `sprintf()' function accepts a "format specification" that tells it how @@ -6358,7 +6361,7 @@ A simple `printf' statement looks like this: As for `print', the entire list of arguments may optionally be enclosed in parentheses. Here too, the parentheses are necessary if any of the -item expressions use the `>' relational operator; otherwise, it can be +item expressions uses the `>' relational operator; otherwise, it can be confused with an output redirection (*note Redirection::). The difference between `printf' and `print' is the FORMAT argument. @@ -6385,7 +6388,7 @@ statements. For example: > }' -| Don't Panic! 
-Here, neither the `+' nor the `OUCH!' appear in the output message. +Here, neither the `+' nor the `OUCH!' appears in the output message. File: gawk.info, Node: Control Letters, Next: Format Modifiers, Prev: Basic Printf, Up: Printf @@ -6424,7 +6427,7 @@ width. Here is a list of the format-control letters: (The `%i' specification is for compatibility with ISO C.) `%e', `%E' - Print a number in scientific (exponential) notation; for example: + Print a number in scientific (exponential) notation. For example: printf "%4.3e\n", 1950 @@ -6449,7 +6452,7 @@ width. Here is a list of the format-control letters: Math Definitions::). `%F' - Like `%f' but the infinity and "not a number" values are spelled + Like `%f', but the infinity and "not a number" values are spelled using uppercase letters. The `%F' format is a POSIX extension to ISO C; not all systems @@ -6643,7 +6646,7 @@ string, like so: s = "abcdefg" printf "%" w "." p "s\n", s -This is not particularly easy to read but it does work. +This is not particularly easy to read, but it does work. C programmers may be used to supplying additional modifiers (`h', `j', `l', `L', `t', and `z') in `printf' format strings. These are not @@ -6682,7 +6685,7 @@ an aligned two-column table of names and phone numbers, as shown here: -| Jean-Paul 555-2127 In this case, the phone numbers had to be printed as strings because -the numbers are separated by a dash. Printing the phone numbers as +the numbers are separated by dashes. Printing the phone numbers as numbers would have produced just the first three digits: `555'. This would have been pretty confusing. @@ -6730,7 +6733,7 @@ output, usually the screen. Both `print' and `printf' can also send their output to other places. This is called "redirection". NOTE: When `--sandbox' is specified (*note Options::), redirecting - output to files, pipes and coprocesses is disabled. + output to files, pipes, and coprocesses is disabled. 
A redirection appears after the `print' or `printf' statement. Redirections in `awk' are written just like redirections in shell @@ -6770,7 +6773,7 @@ work identically for `printf': Each output file contains one name or number per line. `print ITEMS >> OUTPUT-FILE' - This redirection prints the items into the pre-existing output file + This redirection prints the items into the preexisting output file named OUTPUT-FILE. The difference between this and the single-`>' redirection is that the old contents (if any) of OUTPUT-FILE are not erased. Instead, the `awk' output is appended to the file. @@ -6818,8 +6821,8 @@ work identically for `printf': `print ITEMS |& COMMAND' This redirection prints the items to the input of COMMAND. The difference between this and the single-`|' redirection is that the - output from COMMAND can be read with `getline'. Thus COMMAND is a - "coprocess", which works together with, but subsidiary to, the + output from COMMAND can be read with `getline'. Thus, COMMAND is + a "coprocess", which works together with but is subsidiary to the `awk' program. This feature is a `gawk' extension, and is not available in POSIX @@ -6843,7 +6846,7 @@ a file, and then to use `>>' for subsequent output: This is indeed how redirections must be used from the shell. But in `awk', it isn't necessary. In this kind of case, a program should use `>' for all the `print' statements, because the output file is only -opened once. (It happens that if you mix `>' and `>>' that output is +opened once. (It happens that if you mix `>' and `>>' output is produced in the expected order. However, mixing the operators for the same file is definitely poor style, and is confusing to readers of your program.) @@ -6876,14 +6879,14 @@ command lines to be fed to the shell. 
File: gawk.info, Node: Special FD, Next: Special Files, Prev: Redirection, Up: Printing -5.7 Special Files for Standard Pre-Opened Data Streams -====================================================== +5.7 Special Files for Standard Preopened Data Streams +===================================================== Running programs conventionally have three input and output streams already available to them for reading and writing. These are known as the "standard input", "standard output", and "standard error output". -These open streams (and any other open file or pipe) are often referred -to by the technical term "file descriptors". +These open streams (and any other open files or pipes) are often +referred to by the technical term "file descriptors". These streams are, by default, connected to your keyboard and screen, but they are often redirected with the shell, via the `<', `<<', @@ -6908,7 +6911,7 @@ error messages to the screen, like this: (`/dev/tty' is a special file supplied by the operating system that is connected to your keyboard and screen. It represents the "terminal,"(1) which on modern systems is a keyboard and screen, not a serial console.) -This generally has the same effect but not always: although the +This generally has the same effect, but not always: although the standard error stream is usually the screen, it can be redirected; when that happens, writing to the screen is not correct. In fact, if `awk' is run from a background job, it may not have a terminal at all. Then @@ -6935,7 +6938,7 @@ becomes: print "Serious error detected!" > "/dev/stderr" - Note the use of quotes around the file name. Like any other + Note the use of quotes around the file name. Like with any other redirection, the value must be a string. It is a common error to omit the quotes, which leads to confusing results. @@ -6968,7 +6971,7 @@ there are special file names reserved for TCP/IP networking. 
File: gawk.info, Node: Other Inherited Files, Next: Special Network, Up: Special Files -5.8.1 Accessing Other Open Files With `gawk' +5.8.1 Accessing Other Open Files with `gawk' -------------------------------------------- Besides the `/dev/stdin', `/dev/stdout', and `/dev/stderr' special file @@ -7018,7 +7021,7 @@ File: gawk.info, Node: Special Caveats, Prev: Special Network, Up: Special Fi Here are some things to bear in mind when using the special file names that `gawk' provides: - * Recognition of the file names for the three standard pre-opened + * Recognition of the file names for the three standard preopened files is disabled only in POSIX mode. * Recognition of the other special file names is disabled if `gawk' @@ -7027,7 +7030,7 @@ that `gawk' provides: * `gawk' _always_ interprets these special file names. For example, using `/dev/fd/4' for output actually writes on file descriptor 4, - and not on a new file descriptor that is `dup()''ed from file + and not on a new file descriptor that is `dup()'ed from file descriptor 4. Most of the time this does not matter; however, it is important to _not_ close any of the files related to file descriptors 0, 1, and 2. Doing so results in unpredictable @@ -7187,8 +7190,8 @@ closing input or output files, respectively. This value is zero if the close succeeds, or -1 if it fails. The POSIX standard is very vague; it says that `close()' returns -zero on success and nonzero otherwise. In general, different -implementations vary in what they report when closing pipes; thus the +zero on success and a nonzero value otherwise. In general, different +implementations vary in what they report when closing pipes; thus, the return value cannot be used portably. (d.c.) In POSIX mode (*note Options::), `gawk' just returns zero when closing a pipe. @@ -7266,8 +7269,8 @@ File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Nonfatal numeric values for the `print' statement. 
* The `printf' statement provides finer-grained control over output, - with format control letters for different data types and various - flags that modify the behavior of the format control letters. + with format-control letters for different data types and various + flags that modify the behavior of the format-control letters. * Output from both `print' and `printf' may be redirected to files, pipes, and coprocesses. @@ -7323,9 +7326,9 @@ value to a variable or a field by using an assignment operator. An expression can serve as a pattern or action statement on its own. Most other kinds of statements contain one or more expressions that specify the data on which to operate. As in other languages, -expressions in `awk' include variables, array references, constants, -and function calls, as well as combinations of these with various -operators. +expressions in `awk' can include variables, array references, +constants, and function calls, as well as combinations of these with +various operators. * Menu: @@ -7344,8 +7347,8 @@ File: gawk.info, Node: Values, Next: All Operators, Up: Expressions ========================================= Expressions are built up from values and the operations performed upon -them. This minor node describes the elementary objects which provide -the values used in expressions. +them. This minor node describes the elementary objects that provide the +values used in expressions. * Menu: @@ -7390,14 +7393,14 @@ the same value: 1.05e+2 1050e-1 - A string constant consists of a sequence of characters enclosed in + A "string constant" consists of a sequence of characters enclosed in double quotation marks. For example: "parrot" represents the string whose contents are `parrot'. Strings in `gawk' can be of any length, and they can contain any of the possible -eight-bit ASCII characters including ASCII NUL (character code zero). +eight-bit ASCII characters, including ASCII NUL (character code zero). 
Other `awk' implementations may have difficulty with some character codes. @@ -7417,14 +7420,14 @@ File: gawk.info, Node: Nondecimal-numbers, Next: Regexp Constants, Prev: Scal In `awk', all numbers are in decimal (i.e., base 10). Many other programming languages allow you to specify numbers in other bases, often octal (base 8) and hexadecimal (base 16). In octal, the numbers go 0, -1, 2, 3, 4, 5, 6, 7, 10, 11, 12, and so on. Just as `11', in decimal, -is 1 times 10 plus 1, so `11', in octal, is 1 times 8, plus 1. This -equals 9 in decimal. In hexadecimal, there are 16 digits. Because the -everyday decimal number system only has ten digits (`0'-`9'), the -letters `a' through `f' are used to represent the rest. (Case in the -letters is usually irrelevant; hexadecimal `a' and `A' have the same -value.) Thus, `11', in hexadecimal, is 1 times 16 plus 1, which equals -17 in decimal. +1, 2, 3, 4, 5, 6, 7, 10, 11, 12, and so on. Just as `11' in decimal is +1 times 10 plus 1, so `11' in octal is 1 times 8 plus 1. This equals 9 +in decimal. In hexadecimal, there are 16 digits. Because the everyday +decimal number system only has ten digits (`0'-`9'), the letters `a' +through `f' are used to represent the rest. (Case in the letters is +usually irrelevant; hexadecimal `a' and `A' have the same value.) +Thus, `11' in hexadecimal is 1 times 16 plus 1, which equals 17 in +decimal. Just by looking at plain `11', you can't tell what base it's in. So, in C, C++, and other languages derived from C, there is a special @@ -7432,13 +7435,13 @@ notation to signify the base. Octal numbers start with a leading `0', and hexadecimal numbers start with a leading `0x' or `0X': `11' - Decimal value 11. + Decimal value 11 `011' - Octal 11, decimal value 9. + Octal 11, decimal value 9 `0x11' - Hexadecimal 11, decimal value 17. 
+ Hexadecimal 11, decimal value 17 This example shows the difference: @@ -7457,11 +7460,11 @@ really need to do this, use the `--non-decimal-data' command-line option; *note Nondecimal Data::.) If you have octal or hexadecimal data, you can use the `strtonum()' function (*note String Functions::) to convert the data into a number. Most of the time, you will want to -use octal or hexadecimal constants when working with the built-in bit -manipulation functions; see *note Bitwise Functions::, for more +use octal or hexadecimal constants when working with the built-in +bit-manipulation functions; see *note Bitwise Functions::, for more information. - Unlike some early C implementations, `8' and `9' are not valid in + Unlike in some early C implementations, `8' and `9' are not valid in octal constants. For example, `gawk' treats `018' as decimal 18: $ gawk 'BEGIN { print "021 is", 021 ; print 018 }' @@ -7488,12 +7491,12 @@ File: gawk.info, Node: Regexp Constants, Prev: Nondecimal-numbers, Up: Consta 6.1.1.3 Regular Expression Constants .................................... -A regexp constant is a regular expression description enclosed in +A "regexp constant" is a regular expression description enclosed in slashes, such as `/^beginning and end$/'. Most regexps used in `awk' programs are constant, but the `~' and `!~' matching operators can also match computed or dynamic regexps (which are typically just ordinary -strings or variables that contain a regexp, but could be a more complex -expression). +strings or variables that contain a regexp, but could be more complex +expressions). File: gawk.info, Node: Using Constant Regexps, Next: Variables, Prev: Constants, Up: Values @@ -7545,7 +7548,7 @@ and `patsplit()' functions (*note String Functions::). Modern implementations of `awk', including `gawk', allow the third argument of `split()' to be a regexp constant, but some older implementations do not. (d.c.) 
Because some built-in functions accept regexp constants -as arguments, it can be confusing when attempting to use regexp +as arguments, confusion can arise when attempting to use regexp constants as arguments to user-defined functions (*note User-defined::). For example: @@ -7568,10 +7571,11 @@ User-defined::). For example: In this example, the programmer wants to pass a regexp constant to the user-defined function `mysub()', which in turn passes it on to either `sub()' or `gsub()'. However, what really happens is that the -`pat' parameter is either one or zero, depending upon whether or not -`$0' matches `/hi/'. `gawk' issues a warning when it sees a regexp -constant used as a parameter to a user-defined function, because -passing a truth value in this way is probably not what was intended. +`pat' parameter is assigned a value of either one or zero, depending +upon whether or not `$0' matches `/hi/'. `gawk' issues a warning when +it sees a regexp constant used as a parameter to a user-defined +function, because passing a truth value in this way is probably not +what was intended. File: gawk.info, Node: Variables, Next: Conversion, Prev: Using Constant Regexps, Up: Values @@ -7579,7 +7583,7 @@ File: gawk.info, Node: Variables, Next: Conversion, Prev: Using Constant Rege 6.1.3 Variables --------------- -Variables are ways of storing values at one point in your program for +"Variables" are ways of storing values at one point in your program for use later in another part of your program. They can be manipulated entirely within the program text, and they can also be assigned values on the `awk' command line. @@ -7608,14 +7612,14 @@ variables. A variable name is a valid expression by itself; it represents the variable's current value. Variables are given new values with -"assignment operators", "increment operators", and "decrement -operators". *Note Assignment Ops::. 
In addition, the `sub()' and -`gsub()' functions can change a variable's value, and the `match()', -`split()', and `patsplit()' functions can change the contents of their -array parameters. *Note String Functions::. +"assignment operators", "increment operators", and "decrement operators" +(*note Assignment Ops::). In addition, the `sub()' and `gsub()' +functions can change a variable's value, and the `match()', `split()', +and `patsplit()' functions can change the contents of their array +parameters (*note String Functions::). A few variables have special built-in meanings, such as `FS' (the -field separator), and `NF' (the number of fields in the current input +field separator) and `NF' (the number of fields in the current input record). *Note Built-in Variables::, for a list of the predefined variables. These predefined variables can be used and assigned just like all other variables, but their values are also used or changed @@ -7812,7 +7816,7 @@ point, so the default behavior was restored to use a period as the decimal point character. You can use the `--use-lc-numeric' option (*note Options::) to force `gawk' to use the locale's decimal point character. (`gawk' also uses the locale's decimal point character when -in POSIX mode, either via `--posix', or the `POSIXLY_CORRECT' +in POSIX mode, either via `--posix' or the `POSIXLY_CORRECT' environment variable, as shown previously.) *note table-locale-affects:: describes the cases in which the @@ -7828,10 +7832,10 @@ Input Use period Use locale Table 6.1: Locale decimal point versus a period - Finally, modern day formal standards and IEEE standard floating-point -representation can have an unusual but important effect on the way -`gawk' converts some special string values to numbers. The details are -presented in *note POSIX Floating Point Problems::. 
+ Finally, modern-day formal standards and the IEEE standard +floating-point representation can have an unusual but important effect +on the way `gawk' converts some special string values to numbers. The +details are presented in *note POSIX Floating Point Problems::. File: gawk.info, Node: All Operators, Next: Truth Values and Conditions, Prev: Values, Up: Expressions @@ -7839,7 +7843,7 @@ File: gawk.info, Node: All Operators, Next: Truth Values and Conditions, Prev 6.2 Operators: Doing Something with Values ========================================== -This minor node introduces the "operators" which make use of the values +This minor node introduces the "operators" that make use of the values provided by constants and variables. * Menu: @@ -8020,7 +8024,7 @@ you'll get. ---------- Footnotes ---------- - (1) It happens that BWK `awk', `gawk' and `mawk' all "get it right," + (1) It happens that BWK `awk', `gawk', and `mawk' all "get it right," but you should not rely on this. @@ -8137,7 +8141,7 @@ righthand expression. For example: The indices of `bar' are practically guaranteed to be different, because `rand()' returns different values each time it is called. (Arrays and the `rand()' function haven't been covered yet. *Note Arrays::, and -*note Numeric Functions::, for more information). This example +*note Numeric Functions::, for more information.) This example illustrates an important fact about assignment operators: the lefthand expression is only evaluated _once_. @@ -8155,14 +8159,14 @@ converted to a number. Operator Effect -------------------------------------------------------------------------- -LVALUE `+=' INCREMENT Add INCREMENT to the value of LVALUE -LVALUE `-=' DECREMENT Subtract DECREMENT from the value of LVALUE -LVALUE `*=' Multiply the value of LVALUE by COEFFICIENT +LVALUE `+=' INCREMENT Add INCREMENT to the value of LVALUE. +LVALUE `-=' DECREMENT Subtract DECREMENT from the value of LVALUE. 
+LVALUE `*=' Multiply the value of LVALUE by COEFFICIENT. COEFFICIENT -LVALUE `/=' DIVISOR Divide the value of LVALUE by DIVISOR -LVALUE `%=' MODULUS Set LVALUE to its remainder by MODULUS -LVALUE `^=' POWER -LVALUE `**=' POWER Raise LVALUE to the power POWER (c.e.) +LVALUE `/=' DIVISOR Divide the value of LVALUE by DIVISOR. +LVALUE `%=' MODULUS Set LVALUE to its remainder by MODULUS. +LVALUE `^=' POWER Raise LVALUE to the power POWER. +LVALUE `**=' POWER Raise LVALUE to the power POWER. (c.e.) Table 6.2: Arithmetic assignment operators @@ -8247,8 +8251,8 @@ is a summary of increment and decrement expressions: Operator Evaluation Order - Doctor, doctor! It hurts when I do this! - So don't do that! -- Groucho Marx + Doctor, it hurts when I do this! + Then don't do that! -- Groucho Marx What happens for something like the following? @@ -8263,7 +8267,7 @@ Or something even stranger? In other words, when do the various side effects prescribed by the postfix operators (`b++') take effect? When side effects happen is -"implementation defined". In other words, it is up to the particular +"implementation-defined". In other words, it is up to the particular version of `awk'. The result for the first example may be 12 or 13, and for the second, it may be 22 or 23. @@ -8278,7 +8282,7 @@ File: gawk.info, Node: Truth Values and Conditions, Next: Function Calls, Pre =============================== In certain contexts, expression values also serve as "truth values"; -(i.e., they determine what should happen next as the program runs). This +i.e., they determine what should happen next as the program runs. This minor node describes how `awk' defines "true" and "false" and how values are compared. @@ -8332,10 +8336,10 @@ File: gawk.info, Node: Typing and Comparison, Next: Boolean Ops, Prev: Truth The Guide is definitive. Reality is frequently inaccurate. 
-- Douglas Adams, `The Hitchhiker's Guide to the Galaxy' - Unlike other programming languages, `awk' variables do not have a -fixed type. Instead, they can be either a number or a string, depending -upon the value that is assigned to them. We look now at how variables -are typed, and how `awk' compares variables. + Unlike in other programming languages, in `awk' variables do not +have a fixed type. Instead, they can be either a number or a string, +depending upon the value that is assigned to them. We look now at how +variables are typed, and how `awk' compares variables. * Menu: @@ -8356,16 +8360,16 @@ of the variable is important because the types of two variables determine how they are compared. Variable typing follows these rules: * A numeric constant or the result of a numeric operation has the - NUMERIC attribute. + "numeric" attribute. * A string constant or the result of a string operation has the - STRING attribute. + "string" attribute. * Fields, `getline' input, `FILENAME', `ARGV' elements, `ENVIRON' elements, and the elements of an array created by `match()', `split()', and `patsplit()' that are numeric strings have the - STRNUM attribute. Otherwise, they have the STRING attribute. - Uninitialized variables also have the STRNUM attribute. + "strnum" attribute. Otherwise, they have the "string" attribute. + Uninitialized variables also have the "strnum" attribute. * Attributes propagate across assignments but are not changed by any use. @@ -8407,12 +8411,13 @@ constant, then a string comparison is performed. Otherwise, a numeric comparison is performed. This point bears additional emphasis: All user input is made of -characters, and so is first and foremost of STRING type; input strings -that look numeric are additionally given the STRNUM attribute. Thus, -the six-character input string ` +3.14' receives the STRNUM attribute. 
+characters, and so is first and foremost of string type; input strings +that look numeric are additionally given the strnum attribute. Thus, +the six-character input string ` +3.14' receives the strnum attribute. In contrast, the eight characters `" +3.14"' appearing in program text comprise a string constant. The following examples print `1' when the -comparison between the two different constants is true, `0' otherwise: +comparison between the two different constants is true, and `0' +otherwise: $ echo ' +3.14' | awk '{ print($0 == " +3.14") }' True -| 1 @@ -8511,7 +8516,7 @@ comparison is: -| false the result is `false' because both `$1' and `$2' are user input. They -are numeric strings--therefore both have the STRNUM attribute, +are numeric strings--therefore both have the strnum attribute, dictating a numeric comparison. The purpose of the comparison rules and the use of numeric strings is to attempt to produce the behavior that is "least surprising," while still "doing the right thing." @@ -8570,7 +8575,7 @@ is an example to illustrate the difference, in an `en_US.UTF-8' locale: ---------- Footnotes ---------- (1) Technically, string comparison is supposed to behave the same -way as if the strings are compared with the C `strcoll()' function. +way as if the strings were compared with the C `strcoll()' function. File: gawk.info, Node: Boolean Ops, Next: Conditional Exp, Prev: Typing and Comparison, Up: Truth Values and Conditions @@ -8633,7 +8638,7 @@ Boolean operators are: The `&&' and `||' operators are called "short-circuit" operators because of the way they work. Evaluation of the full expression is -"short-circuited" if the result can be determined part way through its +"short-circuited" if the result can be determined partway through its evaluation. 
Statements that end with `&&' or `||' can be continued simply by @@ -8686,15 +8691,15 @@ File: gawk.info, Node: Conditional Exp, Prev: Boolean Ops, Up: Truth Values a A "conditional expression" is a special kind of expression that has three operands. It allows you to use one expression's value to select -one of two other expressions. The conditional expression is the same -as in the C language, as shown here: +one of two other expressions. The conditional expression in `awk' is +the same as in the C language, as shown here: SELECTOR ? IF-TRUE-EXP : IF-FALSE-EXP There are three subexpressions. The first, SELECTOR, is always computed first. If it is "true" (not zero or not null), then -IF-TRUE-EXP is computed next and its value becomes the value of the -whole expression. Otherwise, IF-FALSE-EXP is computed next and its +IF-TRUE-EXP is computed next, and its value becomes the value of the +whole expression. Otherwise, IF-FALSE-EXP is computed next, and its value becomes the value of the whole expression. For example, the following expression produces the absolute value of `x': @@ -8728,7 +8733,7 @@ A "function" is a name for a particular calculation. This enables you to ask for it by name at any point in the program. For example, the function `sqrt()' computes the square root of a number. - A fixed set of functions are "built-in", which means they are + A fixed set of functions are "built in", which means they are available in every `awk' program. The `sqrt()' function is one of these. *Note Built-in::, for a list of built-in functions and their descriptions. In addition, you can define functions for use in your @@ -8863,7 +8868,7 @@ precedence: Increment, decrement. `^ **' - Exponentiation. These operators group right-to-left. + Exponentiation. These operators group right to left. `+ - !' Unary plus, minus, logical "not." @@ -8890,7 +8895,7 @@ String concatenation operand of another operator. 
As a result, it does not make sense to use a redirection operator near another operator of lower precedence without parentheses. Such combinations (e.g., `print - foo > a ? b : c'), result in syntax errors. The correct way to + foo > a ? b : c') result in syntax errors. The correct way to write this statement is `print foo > (a ? b : c)'. `~ !~' @@ -8900,16 +8905,16 @@ String concatenation Array membership. `&&' - Logical "and". + Logical "and." `||' - Logical "or". + Logical "or." `?:' - Conditional. This operator groups right-to-left. + Conditional. This operator groups right to left. `= += -= *= /= %= ^= **=' - Assignment. These operators group right-to-left. + Assignment. These operators group right to left. NOTE: The `|&', `**', and `**=' operators are not specified by POSIX. For maximum portability, do not use them. @@ -8977,24 +8982,24 @@ File: gawk.info, Node: Expressions Summary, Prev: Locales, Up: Expressions * `awk' provides the usual arithmetic operators (addition, subtraction, multiplication, division, modulus), and unary plus - and minus. It also provides comparison operators, boolean - operators, array membership testing, and regexp matching - operators. String concatenation is accomplished by placing two - expressions next to each other; there is no explicit operator. - The three-operand `?:' operator provides an "if-else" test within - expressions. + and minus. It also provides comparison operators, Boolean + operators, an array membership testing operator, and regexp + matching operators. String concatenation is accomplished by + placing two expressions next to each other; there is no explicit + operator. The three-operand `?:' operator provides an "if-else" + test within expressions. * Assignment operators provide convenient shorthands for common arithmetic operations. - * In `awk', a value is considered to be true if it is non-zero _or_ + * In `awk', a value is considered to be true if it is nonzero _or_ non-null. Otherwise, the value is false. 
* A variable's type is set upon each assignment and may change over its lifetime. The type determines how it behaves in comparisons (string or numeric). - * Function calls return a value which may be used as part of a larger + * Function calls return a value that may be used as part of a larger expression. Expressions used to pass parameter values are fully evaluated before the function is called. `awk' provides built-in and user-defined functions; this is described in *note Functions::. @@ -9168,7 +9173,7 @@ inside Boolean patterns. Likewise, the special patterns `BEGIN', `END', `BEGINFILE', and `ENDFILE', which never match any input record, are not expressions and cannot appear inside Boolean patterns. - The precedence of the different operators which can appear in + The precedence of the different operators that can appear in patterns is described in *note Precedence::. @@ -9188,8 +9193,8 @@ following: prints every record in `myfile' between `on'/`off' pairs, inclusive. A range pattern starts out by matching BEGPAT against every input -record. When a record matches BEGPAT, the range pattern is "turned on" -and the range pattern matches this record as well. As long as the +record. When a record matches BEGPAT, the range pattern is "turned +on", and the range pattern matches this record as well. As long as the range pattern stays turned on, it automatically matches every input record read. The range pattern also matches ENDPAT against every input record; when this succeeds, the range pattern is "turned off" again for @@ -9307,7 +9312,7 @@ for more information on using library functions. *Note Library Functions::, for a number of useful library functions. If an `awk' program has only `BEGIN' rules and no other rules, then -the program exits after the `BEGIN' rule is run.(1) However, if an +the program exits after the `BEGIN' rules are run.(1) However, if an `END' rule exists, then the input is read, even if there are no other rules in the program. 
This is necessary in case the `END' rule checks the `FNR' and `NR' variables. @@ -9333,7 +9338,7 @@ give `$0' a real value is to execute a `getline' command without a variable (*note Getline::). Another way is simply to assign a value to `$0'. - The second point is similar to the first but from the other + The second point is similar to the first, but from the other direction. Traditionally, due largely to implementation issues, `$0' and `NF' were _undefined_ inside an `END' rule. The POSIX standard specifies that `NF' is available in an `END' rule. It contains the @@ -9394,7 +9399,7 @@ tasks that would otherwise be difficult or impossible to perform: entirely. Otherwise, `gawk' exits with the usual fatal error. * If you have written extensions that modify the record handling (by - inserting an "input parser," *note Input Parsers::), you can invoke + inserting an "input parser"; *note Input Parsers::), you can invoke them at this point, before `gawk' has started processing the file. (This is a _very_ advanced feature, currently used only by the `gawkextlib' project (http://gawkextlib.sourceforge.net).) @@ -9404,16 +9409,15 @@ last record in an input file. For the last input file, it will be called before any `END' rules. The `ENDFILE' rule is executed even for empty input files. - Normally, when an error occurs when reading input in the normal input -processing loop, the error is fatal. However, if an `ENDFILE' rule is -present, the error becomes non-fatal, and instead `ERRNO' is set. This -makes it possible to catch and process I/O errors at the level of the -`awk' program. + Normally, when an error occurs when reading input in the normal +input-processing loop, the error is fatal. However, if an `ENDFILE' +rule is present, the error becomes non-fatal, and instead `ERRNO' is +set. This makes it possible to catch and process I/O errors at the +level of the `awk' program. 
The `next' statement (*note Next Statement::) is not allowed inside either a `BEGINFILE' or an `ENDFILE' rule. The `nextfile' statement is -allowed only inside a `BEGINFILE' rule, but not inside an `ENDFILE' -rule. +allowed only inside a `BEGINFILE' rule, not inside an `ENDFILE' rule. The `getline' statement (*note Getline::) is restricted inside both `BEGINFILE' and `ENDFILE': only redirected forms of `getline' are @@ -9458,11 +9462,11 @@ following program: END { print nmatches, "found" }' /path/to/data The `awk' program consists of two pieces of quoted text that are -concatenated together to form the program. The first part is double -quoted, which allows substitution of the `pattern' shell variable -inside the quotes. The second part is single quoted. +concatenated together to form the program. The first part is +double-quoted, which allows substitution of the `pattern' shell +variable inside the quotes. The second part is single-quoted. - Variable substitution via quoting works, but can be potentially + Variable substitution via quoting works, but can potentially be messy. It requires a good understanding of the shell's quoting rules (*note Quoting::), and it's often difficult to correctly match up the quotes when reading the program. @@ -9659,15 +9663,15 @@ The body of this loop is a compound statement enclosed in braces, containing two statements. The loop works in the following manner: first, the value of `i' is set to one. Then, the `while' statement tests whether `i' is less than or equal to three. This is true when -`i' equals one, so the `i'-th field is printed. Then the `i++' +`i' equals one, so the `i'th field is printed. Then the `i++' increments the value of `i' and the loop repeats. The loop terminates when `i' reaches four. A newline is not required between the condition and the body; however, using one makes the program clearer unless the body is a -compound statement or else is very simple. 
The newline after the -open-brace that begins the compound statement is not required either, -but the program is harder to read without it. +compound statement or else is very simple. The newline after the open +brace that begins the compound statement is not required either, but the +program is harder to read without it. File: gawk.info, Node: Do Statement, Next: For Statement, Prev: While Statement, Up: Statements @@ -9690,7 +9694,7 @@ Contrast this with the corresponding `while' statement: while (CONDITION) BODY -This statement does not execute BODY even once if the CONDITION is +This statement does not execute the BODY even once if the CONDITION is false to begin with. The following is an example of a `do' statement: { @@ -9746,7 +9750,7 @@ loop.) The same is true of the INCREMENT part. Incrementing additional variables requires separate statements at the end of the loop. The C compound expression, using C's comma operator, is useful in this -context but it is not supported in `awk'. +context, but it is not supported in `awk'. Most often, INCREMENT is an increment expression, as in the previous example. But this is not required; it can be any expression @@ -9822,7 +9826,7 @@ statement looks like this: Control flow in the `switch' statement works as it does in C. Once a match to a given case is made, the case statement bodies execute until -a `break', `continue', `next', `nextfile' or `exit' is encountered, or +a `break', `continue', `next', `nextfile', or `exit' is encountered, or the end of the `switch' statement itself. For example: while ((c = getopt(ARGC, ARGV, "aksx")) != -1) { @@ -10065,12 +10069,11 @@ listed in `ARGV'. standard. See the Austin Group website (http://austingroupbugs.net/view.php?id=607). - The current version of BWK `awk', and `mawk' also support -`nextfile'. However, they don't allow the `nextfile' statement inside -function bodies (*note User-defined::). 
`gawk' does; a `nextfile' -inside a function body reads the next record and starts processing it -with the first rule in the program, just as any other `nextfile' -statement. + The current version of BWK `awk' and `mawk' also support `nextfile'. +However, they don't allow the `nextfile' statement inside function +bodies (*note User-defined::). `gawk' does; a `nextfile' inside a +function body reads the next record and starts processing it with the +first rule in the program, just as any other `nextfile' statement. File: gawk.info, Node: Exit Statement, Prev: Nextfile Statement, Up: Statements @@ -10098,9 +10101,9 @@ record, skips reading any remaining input records, and executes the they do not execute. In such a case, if you don't want the `END' rule to do its job, set -a variable to nonzero before the `exit' statement and check that -variable in the `END' rule. *Note Assert Function::, for an example -that does this. +a variable to a nonzero value before the `exit' statement and check +that variable in the `END' rule. *Note Assert Function::, for an +example that does this. If an argument is supplied to `exit', its value is used as the exit status code for the `awk' process. If no argument is supplied, `exit' @@ -10158,7 +10161,7 @@ of activity. File: gawk.info, Node: User-modified, Next: Auto-set, Up: Built-in Variables -7.5.1 Built-In Variables That Control `awk' +7.5.1 Built-in Variables That Control `awk' ------------------------------------------- The following is an alphabetical list of variables that you can change @@ -10182,11 +10185,11 @@ description of each variable.) use binary I/O. Any other string value is treated the same as `"rw"', but causes `gawk' to generate a warning message. `BINMODE' is described in more detail in *note PC Using::. `mawk' - (*note Other Versions::), also supports this variable, but only + (*note Other Versions::) also supports this variable, but only using numeric values. 
``CONVFMT'' - This string controls conversion of numbers to strings (*note + A string that controls the conversion of numbers to strings (*note Conversion::). It works by being passed, in effect, as the first argument to the `sprintf()' function (*note String Functions::). Its default value is `"%.6g"'. `CONVFMT' was introduced by the @@ -10233,7 +10236,7 @@ description of each variable.) `IGNORECASE #' If `IGNORECASE' is nonzero or non-null, then all string comparisons - and all regular expression matching are case independent. Thus, + and all regular expression matching are case-independent. Thus, regexp matching with `~' and `!~', as well as the `gensub()', `gsub()', `index()', `match()', `patsplit()', `split()', and `sub()' functions, record termination with `RS', and field @@ -10253,7 +10256,7 @@ description of each variable.) Assigning a false value to `LINT' turns off the lint warnings. This variable is a `gawk' extension. It is not special in other - `awk' implementations. Unlike the other special variables, + `awk' implementations. Unlike with the other special variables, changing `LINT' does affect the production of lint warnings, even if `gawk' is in compatibility mode. Much as the `--lint' and `--traditional' options independently control different aspects of @@ -10261,17 +10264,18 @@ description of each variable.) execution is independent of the flavor of `awk' being executed. `OFMT' - Controls conversion of numbers to strings (*note Conversion::) for - printing with the `print' statement. It works by being passed as - the first argument to the `sprintf()' function (*note String - Functions::). Its default value is `"%.6g"'. Earlier versions of - `awk' used `OFMT' to specify the format for converting numbers to - strings in general expressions; this is now done by `CONVFMT'. + A string that controls conversion of numbers to strings (*note + Conversion::) for printing with the `print' statement. 
It works + by being passed as the first argument to the `sprintf()' function + (*note String Functions::). Its default value is `"%.6g"'. + Earlier versions of `awk' used `OFMT' to specify the format for + converting numbers to strings in general expressions; this is now + done by `CONVFMT'. `OFS' - This is the output field separator (*note Output Separators::). - It is output between the fields printed by a `print' statement. - Its default value is `" "', a string consisting of a single space. + The output field separator (*note Output Separators::). It is + output between the fields printed by a `print' statement. Its + default value is `" "', a string consisting of a single space. `ORS' The output record separator. It is output at the end of every @@ -10321,7 +10325,7 @@ description of each variable.) File: gawk.info, Node: Auto-set, Next: ARGC and ARGV, Prev: User-modified, Up: Built-in Variables -7.5.2 Built-In Variables That Convey Information +7.5.2 Built-in Variables That Convey Information ------------------------------------------------ The following is an alphabetical list of variables that `awk' sets @@ -10439,14 +10443,14 @@ Options::), they are not special: `NF' The number of fields in the current input record. `NF' is set - each time a new record is read, when a new field is created or + each time a new record is read, when a new field is created, or when `$0' changes (*note Fields::). Unlike most of the variables described in this node, assigning a value to `NF' has the potential to affect `awk''s internal workings. In particular, assignments to `NF' can be used to - create or remove fields from the current record. *Note Changing - Fields::. + create fields in or remove fields from the current record. *Note + Changing Fields::. 
`FUNCTAB #' An array whose indices and corresponding values are the names of @@ -10481,7 +10485,7 @@ Options::), they are not special: `PROCINFO["identifiers"]' A subarray, indexed by the names of all identifiers used in - the text of the AWK program. An "identifier" is simply the + the text of the `awk' program. An "identifier" is simply the name of a variable (be it scalar or array), built-in function, user-defined function, or extension function. For each identifier, the value of the element is one of the @@ -10502,7 +10506,7 @@ Options::), they are not special: `"untyped"' The identifier is untyped (could be used as a scalar or - array, `gawk' doesn't know yet). + an array; `gawk' doesn't know yet). `"user"' The identifier is a user-defined function. @@ -10591,7 +10595,7 @@ Options::), they are not special: string, or -1 if no match is found. `RSTART' - The start-index in characters of the substring that is matched by + The start index in characters of the substring that is matched by the `match()' function (*note String Functions::). `RSTART' is set by invoking the `match()' function. Its value is the position of the string where the matched substring starts, or zero if no @@ -10641,7 +10645,7 @@ Options::), they are not special: } NOTE: In order to avoid severe time-travel paradoxes,(2) - neither `FUNCTAB' nor `SYMTAB' are available as elements + neither `FUNCTAB' nor `SYMTAB' is available as an element within the `SYMTAB' array. Changing `NR' and `FNR' @@ -10780,7 +10784,7 @@ are passed on to the `awk' program. (*Note Getopt Function::, for an When designing your program, you should choose options that don't conflict with `gawk''s, because it will process any options that it accepts before passing the rest of the command line on to your program. -Using `#!' with the `-E' option may help (*Note Executable Scripts::, +Using `#!' with the `-E' option may help (*note Executable Scripts::, and *note Options::,). 
@@ -10791,14 +10795,14 @@ File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: * Pattern-action pairs make up the basic elements of an `awk' program. Patterns are either normal expressions, range - expressions, regexp constants, one of the special keywords - `BEGIN', `END', `BEGINFILE', `ENDFILE', or empty. The action + expressions, or regexp constants; one of the special keywords + `BEGIN', `END', `BEGINFILE', or `ENDFILE'; or empty. The action executes if the current record matches the pattern. Empty (missing) patterns match all records. - * I/O from `BEGIN' and `END' rules have certain constraints. This - is also true, only more so, for `BEGINFILE' and `ENDFILE' rules. - The latter two give you "hooks" into `gawk''s file processing, + * I/O from `BEGIN' and `END' rules has certain constraints. This is + also true, only more so, for `BEGINFILE' and `ENDFILE' rules. The + latter two give you "hooks" into `gawk''s file processing, allowing you to recover from a file that otherwise would cause a fatal error (such as a file that cannot be opened). @@ -10819,11 +10823,11 @@ File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: iteration of a loop (or get out of a `switch'). * `next' and `nextfile' let you read the next record and start over - at the top of your program, or skip to the next input file and + at the top of your program or skip to the next input file and start over, respectively. * The `exit' statement terminates your program. When executed from - an action (or function body) it transfers control to the `END' + an action (or function body), it transfers control to the `END' statements. From an `END' statement body, it exits immediately. You may pass an optional numeric value to be used as `awk''s exit status. @@ -10925,9 +10929,9 @@ languages allow arbitrary starting and ending indices--e.g., `15 .. 27'--but the size of the array is still fixed when the array is declared.) 
- A contiguous array of four elements might look like the following -example, conceptually, if the element values are 8, `"foo"', `""', and -30 as shown in *note figure-array-elements::: + A contiguous array of four elements might look like *note +figure-array-elements::, conceptually, if the element values are eight, +`"foo"', `""', and 30. +---------+---------+--------+---------+ | 8 | "foo" | "" | 30 | @r{Value} @@ -10936,17 +10940,19 @@ example, conceptually, if the element values are 8, `"foo"', `""', and Figure 8.1: A contiguous array Only the values are stored; the indices are implicit from the order of -the values. Here, 8 is the value at index zero, because 8 appears in the -position with zero elements before it. +the values. Here, eight is the value at index zero, because eight +appears in the position with zero elements before it. Arrays in `awk' are different--they are "associative". This means that each array is a collection of pairs--an index and its corresponding array element value: - Index 3 Value 30 - Index 1 Value "foo" - Index 0 Value 8 - Index 2 Value "" + Index Value +------------------------ + `3' `30' + `1' `"foo"' + `0' `8' + `2' `""' The pairs are shown in jumbled order because their order is irrelevant.(1) @@ -10955,11 +10961,13 @@ irrelevant.(1) at any time. For example, suppose a tenth element is added to the array whose value is `"number ten"'. The result is: - Index 10 Value "number ten" - Index 3 Value 30 - Index 1 Value "foo" - Index 0 Value 8 - Index 2 Value "" + Index Value +------------------------------- + `10' `"number ten"' + `3' `30' + `1' `"foo"' + `0' `8' + `2' `""' Now the array is "sparse", which just means some indices are missing. It has elements 0-3 and 10, but doesn't have elements 4, 5, 6, 7, 8, or @@ -10970,17 +10978,19 @@ have to be positive integers. Any number, or even a string, can be an index. 
For example, the following is an array that translates words from English to French: - Index "dog" Value "chien" - Index "cat" Value "chat" - Index "one" Value "un" - Index 1 Value "un" + Index Value +------------------------ + `"dog"' `"chien"' + `"cat"' `"chat"' + `"one"' `"un"' + `1' `"un"' Here we decided to translate the number one in both spelled-out and numeric form--thus illustrating that a single array can have both numbers and strings as indices. (In fact, array subscripts are always strings. There are some subtleties to how numbers work when used as array subscripts; this is discussed in more detail in *note Numeric -Array Subscripts::.) Here, the number `1' isn't double quoted, because +Array Subscripts::.) Here, the number `1' isn't double-quoted, because `awk' automatically converts it to a string. The value of `IGNORECASE' has no effect upon array subscripting. @@ -11004,7 +11014,7 @@ File: gawk.info, Node: Reference to Elements, Next: Assigning Elements, Prev: ----------------------------------- The principal way to use an array is to refer to one of its elements. -An array reference is an expression as follows: +An "array reference" is an expression as follows: ARRAY[INDEX-EXPRESSION] @@ -11012,8 +11022,8 @@ Here, ARRAY is the name of an array. The expression INDEX-EXPRESSION is the index of the desired element of the array. The value of the array reference is the current value of that array -element. For example, `foo[4.3]' is an expression for the element of -array `foo' at index `4.3'. +element. For example, `foo[4.3]' is an expression referencing the +element of array `foo' at index `4.3'. A reference to an array element that has no recorded value yields a value of `""', the null string. 
This includes elements that have not @@ -11080,7 +11090,7 @@ File: gawk.info, Node: Array Example, Next: Scanning an Array, Prev: Assignin The following program takes a list of lines, each beginning with a line number, and prints them out in order of line number. The line numbers -are not in order when they are first read--instead they are scrambled. +are not in order when they are first read--instead, they are scrambled. This program sorts the lines by making an array using the line numbers as subscripts. The program then prints out the lines in sorted order of their numbers. It is a very simple program and gets confused upon @@ -11151,7 +11161,7 @@ has previously used, with the variable VAR set to that index. The following program uses this form of the `for' statement. The first rule scans the input records and notes which words appear (at least once) in the input, by storing a one into the array `used' with -the word as index. The second rule scans the elements of `used' to +the word as the index. The second rule scans the elements of `used' to find all the distinct words that appear in the input. It prints each word that is more than 10 characters long and also prints the number of such words. *Note String Functions::, for more information on the @@ -11234,7 +11244,7 @@ internal implementation of arrays and will vary from one version of Often, though, you may wish to do something simple, such as "traverse the array by comparing the indices in ascending order," or "traverse the array by comparing the values in descending order." -`gawk' provides two mechanisms which give you this control. +`gawk' provides two mechanisms that give you this control: * Set `PROCINFO["sorted_in"]' to one of a set of predefined values. We describe this now. @@ -11282,22 +11292,26 @@ available: which `gawk' uses internally to perform the sorting. `"@ind_str_desc"' - String indices ordered from high to low. + Like `"@ind_str_asc"', but the string indices are ordered from + high to low. 
`"@ind_num_desc"' - Numeric indices ordered from high to low. + Like `"@ind_num_asc"', but the numeric indices are ordered from + high to low. `"@val_type_desc"' - Element values, based on type, ordered from high to low. - Subarrays, if present, come out first. + Like `"@val_type_asc"', but the element values, based on type, are + ordered from high to low. Subarrays, if present, come out first. `"@val_str_desc"' - Element values, treated as strings, ordered from high to low. - Subarrays, if present, come out first. + Like `"@val_str_asc"', but the element values, treated as strings, + are ordered from high to low. Subarrays, if present, come out + first. `"@val_num_desc"' - Element values, treated as numbers, ordered from high to low. - Subarrays, if present, come out first. + Like `"@val_num_asc"', but the element values, treated as numbers, + are ordered from high to low. Subarrays, if present, come out + first. The array traversal order is determined before the `for' loop starts to run. Changing `PROCINFO["sorted_in"]' in the loop body does not @@ -11483,8 +11497,8 @@ deleting elements in an array: This example removes all the elements from the array `frequencies'. Once an element is deleted, a subsequent `for' statement to scan the -array does not report that element and the `in' operator to check for -the presence of that element returns zero (i.e., false): +array does not report that element and using the `in' operator to check +for the presence of that element returns zero (i.e., false): delete foo[4] if (4 in foo) @@ -11687,7 +11701,7 @@ two-element subarray at index `1' of the main array `a': This simulates a true two-dimensional array. Each subarray element can contain another subarray as a value, which in turn can hold other arrays as well. In this way, you can create arrays of three or more -dimensions. The indices can be any `awk' expression, including scalars +dimensions. 
The indices can be any `awk' expressions, including scalars separated by commas (i.e., a regular `awk' simulated multidimensional subscript). So the following is valid in `gawk': @@ -11696,7 +11710,7 @@ subscript). So the following is valid in `gawk': Each subarray and the main array can be of different length. In fact, the elements of an array or its subarray do not all have to have the same type. This means that the main array and any of its subarrays -can be non-rectangular, or jagged in structure. You can assign a scalar +can be nonrectangular, or jagged in structure. You can assign a scalar value to the index `4' of the main array `a', even though `a[1]' is itself an array and not a scalar: @@ -11714,8 +11728,8 @@ the element at that index: a[4][5][6][7] = "An element in a four-dimensional array" This removes the scalar value from index `4' and then inserts a -subarray of subarray of subarray containing a scalar. You can also -delete an entire subarray or subarray of subarrays: +three-level nested subarray containing a scalar. You can also delete an +entire subarray or subarray of subarrays: delete a[4][5] a[4][5] = "An element in subarray a[4]" @@ -11723,7 +11737,7 @@ delete an entire subarray or subarray of subarrays: But recall that you can not delete the main array `a' and then use it as a scalar. - The built-in functions which take array arguments can also be used + The built-in functions that take array arguments can also be used with subarrays. For example, the following code fragment uses `length()' (*note String Functions::) to determine the number of elements in the main array `a' and its subarrays: @@ -11744,7 +11758,7 @@ be nested to scan all the elements of an array of arrays if it is rectangular in structure. 
In order to print the contents (scalar values) of a two-dimensional array of arrays (i.e., in which each first-level element is itself an array, not necessarily of the same -length) you could use the following code: +length), you could use the following code: for (i in array) for (j in array[i]) @@ -11826,9 +11840,9 @@ File: gawk.info, Node: Arrays Summary, Prev: Arrays of Arrays, Up: Arrays of `awk'. * Standard `awk' simulates multidimensional arrays by separating - subscript values with a comma. The values are concatenated into a + subscript values with commas. The values are concatenated into a single string, separated by the value of `SUBSEP'. The fact that - such a subscript was created in this way is not retained; thus + such a subscript was created in this way is not retained; thus, changing `SUBSEP' may have unexpected consequences. You can use `(SUB1, SUB2, ...) in ARRAY' to see if such a multidimensional subscript exists in ARRAY. @@ -11836,7 +11850,7 @@ File: gawk.info, Node: Arrays Summary, Prev: Arrays of Arrays, Up: Arrays * `gawk' provides true arrays of arrays. You use a separate set of square brackets for each dimension in such an array: `data[row][col]', for example. Array elements may thus be either - scalar values (number or string) or another array. + scalar values (number or string) or other arrays. * Use the `isarray()' built-in function to determine if an array element is itself a subarray. @@ -11856,7 +11870,9 @@ internationalize and localize programs. Besides the built-in functions, `awk' has provisions for writing new functions that the rest of a program can use. The second half of this -major node describes these "user-defined" functions. +major node describes these "user-defined" functions. Finally, we +explore indirect function calls, a `gawk'-specific extension that lets +you determine at runtime what function is to be called. * Menu: @@ -11868,7 +11884,7 @@ major node describes these "user-defined" functions. 
File: gawk.info, Node: Built-in, Next: User-defined, Up: Functions -9.1 Built-In Functions +9.1 Built-in Functions ====================== "Built-in" functions are always available for your `awk' program to @@ -11893,7 +11909,7 @@ for your convenience. File: gawk.info, Node: Calling Built-in, Next: Numeric Functions, Up: Built-in -9.1.1 Calling Built-In Functions +9.1.1 Calling Built-in Functions -------------------------------- To call one of `awk''s built-in functions, write the name of the @@ -11930,9 +11946,10 @@ are evaluated from left to right or from right to left. For example: j = atan2(++i, i *= 2) If the order of evaluation is left to right, then `i' first becomes -6, and then 12, and `atan2()' is called with the two arguments 6 and -12. But if the order of evaluation is right to left, `i' first becomes -10, then 11, and `atan2()' is called with the two arguments 11 and 10. +six, and then 12, and `atan2()' is called with the two arguments six +and 12. But if the order of evaluation is right to left, `i' first +becomes 10, then 11, and `atan2()' is called with the two arguments 11 +and 10. File: gawk.info, Node: Numeric Functions, Next: String Functions, Prev: Calling Built-in, Up: Built-in @@ -11988,7 +12005,7 @@ brackets ([ ]): Often random integers are needed instead. Following is a user-defined function that can be used to obtain a random - non-negative integer less than N: + nonnegative integer less than N: function randint(n) { @@ -12078,7 +12095,7 @@ File: gawk.info, Node: String Functions, Next: I/O Functions, Prev: Numeric F The functions in this minor node look at or change the text of one or more strings. - `gawk' understands locales (*note Locales::), and does all string + `gawk' understands locales (*note Locales::) and does all string processing in terms of _characters_, not _bytes_. This distinction is particularly important to understand for locales where one character may be represented by multiple bytes. 
Thus, for example, `length()' @@ -12149,7 +12166,7 @@ Options::): a[2] = "de" a[3] = "sac" - The `asorti()' function works similarly to `asort()', however, the + The `asorti()' function works similarly to `asort()'; however, the _indices_ are sorted, instead of the values. Thus, in the previous example, starting with the same initial set of indices and values in `a', calling `asorti(a)' would yield: @@ -12237,7 +12254,7 @@ Options::): With BWK `awk' and `gawk', it is a fatal error to use a regexp constant for FIND. Other implementations allow it, simply treating the regexp constant as an expression meaning `$0 ~ - /regexp/'. (d.c.). + /regexp/'. (d.c.) `length('[STRING]`)' Return the number of characters in STRING. If STRING is a number, @@ -12281,9 +12298,9 @@ Options::): `match(STRING, REGEXP' [`, ARRAY']`)' Search STRING for the longest, leftmost substring matched by the - regular expression, REGEXP and return the character position - (index) at which that substring begins (one, if it starts at the - beginning of STRING). If no match is found, return zero. + regular expression REGEXP and return the character position (index) + at which that substring begins (one, if it starts at the beginning + of STRING). If no match is found, return zero. The REGEXP argument may be either a regexp constant (`/'...`/') or a string constant (`"'...`"'). In the latter case, the string is @@ -12291,7 +12308,7 @@ Options::): discussion of the difference between the two forms, and the implications for writing your program correctly. - The order of the first two arguments is backwards from most other + The order of the first two arguments is the opposite of most other string functions that work with regular expressions, such as `sub()' and `gsub()'. 
It might help to remember that for `match()', the order is the same as for the `~' operator: `STRING @@ -12358,8 +12375,8 @@ Options::): There may not be subscripts for the start and index for every parenthesized subexpression, because they may not all have matched - text; thus they should be tested for with the `in' operator (*note - Reference to Elements::). + text; thus, they should be tested for with the `in' operator + (*note Reference to Elements::). The ARRAY argument to `match()' is a `gawk' extension. In compatibility mode (*note Options::), using a third argument is a @@ -12392,19 +12409,19 @@ Options::): FIELDSEP, is a regexp describing where to split STRING (much as `FS' can be a regexp describing where to split input records). If FIELDSEP is omitted, the value of `FS' is used. `split()' returns - the number of elements created. SEPS is a `gawk' extension with + the number of elements created. SEPS is a `gawk' extension, with `SEPS[I]' being the separator string between `ARRAY[I]' and - `ARRAY[I+1]'. If FIELDSEP is a single space then any leading + `ARRAY[I+1]'. If FIELDSEP is a single space, then any leading whitespace goes into `SEPS[0]' and any trailing whitespace goes - into `SEPS[N]' where N is the return value of `split()' (i.e., the - number of elements in ARRAY). + into `SEPS[N]', where N is the return value of `split()' (i.e., + the number of elements in ARRAY). The `split()' function splits strings into pieces in a manner similar to the way input lines are split into fields. For example: split("cul-de-sac", a, "-", seps) - splits the string `cul-de-sac' into three fields using `-' as the + splits the string `"cul-de-sac"' into three fields using `-' as the separator. 
It sets the contents of the array `a' as follows: a[1] = "cul" @@ -12421,17 +12438,18 @@ Options::): As with input field-splitting, when the value of FIELDSEP is `" "', leading and trailing whitespace is ignored in values assigned to the elements of ARRAY but not in SEPS, and the elements - are separated by runs of whitespace. Also, as with input - field-splitting, if FIELDSEP is the null string, each individual + are separated by runs of whitespace. Also, as with input field + splitting, if FIELDSEP is the null string, each individual character in the string is split into its own array element. (c.e.) Note, however, that `RS' has no effect on the way `split()' works. - Even though `RS = ""' causes newline to also be an input field - separator, this does not affect how `split()' splits strings. + Even though `RS = ""' causes the newline character to also be an + input field separator, this does not affect how `split()' splits + strings. Modern implementations of `awk', including `gawk', allow the third - argument to be a regexp constant (`/abc/') as well as a string. + argument to be a regexp constant (`/'...`/') as well as a string. (d.c.) The POSIX standard allows this as well. *Note Computed Regexps::, for a discussion of the difference between using a string constant or a regexp constant, and the implications for @@ -12532,7 +12550,7 @@ Options::): { sub(/\|/, "\\&"); print } As mentioned, the third argument to `sub()' must be a variable, - field or array element. Some versions of `awk' allow the third + field, or array element. Some versions of `awk' allow the third argument to be an expression that is not an lvalue. In such a case, `sub()' still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away @@ -12657,11 +12675,11 @@ example, `"a\qb"' is treated as `"aqb"'. At the runtime level, the various functions handle sequences of `\' and `&' differently. The situation is (sadly) somewhat complex. 
-Historically, the `sub()' and `gsub()' functions treated the two -character sequence `\&' specially; this sequence was replaced in the -generated text with a single `&'. Any other `\' within the REPLACEMENT -string that did not precede an `&' was passed through unchanged. This -is illustrated in *note table-sub-escapes::. +Historically, the `sub()' and `gsub()' functions treated the +two-character sequence `\&' specially; this sequence was replaced in +the generated text with a single `&'. Any other `\' within the +REPLACEMENT string that did not precede an `&' was passed through +unchanged. This is illustrated in *note table-sub-escapes::. You type `sub()' sees `sub()' generates ------- --------- -------------- @@ -12676,10 +12694,10 @@ is illustrated in *note table-sub-escapes::. Table 9.1: Historical escape sequence processing for `sub()' and `gsub()' -This table shows both the lexical-level processing, where an odd number -of backslashes becomes an even number at the runtime level, as well as -the runtime processing done by `sub()'. (For the sake of simplicity, -the rest of the following tables only show the case of even numbers of +This table shows the lexical-level processing, where an odd number of +backslashes becomes an even number at the runtime level, as well as the +runtime processing done by `sub()'. (For the sake of simplicity, the +rest of the following tables only show the case of even numbers of backslashes entered at the lexical level.) The problem with the historical approach is that there is no way to @@ -12703,10 +12721,10 @@ This is shown in *note table-sub-proposed::. `\\q' `\q' A literal `\q' `\\\\' `\\' `\\' -Table 9.2: GNU `awk' rules for `sub()' and backslash +Table 9.2: `gawk' rules for `sub()' and backslash In a nutshell, at the runtime level, there are now three special -sequences of characters (`\\\&', `\\&' and `\&') whereas historically +sequences of characters (`\\\&', `\\&', and `\&') whereas historically there was only one. 
However, as in the historical case, any `\' that is not part of one of these three sequences is not special and appears in the output literally. @@ -12736,7 +12754,7 @@ Table 9.3: POSIX rules for `sub()' and `gsub()' `\\\\' is seen as `\\' and produces `\' instead of `\\'. Starting with version 3.1.4, `gawk' followed the POSIX rules when -`--posix' is specified (*note Options::). Otherwise, it continued to +`--posix' was specified (*note Options::). Otherwise, it continued to follow the proposed rules, as that had been its behavior for many years. When version 4.0.0 was released, the `gawk' maintainer made the @@ -12763,9 +12781,9 @@ the `\' does not, as shown in *note table-gensub-escapes::. Table 9.4: Escape sequence processing for `gensub()' - Because of the complexity of the lexical and runtime level processing -and the special cases for `sub()' and `gsub()', we recommend the use of -`gawk' and `gensub()' when you have to do substitutions. + Because of the complexity of the lexical- and runtime-level +processing and the special cases for `sub()' and `gsub()', we recommend +the use of `gawk' and `gensub()' when you have to do substitutions. ---------- Footnotes ---------- @@ -12792,10 +12810,10 @@ parameters are enclosed in square brackets ([ ]): When closing a coprocess, it is occasionally useful to first close one end of the two-way pipe and then to close the other. This is done by providing a second argument to `close()'. This second - argument should be one of the two string values `"to"' or `"from"', - indicating which end of the pipe to close. Case in the string does - not matter. *Note Two-way I/O::, which discusses this feature in - more detail and gives an example. + argument (HOW) should be one of the two string values `"to"' or + `"from"', indicating which end of the pipe to close. Case in the + string does not matter. *Note Two-way I/O::, which discusses this + feature in more detail and gives an example. 
      Note that the second argument to `close()' is a `gawk' extension;
      it is not available in compatibility mode (*note Options::).
@@ -12813,7 +12831,7 @@ parameters are enclosed in square brackets ([ ]):
      sometimes it is necessary to force a program to "flush" its
      buffers (i.e., write the information to its destination, even if
      a buffer is not full). This is the purpose of the `fflush()'
-     function--`gawk' also buffers its output and the `fflush()'
+     function--`gawk' also buffers its output, and the `fflush()'
      function forces `gawk' to flush its buffers.

      Brian Kernighan added `fflush()' to his `awk' in April 1992. For
@@ -12830,16 +12848,17 @@ parameters are enclosed in square brackets ([ ]):
      output files and pipes if the argument was the null string.
      This was changed in order to be compatible with Brian
      Kernighan's `awk', in the hope that standardizing this
-     feature in POSIX would then be easier (which indeed helped).
+     feature in POSIX would then be easier (which indeed proved to
+     be the case).

      With `gawk', you can use `fflush("/dev/stdout")' if you wish to
      flush only the standard output.

      `fflush()' returns zero if the buffer is successfully flushed;
-     otherwise, it returns non-zero. (`gawk' returns -1.) In the case
-     where all buffers are flushed, the return value is zero only if
-     all buffers were flushed successfully. Otherwise, it is -1, and
-     `gawk' warns about the problem FILENAME.
+     otherwise, it returns a nonzero value. (`gawk' returns -1.) In
+     the case where all buffers are flushed, the return value is zero
+     only if all buffers were flushed successfully. Otherwise, it is
+     -1, and `gawk' warns about the problem FILENAME.
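The flushing behavior described above can be illustrated with a small shell sketch. It is not part of the manual: the function name `flush_demo' is made up here, and the sketch assumes an `awk' whose `fflush()' may be called with no argument (as current POSIX `awk' implementations allow). When standard output is a pipe, the first line would normally stay in the buffer, so it is flushed explicitly before a child command writes to the same destination.

```shell
# flush_demo is a hypothetical name used only for this sketch.
flush_demo() {
    awk 'BEGIN {
        print "first"          # stays in the output buffer when stdout is a pipe
        fflush()               # push it out before the child process writes
        system("echo second")  # the child writes directly to the same stdout
        print "third"          # written out when awk exits
    }'
}

flush_demo
```

With the explicit flush in place, the three lines should arrive in the order `first', `second', `third', whether the output goes to a terminal or to a pipe.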
`gawk' also issues a warning message if you attempt to flush a file or pipe that was opened for reading (such as with `getline'), @@ -12848,9 +12867,9 @@ parameters are enclosed in square brackets ([ ]): Interactive Versus Noninteractive Buffering - As a side point, buffering issues can be even more confusing, - depending upon whether your program is "interactive" (i.e., - communicating with a user sitting at a keyboard).(1) + As a side point, buffering issues can be even more confusing if + your program is "interactive" (i.e., communicating with a user + sitting at a keyboard).(1) Interactive programs generally "line buffer" their output (i.e., they write out every line). Noninteractive programs wait until @@ -12879,7 +12898,7 @@ parameters are enclosed in square brackets ([ ]): shot. `system(COMMAND)' - Execute the operating-system command COMMAND and then return to + Execute the operating system command COMMAND and then return to the `awk' program. Return COMMAND's exit status. For example, if the following fragment of code is put in your `awk' @@ -12968,14 +12987,14 @@ File: gawk.info, Node: Time Functions, Next: Bitwise Functions, Prev: I/O Fun `awk' programs are commonly used to process log files containing timestamp information, indicating when a particular log record was -written. Many programs log their timestamp in the form returned by the -`time()' system call, which is the number of seconds since a particular -epoch. On POSIX-compliant systems, it is the number of seconds since -1970-01-01 00:00:00 UTC, not counting leap seconds.(1) All known -POSIX-compliant systems support timestamps from 0 through 2^31 - 1, -which is sufficient to represent times through 2038-01-19 03:14:07 UTC. -Many systems support a wider range of timestamps, including negative -timestamps that represent times before the epoch. +written. Many programs log their timestamps in the form returned by +the `time()' system call, which is the number of seconds since a +particular epoch. 
On POSIX-compliant systems, it is the number of +seconds since 1970-01-01 00:00:00 UTC, not counting leap seconds.(1) +All known POSIX-compliant systems support timestamps from 0 through +2^31 - 1, which is sufficient to represent times through 2038-01-19 +03:14:07 UTC. Many systems support a wider range of timestamps, +including negative timestamps that represent times before the epoch. In order to make it easier to process such log files and to produce useful reports, `gawk' provides the following functions for working @@ -12998,9 +13017,9 @@ enclosed in square brackets ([ ]): specified; for example, an hour of -1 means 1 hour before midnight. The origin-zero Gregorian calendar is assumed, with year 0 preceding year 1 and year -1 preceding year 0. The time is - assumed to be in the local timezone. If the daylight-savings flag - is positive, the time is assumed to be daylight savings time; if - zero, the time is assumed to be standard time; and if negative + assumed to be in the local time zone. If the daylight-savings + flag is positive, the time is assumed to be daylight savings time; + if zero, the time is assumed to be standard time; and if negative (the default), `mktime()' attempts to determine whether daylight savings time is in effect for the specified time. @@ -13141,23 +13160,23 @@ the following date format specifications: The weekday as a decimal number (1-7). Monday is day one. `%U' - The week number of the year (the first Sunday as the first day of - week one) as a decimal number (00-53). + The week number of the year (with the first Sunday as the first + day of week one) as a decimal number (00-53). `%V' - The week number of the year (the first Monday as the first day of - week one) as a decimal number (01-53). The method for determining - the week number is as specified by ISO 8601. 
(To wit: if the week - containing January 1 has four or more days in the new year, then - it is week one; otherwise it is week 53 of the previous year and - the next week is week one.) + The week number of the year (with the first Monday as the first + day of week one) as a decimal number (01-53). The method for + determining the week number is as specified by ISO 8601. (To wit: + if the week containing January 1 has four or more days in the new + year, then it is week one; otherwise it is week 53 of the previous + year and the next week is week one.) `%w' The weekday as a decimal number (0-6). Sunday is day zero. `%W' - The week number of the year (the first Monday as the first day of - week one) as a decimal number (00-53). + The week number of the year (with the first Monday as the first + day of week one) as a decimal number (00-53). `%x' The locale's "appropriate" date representation. (This is `%A %B @@ -13174,8 +13193,8 @@ the following date format specifications: The full year as a decimal number (e.g., 2015). `%z' - The timezone offset in a +HHMM format (e.g., the format necessary - to produce RFC 822/RFC 1036 date headers). + The time zone offset in a `+HHMM' format (e.g., the format + necessary to produce RFC 822/RFC 1036 date headers). `%Z' The time zone name or abbreviation; no characters if no time zone @@ -13292,7 +13311,7 @@ each successive pair of bits in the operands. Three common operations are bitwise AND, OR, and XOR. The operations are described in *note table-bitwise-ops::. - Bit Operator + Bit operator | AND | OR | XOR |--+--+--+--+--+-- Operands | 0 | 1 | 0 | 1 | 0 | 1 @@ -13348,7 +13367,7 @@ paragraph, don't worry about it.) 
 Here is a user-defined function (*note User-defined::) that
 illustrates the use of these functions:

-     # bits2str --- turn a byte into readable 1's and 0's
+     # bits2str --- turn a byte into readable ones and zeros

      function bits2str(bits, data, mask)
      {
@@ -13387,13 +13406,14 @@ This program produces the following output when run:
      -| lshift(0x99, 2) = 0x264 = 0000001001100100
      -| rshift(0x99, 2) = 0x26 = 00100110

-   The `bits2str()' function turns a binary number into a string. The
-number `1' represents a binary value where the rightmost bit is set to
-1. Using this mask, the function repeatedly checks the rightmost bit.
-ANDing the mask with the value indicates whether the rightmost bit is 1
-or not. If so, a `"1"' is concatenated onto the front of the string.
-Otherwise, a `"0"' is added. The value is then shifted right by one
-bit and the loop continues until there are no more 1 bits.
+   The `bits2str()' function turns a binary number into a string.
+Initializing `mask' to one creates a binary value where the rightmost
+bit is set to one. Using this mask, the function repeatedly checks the
+rightmost bit. ANDing the mask with the value indicates whether the
+rightmost bit is one or not. If so, a `"1"' is concatenated onto the
+front of the string. Otherwise, a `"0"' is added. The value is then
+shifted right by one bit and the loop continues until there are no more
+one bits.

    If the initial value is zero, it returns a simple `"0"'.
 Otherwise, at the end, it pads the value with zeros to represent multiples of
@@ -13406,9 +13426,9 @@ Nondecimal-numbers::), and then demonstrates the results of the

    ---------- Footnotes ----------

-   (1) This example shows that 0's come in on the left side. For
+   (1) This example shows that zeros come in on the left side. For
 `gawk', this is always true, but in some languages, it's possible to
-have the left side fill with 1's.
+have the left side fill with ones.
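The loop described above can also be sketched without the `gawk'-only `and()' and `rshift()' functions. The following is a hypothetical variant, not the manual's `bits2str()': it swaps in `bits % 2' for the AND with the mask and `int(bits / 2)' for the right shift, so that it should run under any POSIX `awk'. The shell wrapper name `bits2str_portable' is likewise an assumption made for illustration, and the sketch expects a nonnegative integer argument.

```shell
# bits2str_portable is a hypothetical wrapper name used only in this sketch.
bits2str_portable() {
    awk -v n="$1" '
    # Variant of bits2str() that avoids the gawk-only and() and rshift():
    # (bits % 2) stands in for and(bits, 1), int(bits / 2) for rshift(bits, 1).
    function bits2str(bits,    data)
    {
        if (bits == 0)
            return "0"

        # Peel off the rightmost bit and prepend it until no one bits remain.
        for (; bits != 0; bits = int(bits / 2))
            data = ((bits % 2) ? "1" : "0") data

        # Pad on the left with zeros to a multiple of eight digits.
        while ((length(data) % 8) != 0)
            data = "0" data

        return data
    }

    BEGIN { print bits2str(n + 0) }'
}

bits2str_portable 153    # 153 is 0x99
bits2str_portable 612    # 612 is 0x264, that is, 0x99 shifted left by two
```

These two calls should print `10011001' and `0000001001100100'; the second agrees with the `lshift(0x99, 2)' line in the output shown above.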
File: gawk.info, Node: Type Functions, Next: I18N Functions, Prev: Bitwise Functions, Up: Built-in @@ -13422,7 +13442,7 @@ traverses every element of an array of arrays (*note Arrays of Arrays::). `isarray(X)' - Return a true value if X is an array. Otherwise return false. + Return a true value if X is an array. Otherwise, return false. `isarray()' is meant for use in two circumstances. The first is when traversing a multidimensional array: you can test if an element is @@ -13469,8 +13489,8 @@ brackets ([ ]): Return the plural form used for NUMBER of the translation of STRING1 and STRING2 in text domain DOMAIN for locale category CATEGORY. STRING1 is the English singular variant of a message, - and STRING2 the English plural variant of the same message. The - default value for DOMAIN is the current value of `TEXTDOMAIN'. + and STRING2 is the English plural variant of the same message. + The default value for DOMAIN is the current value of `TEXTDOMAIN'. The default value for CATEGORY is `"LC_MESSAGES"'. @@ -13499,7 +13519,7 @@ File: gawk.info, Node: Definition Syntax, Next: Function Example, Up: User-de 9.2.1 Function Definition Syntax -------------------------------- - It's entirely fair to say that the `awk' syntax for local variable + It's entirely fair to say that the awk syntax for local variable definitions is appallingly awful. -- Brian Kernighan Definitions of functions can appear anywhere between the rules of an @@ -13529,17 +13549,22 @@ the argument names are used to hold the argument values given in the call. A function cannot have two parameters with the same name, nor may it -have a parameter with the same name as the function itself. In -addition, according to the POSIX standard, function parameters cannot -have the same name as one of the special predefined variables (*note -Built-in Variables::). Not all versions of `awk' enforce this -restriction. +have a parameter with the same name as the function itself. 
+ + CAUTION: According to the POSIX standard, function parameters + cannot have the same name as one of the special predefined + variables (*note Built-in Variables::), nor may a function + parameter have the same name as another function. + + Not all versions of `awk' enforce these restrictions. `gawk' + always enforces the first restriction. With `--posix' (*note + Options::), it also enforces the second restriction. Local variables act like the empty string if referenced where a string value is required, and like zero if referenced where a numeric -value is required. This is the same as regular variables that have -never been assigned a value. (There is more to understand about local -variables; *note Dynamic Typing::.) +value is required. This is the same as the behavior of regular +variables that have never been assigned a value. (There is more to +understand about local variables; *note Dynamic Typing::.) The BODY-OF-FUNCTION consists of `awk' statements. It is the most important part of the definition, because it says what the function @@ -13568,9 +13593,9 @@ function is supposed to be used. variable values hide, or "shadow", any variables of the same names used in the rest of the program. The shadowed variables are not accessible in the function definition, because there is no way to name them while -their names have been taken away for the local variables. All other -variables used in the `awk' program can be referenced or set normally -in the function's body. +their names have been taken away for the arguments and local variables. +All other variables used in the `awk' program can be referenced or set +normally in the function's body. The arguments and local variables last only as long as the function body is executing. 
Once the body finishes, you can once again access @@ -13623,7 +13648,7 @@ takes a number and prints it in a specific format: printf "%6.3g\n", num } -To illustrate, here is an `awk' rule that uses our `myprint' function: +To illustrate, here is an `awk' rule that uses our `myprint()' function: $3 > 0 { myprint($3) } @@ -13652,13 +13677,13 @@ extra whitespace signifies the start of the local variable list): When working with arrays, it is often necessary to delete all the elements in an array and start over with a new list of elements (*note Delete::). Instead of having to repeat this loop everywhere that you -need to clear out an array, your program can just call `delarray'. +need to clear out an array, your program can just call `delarray()'. (This guarantees portability. The use of `delete ARRAY' to delete the contents of an entire array is a relatively recent(1) addition to the POSIX standard.) The following is an example of a recursive function. It takes a -string as an input parameter and returns the string in backwards order. +string as an input parameter and returns the string in reverse order. Recursive functions must always have a test that stops the recursion. In this case, the recursion terminates when the input string is already empty: @@ -13749,14 +13774,14 @@ File: gawk.info, Node: Variable Scope, Next: Pass By Value/Reference, Prev: C 9.2.3.2 Controlling Variable Scope .................................. -Unlike many languages, there is no way to make a variable local to a +Unlike in many languages, there is no way to make a variable local to a `{' ... `}' block in `awk', but you can make a variable local to a function. It is good practice to do so whenever a variable is needed only in that function. To make a variable local to a function, simply declare the variable as an argument after the actual function arguments (*note Definition -Syntax::). Look at the following example where variable `i' is a +Syntax::). 
Look at the following example, where variable `i' is a
 global variable used by both functions `foo()' and `bar()':

      function bar()
@@ -13792,7 +13817,7 @@ variable instance:
      foo's i=3
      top's i=3

-   If you want `i' to be local to both `foo()' and `bar()' do as
+   If you want `i' to be local to both `foo()' and `bar()', do as
 follows (the extra space before `i' is a coding convention to indicate
 that `i' is a local variable, not an argument):

@@ -13874,7 +13899,7 @@ explicitly whether the arguments are passed "by value" or "by
 reference".

    Instead, the passing convention is determined at runtime when the
-function is called according to the following rule: if the argument is
+function is called, according to the following rule: if the argument is
 an array variable, then it is passed by reference. Otherwise, the
 argument is passed by value.

@@ -13932,7 +13957,7 @@ function _are_ visible outside that function.
 stores `"two"' in the second element of `a'.

    Some `awk' implementations allow you to call a function that has not
-been defined. They only report a problem at runtime when the program
+been defined. They only report a problem at runtime, when the program
 actually tries to call the function. For example:

      BEGIN {
@@ -13977,15 +14002,15 @@ undefined, and therefore, unpredictable. In practice, though, all
 versions of `awk' simply return the null string, which acts like zero
 if used in a numeric context.

-   A `return' statement with no value expression is assumed at the end
-of every function definition. So if control reaches the end of the
-function body, then technically, the function returns an unpredictable
+   A `return' statement without an EXPRESSION is assumed at the end of
+every function definition. So, if control reaches the end of the
+function body, then technically the function returns an unpredictable
 value. In practice, it returns the empty string. `awk' does _not_
 warn you if you use the return value of such a function.
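The behavior described above, where a function that falls off its end returns a value that is empty as a string and zero as a number, can be seen in a small sketch. The names `no_return_demo' and `bump()' are made up for this illustration, and the result reflects what `awk' implementations do in practice rather than anything the language guarantees.

```shell
# no_return_demo is a hypothetical name used only for this sketch.
no_return_demo() {
    awk '
    # bump() changes its local copy of x and has no return statement.
    function bump(x)
    {
        x = x + 1
    }

    BEGIN {
        v = bump(41)
        # In practice v is the empty string, which acts like zero in arithmetic.
        print "[" v "]", v + 0
    }'
}

no_return_demo
```

Under the implementations the text describes, this should print `[] 0', and no warning is issued for using the return value.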
Sometimes, you want to write a function for what it does, not for what it returns. Such a function corresponds to a `void' function in -C, C++ or Java, or to a `procedure' in Ada. Thus, it may be +C, C++, or Java, or to a `procedure' in Ada. Thus, it may be appropriate to not return any value; simply bear in mind that you should not be using the return value of such a function. @@ -14091,13 +14116,13 @@ you can specify the name of the function to call as a string variable, and then call the function. Let's look at an example. Suppose you have a file with your test scores for the classes you -are taking. The first field is the class name. The following fields -are the functions to call to process the data, up to a "marker" field +are taking, and you wish to get the sum and the average of your test +scores. The first field is the class name. The following fields are +the functions to call to process the data, up to a "marker" field `data:'. Following the marker, to the end of the record, are the various numeric test scores. - Here is the initial file; you wish to get the sum and the average of -your test scores: + Here is the initial file: Biology_101 sum average data: 87.0 92.4 78.5 94.9 Chemistry_305 sum average data: 75.2 98.3 94.7 88.2 @@ -14155,9 +14180,9 @@ using indirect function calls: return ret } - These two functions expect to work on fields; thus the parameters + These two functions expect to work on fields; thus, the parameters `first' and `last' indicate where in the fields to start and end. -Otherwise they perform the expected computations and are not unusual: +Otherwise, they perform the expected computations and are not unusual: # For each record, print the class name and the requested statistics { @@ -14210,18 +14235,19 @@ to force it to be a string value.) may think at first. The C and C++ languages provide "function pointers," which are a mechanism for calling a function chosen at runtime. 
One of the most well-known uses of this ability is the C -`qsort()' function, which sorts an array using the famous "quick sort" +`qsort()' function, which sorts an array using the famous "quicksort" algorithm (see the Wikipedia article -(http://en.wikipedia.org/wiki/Quick_sort) for more information). To -use this function, you supply a pointer to a comparison function. This +(http://en.wikipedia.org/wiki/Quicksort) for more information). To use +this function, you supply a pointer to a comparison function. This mechanism allows you to sort arbitrary data in an arbitrary fashion. We can do something similar using `gawk', like this: # quicksort.awk --- Quicksort algorithm, with user-supplied # comparison function - # quicksort --- C.A.R. Hoare's quick sort algorithm. See Wikipedia - # or almost any algorithms or computer science text + + # quicksort --- C.A.R. Hoare's quicksort algorithm. See Wikipedia + # or almost any algorithms or computer science text. function quicksort(data, left, right, less_than, i, last) { @@ -14250,7 +14276,7 @@ mechanism allows you to sort arbitrary data in an arbitrary fashion. The `quicksort()' function receives the `data' array, the starting and ending indices to sort (`left' and `right'), and the name of a function that performs a "less than" comparison. It then implements -the quick sort algorithm. +the quicksort algorithm. To make use of the sorting function, we return to our previous example. The first thing to do is write some comparison functions: @@ -14490,7 +14516,7 @@ File: gawk.info, Node: Library Functions, Next: Sample Programs, Prev: Functi *note User-defined::, describes how to write your own `awk' functions. Writing functions is important, because it allows you to encapsulate algorithms and program tasks in a single place. It simplifies -programming, making program development more manageable, and making +programming, making program development more manageable and making programs more readable. 
In their seminal 1976 book, `Software Tools',(1) Brian Kernighan and @@ -14595,7 +14621,7 @@ often use variable names like these for their own purposes. The example programs shown in this major node all start the names of their private variables with an underscore (`_'). Users generally don't use leading underscores in their variable names, so this -convention immediately decreases the chances that the variable name +convention immediately decreases the chances that the variable names will be accidentally shared with the user's program. In addition, several of the library functions use a prefix that helps @@ -14608,7 +14634,7 @@ for private function names.(1) As a final note on variable naming, if a function makes global variables available for use by a main program, it is a good convention -to start that variable's name with a capital letter--for example, +to start those variables' names with a capital letter--for example, `getopt()''s `Opterr' and `Optind' variables (*note Getopt Function::). The leading capital letter indicates that it is global, while the fact that the variable name is not all capital letters indicates that the @@ -14616,7 +14642,7 @@ variable is not one of `awk''s predefined variables, such as `FS'. It is also important that _all_ variables in library functions that do not need to save state are, in fact, declared local.(2) If this is -not done, the variable could accidentally be used in the user's +not done, the variables could accidentally be used in the user's program, leading to bugs that are very difficult to track down: function lib_func(x, y, l1, l2) @@ -14794,7 +14820,7 @@ for use in printing the diagnostic message. This is not possible in `awk', so this `assert()' function also requires a string version of the condition that is being tested. Following is the function: - # assert --- assert that a condition is true. Otherwise exit. + # assert --- assert that a condition is true. Otherwise, exit. 
function assert(condition, string) { @@ -14815,7 +14841,7 @@ the condition that is being tested. Following is the function: false, it prints a message to standard error, using the `string' parameter to describe the failed condition. It then sets the variable `_assert_exit' to one and executes the `exit' statement. The `exit' -statement jumps to the `END' rule. If the `END' rules finds +statement jumps to the `END' rule. If the `END' rule finds `_assert_exit' to be true, it exits immediately. The purpose of the test in the `END' rule is to keep any other `END' @@ -15030,9 +15056,9 @@ the strings in an array into one long string. The following function, `join()', accomplishes this task. It is used later in several of the application programs (*note Sample Programs::). - Good function design is important; this function needs to be general -but it should also have a reasonable default behavior. It is called -with an array as well as the beginning and ending indices of the + Good function design is important; this function needs to be +general, but it should also have a reasonable default behavior. It is +called with an array as well as the beginning and ending indices of the elements in the array to be merged. This assumes that the array indices are numeric--a reasonable assumption, as the array was likely created with `split()' (*note String Functions::): @@ -15151,7 +15177,7 @@ optional timestamp value to use instead of the current time. File: gawk.info, Node: Readfile Function, Next: Shell Quoting, Prev: Getlocaltime Function, Up: General Functions -10.2.8 Reading a Whole File At Once +10.2.8 Reading a Whole File at Once ----------------------------------- Often, it is convenient to have the entire contents of a file available @@ -15193,13 +15219,13 @@ reads the entire contents of the named file in one shot: It works by setting `RS' to `^$', a regular expression that will never match if the file has contents. 
`gawk' reads data from the file -into `tmp' attempting to match `RS'. The match fails after each read, +into `tmp', attempting to match `RS'. The match fails after each read, but fails quickly, such that `gawk' fills `tmp' with the entire contents of the file. (*Note Records::, for information on `RT' and `RS'.) In the case that `file' is empty, the return value is the null -string. Thus calling code may use something like: +string. Thus, calling code may use something like: contents = readfile("/some/path") if (length(contents) == 0) @@ -15289,8 +15315,9 @@ File: gawk.info, Node: Filetrans Function, Next: Rewind Function, Up: Data Fi The `BEGIN' and `END' rules are each executed exactly once, at the beginning and end of your `awk' program, respectively (*note BEGIN/END::). We (the `gawk' authors) once had a user who mistakenly -thought that the `BEGIN' rule is executed at the beginning of each data -file and the `END' rule is executed at the end of each data file. +thought that the `BEGIN' rules were executed at the beginning of each +data file and the `END' rules were executed at the end of each data +file. When informed that this was not the case, the user requested that we add new special patterns to `gawk', named `BEGIN_FILE' and `END_FILE', @@ -15324,7 +15351,7 @@ does so _portably_; this works with any implementation of `awk': This file must be loaded before the user's "main" program, so that the rule it supplies is executed first. - This rule relies on `awk''s `FILENAME' variable that automatically + This rule relies on `awk''s `FILENAME' variable, which automatically changes for each new data file. The current file name is saved in a private variable, `_oldfilename'. If `FILENAME' does not equal `_oldfilename', then a new data file is being processed and it is @@ -15339,7 +15366,7 @@ correctly even for the first data file. The program also supplies an `END' rule to do the final processing for the last file. 
Because this `END' rule comes before any `END' rules supplied in the "main" program, `endfile()' is called first. Once -again the value of multiple `BEGIN' and `END' rules should be clear. +again, the value of multiple `BEGIN' and `END' rules should be clear. If the same data file occurs twice in a row on the command line, then `endfile()' and `beginfile()' are not executed at the end of the first @@ -15366,7 +15393,7 @@ how it simplifies writing the main program. You are probably wondering, if `beginfile()' and `endfile()' functions can do the job, why does `gawk' have `BEGINFILE' and -`ENDFILE' patterns (*note BEGINFILE/ENDFILE::)? +`ENDFILE' patterns? Good question. Normally, if `awk' cannot open a file, this causes an immediate fatal error. In this case, there is no way for a @@ -15374,7 +15401,8 @@ user-defined function to deal with the problem, as the mechanism for calling it relies on the file being open and at the first record. Thus, the main reason for `BEGINFILE' is to give you a "hook" to catch files that cannot be processed. `ENDFILE' exists for symmetry, and because -it provides an easy way to do per-file cleanup processing. +it provides an easy way to do per-file cleanup processing. For more +information, refer to *note BEGINFILE/ENDFILE::. File: gawk.info, Node: Rewind Function, Next: File Checking, Prev: Filetrans Function, Up: Data File Management @@ -15382,15 +15410,14 @@ File: gawk.info, Node: Rewind Function, Next: File Checking, Prev: Filetrans 10.3.2 Rereading the Current File --------------------------------- -Another request for a new built-in function was for a `rewind()' -function that would make it possible to reread the current file. The -requesting user didn't want to have to use `getline' (*note Getline::) -inside a loop. +Another request for a new built-in function was for a function that +would make it possible to reread the current file. The requesting user +didn't want to have to use `getline' (*note Getline::) inside a loop. 
However, as long as you are not in the `END' rule, it is quite easy to arrange to immediately close the current input file and then start -over with it from the top. For lack of a better name, we'll call it -`rewind()': +over with it from the top. For lack of a better name, we'll call the +function `rewind()': # rewind.awk --- rewind the current file and start over @@ -15448,7 +15475,7 @@ longer in the list). See also *note ARGC and ARGV::. Because `awk' variable names only allow the English letters, the regular expression check purposely does not use character classes such -as `[:alpha:]' and `[:alnum:]' (*note Bracket Expressions::) +as `[:alpha:]' and `[:alnum:]' (*note Bracket Expressions::). ---------- Footnotes ---------- @@ -15459,14 +15486,14 @@ opened. However, the code here provides a portable solution. File: gawk.info, Node: Empty Files, Next: Ignoring Assigns, Prev: File Checking, Up: Data File Management -10.3.4 Checking for Zero-length Files +10.3.4 Checking for Zero-Length Files ------------------------------------- All known `awk' implementations silently skip over zero-length files. This is a by-product of `awk''s implicit read-a-record-and-match-against-the-rules loop: when `awk' tries to -read a record from an empty file, it immediately receives an end of -file indication, closes the file, and proceeds on to the next +read a record from an empty file, it immediately receives an +end-of-file indication, closes the file, and proceeds on to the next command-line data file, _without_ executing any user-level `awk' program code. @@ -15516,7 +15543,7 @@ File: gawk.info, Node: Ignoring Assigns, Prev: Empty Files, Up: Data File Man Occasionally, you might not want `awk' to process command-line variable assignments (*note Assignment Options::). In particular, if you have a file name that contains an `=' character, `awk' treats the file name as -an assignment, and does not process it. +an assignment and does not process it. 
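To see the problem concretely, the following shell sketch creates a file literally named `x=1' and hands it to `awk' twice. The function name `demo_assign_operand' and the file name are made up for illustration. The `./' prefix in the second call is a common workaround that relies on the POSIX rule that an assignment operand must begin with a letter or underscore; it is a different technique from the library-file approach that this section describes:

```shell
# demo_assign_operand: show how awk treats an operand that contains "=".
demo_assign_operand() {
    dir=$(mktemp -d) || return 1
    printf 'data\n' > "$dir/x=1"
    (
        cd "$dir" || exit 1
        # "x=1" is taken as the assignment x = 1. No file operand is left,
        # so awk reads standard input, which is empty here.
        printf 'plain: [%s]\n' "$(awk '{ print }' x=1 < /dev/null)"
        # "./x=1" does not start with a letter or underscore, so it is
        # treated as a file name and its contents are printed.
        printf 'dot-slash: [%s]\n' "$(awk '{ print }' ./x=1)"
    )
    rm -rf "$dir"
}
```

The first call prints nothing between the brackets, because the file is never opened; the second prints the file's single line.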
Some users have suggested an additional command-line option for `gawk' to disable command-line assignments. However, some simple @@ -15806,8 +15833,8 @@ which is in `ARGV[0]': } } - The rest of the `BEGIN' rule is a simple test program. Here is the -result of two sample runs of the test program: + The rest of the `BEGIN' rule is a simple test program. Here are the +results of two sample runs of the test program: $ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x -| c = <a>, Optarg = <> @@ -15853,10 +15880,10 @@ File: gawk.info, Node: Passwd Functions, Next: Group Functions, Prev: Getopt ============================== The `PROCINFO' array (*note Built-in Variables::) provides access to -the current user's real and effective user and group ID numbers, and if -available, the user's supplementary group set. However, because these -are numbers, they do not provide very useful information to the average -user. There needs to be some way to find the user information +the current user's real and effective user and group ID numbers, and, +if available, the user's supplementary group set. However, because +these are numbers, they do not provide very useful information to the +average user. There needs to be some way to find the user information associated with the user and group ID numbers. This minor node presents a suite of functions for retrieving information from the user database. *Note Group Functions::, for a similar suite that retrieves @@ -15867,7 +15894,7 @@ kept. Instead, it provides the `<pwd.h>' header file and several C language subroutines for obtaining user information. The primary function is `getpwent()', for "get password entry." The "password" comes from the original user database file, `/etc/passwd', which stores -user information, along with the encrypted passwords (hence the name). +user information along with the encrypted passwords (hence the name). 
Although an `awk' program could simply read `/etc/passwd' directly, this file may not contain complete information about the system's set @@ -15915,7 +15942,7 @@ Encrypted password User-ID The user's numeric user ID number. (On some systems, it's a C - `long', and not an `int'. Thus we cast it to `long' for all + `long', and not an `int'. Thus, we cast it to `long' for all cases.) Group-ID @@ -16014,8 +16041,8 @@ or on some other `awk' implementation. `PROCINFO["FS"]', is similar. The main part of the function uses a loop to read database lines, -split the line into fields, and then store the line into each array as -necessary. When the loop is done, `_pw_init()' cleans up by closing +split the lines into fields, and then store the lines into each array +as necessary. When the loop is done, `_pw_init()' cleans up by closing the pipeline, setting `_pw_inited' to one, and restoring `FS' (and `FIELDWIDTHS' or `FPAT' if necessary), `RS', and `$0'. The use of `_pw_count' is explained shortly. @@ -16143,7 +16170,7 @@ Group Password Group ID Number The group's numeric group ID number; the association of name to number must be unique within the file. (On some systems it's a C - `long', and not an `int'. Thus we cast it to `long' for all + `long', and not an `int'. Thus, we cast it to `long' for all cases.) Group Member List @@ -16233,29 +16260,30 @@ to ensure that the database is scanned no more than once. The `_gr_init()' function first saves `FS', `RS', and `$0', and then sets `FS' and `RS' to the correct values for scanning the group information. It also takes care to note whether `FIELDWIDTHS' or `FPAT' is being -used, and to restore the appropriate field splitting mechanism. +used, and to restore the appropriate field-splitting mechanism. - The group information is stored is several associative arrays. The + The group information is stored in several associative arrays. 
The arrays are indexed by group name (`_gr_byname'), by group ID number (`_gr_bygid'), and by position in the database (`_gr_bycount'). There is an additional array indexed by username (`_gr_groupsbyuser'), which is a space-separated list of groups to which each user belongs. - Unlike the user database, it is possible to have multiple records in -the database for the same group. This is common when a group has a + Unlike in the user database, it is possible to have multiple records +in the database for the same group. This is common when a group has a large number of members. A pair of such entries might look like the following: - tvpeople:*:101:johny,jay,arsenio + tvpeople:*:101:johnny,jay,arsenio tvpeople:*:101:david,conan,tom,joan For this reason, `_gr_init()' looks to see if a group name or group -ID number is already seen. If it is, the usernames are simply +ID number is already seen. If so, the usernames are simply concatenated onto the previous list of users.(1) Finally, `_gr_init()' closes the pipeline to `grcat', restores `FS' -(and `FIELDWIDTHS' or `FPAT' if necessary), `RS', and `$0', initializes -`_gr_count' to zero (it is used later), and makes `_gr_inited' nonzero. +(and `FIELDWIDTHS' or `FPAT', if necessary), `RS', and `$0', +initializes `_gr_count' to zero (it is used later), and makes +`_gr_inited' nonzero. The `getgrnam()' function takes a group name as its argument, and if that group exists, it is returned. Otherwise, it relies on the array @@ -16318,9 +16346,9 @@ very simple, relying on `awk''s associative arrays to do work. ---------- Footnotes ---------- - (1) There is actually a subtle problem with the code just presented. -Suppose that the first time there were no names. This code adds the -names with a leading comma. It also doesn't check that there is a `$4'. + (1) There is a subtle problem with the code just presented. Suppose +that the first time there were no names. This code adds the names with +a leading comma. 
It also doesn't check that there is a `$4'. File: gawk.info, Node: Walking Arrays, Next: Library Functions Summary, Prev: Group Functions, Up: Library Functions @@ -16329,11 +16357,11 @@ File: gawk.info, Node: Walking Arrays, Next: Library Functions Summary, Prev: ================================ *note Arrays of Arrays::, described how `gawk' provides arrays of -arrays. In particular, any element of an array may be either a scalar, +arrays. In particular, any element of an array may be either a scalar or another array. The `isarray()' function (*note Type Functions::) lets you distinguish an array from a scalar. The following function, -`walk_array()', recursively traverses an array, printing each element's -indices and value. You call it with the array and a string +`walk_array()', recursively traverses an array, printing the element +indices and values. You call it with the array and a string representing the name of the array: function walk_array(arr, name, i) @@ -16390,24 +16418,24 @@ File: gawk.info, Node: Library Functions Summary, Next: Library Exercises, Pr * The functions presented here fit into the following categories: General problems - Number-to-string conversion, assertions, rounding, random - number generation, converting characters to numbers, joining - strings, getting easily usable time-of-day information, and - reading a whole file in one shot. + Number-to-string conversion, testing assertions, rounding, + random number generation, converting characters to numbers, + joining strings, getting easily usable time-of-day + information, and reading a whole file in one shot Managing data files Noting data file boundaries, rereading the current file, checking for readable files, checking for zero-length files, - and treating assignments as file names. + and treating assignments as file names Processing command-line options - An `awk' version of the standard C `getopt()' function. 
+ An `awk' version of the standard C `getopt()' function Reading the user and group databases - Two sets of routines that parallel the C library versions. + Two sets of routines that parallel the C library versions Traversing arrays of arrays - A simple function to traverse an array of arrays to any depth. + A simple function to traverse an array of arrays to any depth @@ -16502,7 +16530,7 @@ you. to replace the installed versions on your system. Nor may all of these programs be fully compliant with the most recent POSIX standard. This is not a problem; their purpose is to illustrate `awk' language -programming for "real world" tasks. +programming for "real-world" tasks. The programs are presented in alphabetical order. @@ -16528,7 +16556,7 @@ separated by TABs by default, but you may supply a command-line option to change the field "delimiter" (i.e., the field-separator character). `cut''s definition of fields is less general than `awk''s. - A common use of `cut' might be to pull out just the login name of + A common use of `cut' might be to pull out just the login names of logged-on users from the output of `who'. For example, the following pipeline generates a sorted, unique list of the logged-on users: @@ -16937,7 +16965,7 @@ unsuccessful match. If the line does not match, the `next' statement just moves on to the next record. A number of additional tests are made, but they are only done if we -are not counting lines. First, if the user only wants exit status +are not counting lines. First, if the user only wants the exit status (`no_print' is true), then it is enough to know that _one_ line in this file matched, and we can skip on to the next file with `nextfile'. Similarly, if we are only printing file names, we can print the file @@ -16971,7 +16999,7 @@ line is printed, with a leading file name and colon if necessary: } The `END' rule takes care of producing the correct exit status. 
If -there are no matches, the exit status is one; otherwise it is zero: +there are no matches, the exit status is one; otherwise, it is zero: END { exit (total == 0) @@ -17013,7 +17041,8 @@ a more palatable output than just individual numbers. Here is a simple version of `id' written in `awk'. It uses the user database library functions (*note Passwd Functions::) and the group -database library functions (*note Group Functions::): +database library functions (*note Group Functions::) from *note Library +Functions::. The program is fairly straightforward. All the work is done in the `BEGIN' rule. The user and group ID numbers are obtained from @@ -17110,8 +17139,8 @@ is as follows:(1) By default, the output files are named `xaa', `xab', and so on. Each file has 1,000 lines in it, with the likely exception of the last file. To change the number of lines in each file, supply a number on the -command line preceded with a minus (e.g., `-500' for files with 500 -lines in them instead of 1,000). To change the name of the output +command line preceded with a minus sign (e.g., `-500' for files with +500 lines in them instead of 1,000). To change the names of the output files to something like `myfileaa', `myfileab', and so on, supply an additional argument that specifies the file name prefix. @@ -17748,7 +17777,7 @@ checking and setting of defaults: the delay, the count, and the message to print. If the user supplied a message without the ASCII BEL character (known as the "alert" character, `"\a"'), then it is added to the message. (On many systems, printing the ASCII BEL generates an -audible alert. Thus when the alarm goes off, the system calls attention +audible alert. Thus, when the alarm goes off, the system calls attention to itself in case the user is not looking at the computer.) Just for a change, this program uses a `switch' statement (*note Switch Statement::), but the processing could be done with a series of @@ -17880,7 +17909,7 @@ the "from" list. 
Once upon a time, a user proposed adding a transliteration function to `gawk'. The following program was written to prove that character transliteration could be done with a user-level function. This program -is not as complete as the system `tr' utility but it does most of the +is not as complete as the system `tr' utility, but it does most of the job. The `translate' program was written long before `gawk' acquired the @@ -17890,13 +17919,13 @@ and `gsub()' built-in functions (*note String Functions::). There are two functions. The first, `stranslate()', takes three arguments: `from' - A list of characters from which to translate. + A list of characters from which to translate `to' - A list of characters to which to translate. + A list of characters to which to translate `target' - The string on which to do the translation. + The string on which to do the translation Associative arrays make the translation part fairly easy. `t_ar' holds the "to" characters, indexed by the "from" characters. Then a @@ -17904,7 +17933,7 @@ simple loop goes through `from', one character at a time. For each character in `from', if the character appears in `target', it is replaced with the corresponding `to' character. - The `translate()' function calls `stranslate()' using `$0' as the + The `translate()' function calls `stranslate()', using `$0' as the target. The main program sets two global variables, `FROM' and `TO', from the command line, and then changes `ARGV' so that `awk' reads from the standard input. @@ -17913,7 +17942,7 @@ the standard input. record: # translate.awk --- do tr-like stuff - # Bugs: does not handle things like: tr A-Z a-z, it has + # Bugs: does not handle things like tr A-Z a-z; it has # to be spelled out. However, if `to' is shorter than `from', # the last character in `to' is used for the rest of `from'. 
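The description above can be sketched roughly as follows. This is a minimal illustration wrapped in a shell function, not necessarily the code from `translate.awk' itself; the shell function name `tr_sketch' and its argument order are our own assumptions. It builds `t_ar' from the "from" and "to" lists, reuses the last "to" character when `to' is shorter than `from', and then maps each character of the target:

```shell
# tr_sketch FROM TO: transliterate standard input, character by character.
tr_sketch() {
    awk -v from="$1" -v to="$2" '
    function stranslate(from, to, target,    lf, lt, i, c, t_ar, result)
    {
        lf = length(from)
        lt = length(to)
        # Map each "from" character to its "to" counterpart; once "to"
        # runs out, keep using its last character.
        for (i = 1; i <= lf; i++)
            t_ar[substr(from, i, 1)] = substr(to, (i <= lt ? i : lt), 1)
        # Rebuild the target one character at a time.
        for (i = 1; i <= length(target); i++) {
            c = substr(target, i, 1)
            result = result ((c in t_ar) ? t_ar[c] : c)
        }
        return result
    }
    { print stranslate(from, to, $0) }'
}
```

For example, `echo hello | tr_sketch el ip' prints `hippo'. Like the program described here, the sketch does not expand ranges such as `A-Z'; every character has to be spelled out.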
@@ -17991,13 +18020,13 @@ File: gawk.info, Node: Labels Program, Next: Word Sorting, Prev: Translate Pr 11.3.4 Printing Mailing Labels ------------------------------ -Here is a "real world"(1) program. This script reads lists of names and +Here is a "real-world"(1) program. This script reads lists of names and addresses and generates mailing labels. Each page of labels has 20 labels on it, two across and 10 down. The addresses are guaranteed to be no more than five lines of data. Each address is separated from the next by a blank line. - The basic idea is to read 20 labels worth of data. Each line of + The basic idea is to read 20 labels' worth of data. Each line of each label is stored in the `line' array. The single rule takes care of filling the `line' array and printing the page when 20 labels have been read. @@ -18009,13 +18038,13 @@ splits records at blank lines (*note Records::). It sets `MAXLINES' to Most of the work is done in the `printpage()' function. The label lines are stored sequentially in the `line' array. But they have to -print horizontally; `line[1]' next to `line[6]', `line[2]' next to +print horizontally: `line[1]' next to `line[6]', `line[2]' next to `line[7]', and so on. Two loops accomplish this. The outer loop, controlled by `i', steps through every 10 lines of data; this is each row of labels. The inner loop, controlled by `j', goes through the -lines within the row. As `j' goes from 0 to 4, `i+j' is the `j'-th -line in the row, and `i+j+5' is the entry next to it. The output ends -up looking something like this: +lines within the row. As `j' goes from 0 to 4, `i+j' is the `j'th line +in the row, and `i+j+5' is the entry next to it. The output ends up +looking something like this: line 1 line 6 line 2 line 7 @@ -18118,8 +18147,8 @@ a useful format. 
printf "%s\t%d\n", word, freq[word] } - The program relies on `awk''s default field splitting mechanism to -break each line up into "words," and uses an associative array named + The program relies on `awk''s default field-splitting mechanism to +break each line up into "words" and uses an associative array named `freq', indexed by each word, to count the number of times the word occurs. In the `END' rule, it prints the counts. @@ -18205,7 +18234,7 @@ File: gawk.info, Node: History Sorting, Next: Extract Program, Prev: Word Sor 11.3.6 Removing Duplicates from Unsorted Text --------------------------------------------- -The `uniq' program (*note Uniq Program::), removes duplicate lines from +The `uniq' program (*note Uniq Program::) removes duplicate lines from _sorted_ data. Suppose, however, you need to remove duplicate lines from a data @@ -18258,7 +18287,7 @@ hand. Here we present a program that can extract parts of a Texinfo input file into separate files. This Info file is written in Texinfo -(http://www.gnu.org/software/texinfo/), the GNU project's document +(http://www.gnu.org/software/texinfo/), the GNU Project's document formatting language. A single Texinfo source file can be used to produce both printed documentation, with TeX, and online documentation. (The Texinfo language is described fully, starting with *note @@ -18299,7 +18328,7 @@ them in a standard directory where `gawk' can find them. The Texinfo file looks something like this: ... - This program has a @code{BEGIN} rule, + This program has a @code{BEGIN} rule that prints a nice message: @example @@ -18324,7 +18353,7 @@ upper- and lowercase letters in the directives won't matter. 
given (`NF' is at least three) and also checking that the command exits with a zero exit status, signifying OK: - # extract.awk --- extract files and run programs from texinfo files + # extract.awk --- extract files and run programs from Texinfo files BEGIN { IGNORECASE = 1 } @@ -18351,11 +18380,11 @@ The variable `e' is used so that the rule fits nicely on the screen. file name is given in the directive. If the file named is not the current file, then the current file is closed. Keeping the current file open until a new file is encountered allows the use of the `>' -redirection for printing the contents, keeping open file management +redirection for printing the contents, keeping open-file management simple. The `for' loop does the work. It reads lines using `getline' (*note -Getline::). For an unexpected end of file, it calls the +Getline::). For an unexpected end-of-file, it calls the `unexpected_eof()' function. If the line is an "endfile" line, then it breaks out of the loop. If the line is an `@group' or `@end group' line, then it ignores it and goes on to the next line. Similarly, @@ -18445,10 +18474,10 @@ File: gawk.info, Node: Simple Sed, Next: Igawk Program, Prev: Extract Program 11.3.8 A Simple Stream Editor ----------------------------- -The `sed' utility is a stream editor, a program that reads a stream of -data, makes changes to it, and passes it on. It is often used to make -global changes to a large file or to a stream of data generated by a -pipeline of commands. Although `sed' is a complicated program in its +The `sed' utility is a "stream editor", a program that reads a stream +of data, makes changes to it, and passes it on. It is often used to +make global changes to a large file or to a stream of data generated by +a pipeline of commands. Although `sed' is a complicated program in its own right, its most common use is to perform global substitutions in the middle of a pipeline: @@ -18562,7 +18591,7 @@ include a library function twice. 
`igawk' should behave just like `gawk' externally. This means it should accept all of `gawk''s command-line arguments, including the -ability to have multiple source files specified via `-f', and the +ability to have multiple source files specified via `-f' and the ability to mix command-line and library source files. The program is written using the POSIX Shell (`sh') command @@ -18592,8 +18621,8 @@ language.(1) It works as follows: file names). This program uses shell variables extensively: for storing -command-line arguments, the text of the `awk' program that will expand -the user's program, for the user's original program, and for the +command-line arguments and the text of the `awk' program that will +expand the user's program, for the user's original program, and for the expanded program. Doing so removes some potential problems that might arise were we to use temporary files instead, at the cost of making the script somewhat more complicated. @@ -18851,7 +18880,7 @@ It's done in these steps: The last step is to call `gawk' with the expanded program, along with the original options and command-line arguments that the user -supplied. +supplied: eval gawk $opts -- '"$processed_program"' '"$@"' @@ -18914,15 +18943,15 @@ One word is an anagram of another if both words contain the same letters Column 2, Problem C, of Jon Bentley's `Programming Pearls', Second Edition, presents an elegant algorithm. The idea is to give words that are anagrams a common signature, sort all the words together by their -signature, and then print them. Dr. Bentley observes that taking the -letters in each word and sorting them produces that common signature. +signatures, and then print them. Dr. Bentley observes that taking the +letters in each word and sorting them produces those common signatures. 
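A portable sketch of the signature idea follows. It avoids `gawk'-specific features, so it uses `substr()' and a small insertion sort where a `gawk' program would more naturally use `split()' with an empty separator and `asort()'. The shell function name `word_signatures' is an assumption made for this example:

```shell
# word_signatures WORD...: print "signature word" for each argument.
word_signatures() {
    printf '%s\n' "$@" | awk '
    function word2key(word,    a, i, j, n, c, result)
    {
        n = length(word)
        for (i = 1; i <= n; i++)
            a[i] = substr(word, i, 1)
        # Insertion sort of the letters (string comparison).
        for (i = 2; i <= n; i++) {
            c = a[i]
            for (j = i - 1; j >= 1 && a[j] > c; j--)
                a[j + 1] = a[j]
            a[j + 1] = c
        }
        for (i = 1; i <= n; i++)
            result = result a[i]
        return result
    }
    { print word2key($0), $0 }'
}
```

Running `word_signatures listen silent' prints the same signature, `eilnst', in front of both words, which is what lets a later sort bring anagrams together.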
The following program uses arrays of arrays to bring together words with the same signature and array sorting to print the words in sorted order: - # anagram.awk --- An implementation of the anagram finding algorithm - # from Jon Bentley's "Programming Pearls", 2nd edition. + # anagram.awk --- An implementation of the anagram-finding algorithm + # from Jon Bentley's "Programming Pearls," 2nd edition. # Addison Wesley, 2000, ISBN 0-201-65788-0. # Column 2, Problem C, section 2.8, pp 18-20. @@ -18942,7 +18971,7 @@ signature; the second dimension is the word itself: apart into individual letters, sorts the letters, and then joins them back together: - # word2key --- split word apart into letters, sort, joining back together + # word2key --- split word apart into letters, sort, and join back together function word2key(word, a, i, n, result) { @@ -19040,12 +19069,13 @@ File: gawk.info, Node: Programs Summary, Next: Programs Exercises, Prev: Misc characters. The ability to use `split()' with the empty string as the separator can considerably simplify such tasks. - * The library functions from *note Library Functions::, proved their - usefulness for a number of real (if small) programs. + * The examples here demonstrate the usefulness of the library + functions from *note Library Functions::, for a number of real (if + small) programs. * Besides reinventing POSIX wheels, other programs solved a - selection of interesting problems, such as finding duplicates - words in text, printing mailing labels, and finding anagrams. + selection of interesting problems, such as finding duplicate words + in text, printing mailing labels, and finding anagrams. @@ -19162,16 +19192,16 @@ File: gawk.info, Node: Advanced Features, Next: Internationalization, Prev: S This major node discusses advanced features in `gawk'. It's a bit of a "grab bag" of items that are otherwise unrelated to each other. 
-First, a command-line option allows `gawk' to recognize nondecimal -numbers in input data, not just in `awk' programs. Then, `gawk''s -special features for sorting arrays are presented. Next, two-way I/O, -discussed briefly in earlier parts of this Info file, is described in -full detail, along with the basics of TCP/IP networking. Finally, -`gawk' can "profile" an `awk' program, making it possible to tune it -for performance. +First, we look at a command-line option that allows `gawk' to recognize +nondecimal numbers in input data, not just in `awk' programs. Then, +`gawk''s special features for sorting arrays are presented. Next, +two-way I/O, discussed briefly in earlier parts of this Info file, is +described in full detail, along with the basics of TCP/IP networking. +Finally, we see how `gawk' can "profile" an `awk' program, making it +possible to tune it for performance. - A number of advanced features require separate major nodes of their -own: + Additional advanced features are discussed in separate major nodes +of their own: * *note Internationalization::, discusses how to internationalize your `awk' programs, so that they can speak multiple national @@ -19245,7 +19275,7 @@ File: gawk.info, Node: Array Sorting, Next: Two-way I/O, Prev: Nondecimal Dat 12.2 Controlling Array Traversal and Array Sorting ================================================== -`gawk' lets you control the order in which a `for (i in array)' loop +`gawk' lets you control the order in which a `for (INDX in ARRAY)' loop traverses an array. In addition, two built-in functions, `asort()' and `asorti()', let @@ -19264,9 +19294,9 @@ File: gawk.info, Node: Controlling Array Traversal, Next: Array Sorting Functi 12.2.1 Controlling Array Traversal ---------------------------------- -By default, the order in which a `for (i in array)' loop scans an array -is not defined; it is generally based upon the internal implementation -of arrays inside `awk'. 
+By default, the order in which a `for (INDX in ARRAY)' loop scans an +array is not defined; it is generally based upon the internal +implementation of arrays inside `awk'. Often, though, it is desirable to be able to loop over the elements in a particular order that you, the programmer, choose. `gawk' lets @@ -19288,21 +19318,22 @@ arguments: RETURN < 0; 0; OR > 0 } - Here, I1 and I2 are the indices, and V1 and V2 are the corresponding -values of the two elements being compared. Either V1 or V2, or both, -can be arrays if the array being traversed contains subarrays as values. -(*Note Arrays of Arrays::, for more information about subarrays.) The -three possible return values are interpreted as follows: + Here, `i1' and `i2' are the indices, and `v1' and `v2' are the +corresponding values of the two elements being compared. Either `v1' +or `v2', or both, can be arrays if the array being traversed contains +subarrays as values. (*Note Arrays of Arrays::, for more information +about subarrays.) The three possible return values are interpreted as +follows: `comp_func(i1, v1, i2, v2) < 0' - Index I1 comes before index I2 during loop traversal. + Index `i1' comes before index `i2' during loop traversal. `comp_func(i1, v1, i2, v2) == 0' - Indices I1 and I2 come together but the relative order with + Indices `i1' and `i2' come together, but the relative order with respect to each other is undefined. `comp_func(i1, v1, i2, v2) > 0' - Index I1 comes after index I2 during loop traversal. + Index `i1' comes after index `i2' during loop traversal. Our first comparison function can be used to scan an array in numerical order of the indices: @@ -19445,7 +19476,7 @@ elements compare equal. This is usually not a problem, but letting the tied elements come out in arbitrary order can be an issue, especially when comparing item values. The partial ordering of the equal elements may change the next time the array is traversed, if other elements are -added or removed from the array. 
One way to resolve ties when +added to or removed from the array. One way to resolve ties when comparing elements with otherwise equal values is to include the indices in the comparison rules. Note that doing this may make the loop traversal less efficient, so consider it only if necessary. The @@ -19479,14 +19510,14 @@ lowercase letters as equivalent or distinct. Another point to keep in mind is that in the case of subarrays, the element values can themselves be arrays; a production comparison -function should use the `isarray()' function (*note Type Functions::), +function should use the `isarray()' function (*note Type Functions::) to check for this, and choose a defined sorting order for subarrays. All sorting based on `PROCINFO["sorted_in"]' is disabled in POSIX mode, because the `PROCINFO' array is not special in that case. As a side note, sorting the array indices before traversing the -array has been reported to add 15% to 20% overhead to the execution +array has been reported to add a 15% to 20% overhead to the execution time of `awk' programs. For this reason, sorted array traversal is not the default. @@ -19535,8 +19566,8 @@ array is not affected. Often, what's needed is to sort on the values of the _indices_ instead of the values of the elements. To do that, use the `asorti()' function. The interface and behavior are identical to that of -`asort()', except that the index values are used for sorting, and -become the values of the result array: +`asort()', except that the index values are used for sorting and become +the values of the result array: { source[$0] = some_func($0) } @@ -19568,8 +19599,8 @@ chooses_, taking into account just the indices, just the values, or both. This is extremely powerful. 
Once the array is sorted, `asort()' takes the _values_ in their -final order, and uses them to fill in the result array, whereas -`asorti()' takes the _indices_ in their final order, and uses them to +final order and uses them to fill in the result array, whereas +`asorti()' takes the _indices_ in their final order and uses them to fill in the result array. NOTE: Copying array indices and elements isn't expensive in terms @@ -19767,7 +19798,7 @@ REMOTE-PORT name. NOTE: Failure in opening a two-way socket will result in a - non-fatal error being returned to the calling code. The value of + nonfatal error being returned to the calling code. The value of `ERRNO' indicates the error (*note Auto-set::). Consider the following very simple example: @@ -19848,8 +19879,8 @@ First, the `awk' program: junk Here is the `awkprof.out' that results from running the `gawk' -profiler on this program and data. (This example also illustrates that -`awk' programmers sometimes get up very early in the morning to work.) +profiler on this program and data (this example also illustrates that +`awk' programmers sometimes get up very early in the morning to work): # gawk profile, created Mon Sep 29 05:16:21 2014 @@ -19902,7 +19933,7 @@ profiler on this program and data. (This example also illustrates that output. They are as follows: * The program is printed in the order `BEGIN' rules, `BEGINFILE' - rules, pattern/action rules, `ENDFILE' rules, `END' rules and + rules, pattern-action rules, `ENDFILE' rules, `END' rules, and functions, listed alphabetically. Multiple `BEGIN' and `END' rules retain their separate identities, as do multiple `BEGINFILE' and `ENDFILE' rules. @@ -19947,13 +19978,13 @@ output. They are as follows: scalar, it gets parenthesized. * `gawk' supplies leading comments in front of the `BEGIN' and `END' - rules, the `BEGINFILE' and `ENDFILE' rules, the pattern/action + rules, the `BEGINFILE' and `ENDFILE' rules, the pattern-action rules, and the functions. 
The profiled version of your program may not look exactly like what you typed when you wrote it. This is because `gawk' creates the -profiled version by "pretty printing" its internal representation of +profiled version by "pretty-printing" its internal representation of the program. The advantage to this is that `gawk' can produce a standard representation. Also, things such as: @@ -20003,15 +20034,15 @@ output profile file. produces the profile and the function call trace and then exits. When `gawk' runs on MS-Windows systems, it uses the `INT' and `QUIT' -signals for producing the profile and, in the case of the `INT' signal, +signals for producing the profile, and in the case of the `INT' signal, `gawk' exits. This is because these systems don't support the `kill' command, so the only signals you can deliver to a program are those generated by the keyboard. The `INT' signal is generated by the -`Ctrl-<C>' or `Ctrl-<BREAK>' key, while the `QUIT' signal is generated -by the `Ctrl-<\>' key. +`Ctrl-c' or `Ctrl-BREAK' key, while the `QUIT' signal is generated by +the `Ctrl-\' key. Finally, `gawk' also accepts another option, `--pretty-print'. When -called this way, `gawk' "pretty prints" the program into `awkprof.out', +called this way, `gawk' "pretty-prints" the program into `awkprof.out', without any execution counts. NOTE: Once upon a time, the `--pretty-print' option would also run @@ -20063,7 +20094,7 @@ File: gawk.info, Node: Advanced Features Summary, Prev: Profiling, Up: Advanc two-way communications. * By using special file names with the `|&' operator, you can open a - TCP/IP (or UDP/IP) connection to remote hosts in the Internet. + TCP/IP (or UDP/IP) connection to remote hosts on the Internet. `gawk' supports both IPv4 and IPv6. * You can generate statement count profiles of your program. 
This @@ -20072,7 +20103,7 @@ File: gawk.info, Node: Advanced Features Summary, Prev: Profiling, Up: Advanc `USR1' signal while profiling causes `gawk' to dump the profile and keep going, including a function call stack. - * You can also just "pretty print" the program. This currently also + * You can also just "pretty-print" the program. This currently also runs the program, but that will change in the next major release. @@ -20116,7 +20147,7 @@ File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internati "Internationalization" means writing (or modifying) a program once, in such a way that it can use multiple languages without requiring further -source-code changes. "Localization" means providing the data necessary +source code changes. "Localization" means providing the data necessary for an internationalized program to work in a particular language. Most typically, these terms refer to features such as the language used for printing error messages, the language used to read responses, and @@ -20130,7 +20161,7 @@ File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N ================== `gawk' uses GNU `gettext' to provide its internationalization features. -The facilities in GNU `gettext' focus on messages; strings printed by a +The facilities in GNU `gettext' focus on messages: strings printed by a program, either directly or via formatting with `printf' or `sprintf()'.(1) @@ -20259,8 +20290,7 @@ File: gawk.info, Node: Programmer i18n, Next: Translator i18n, Prev: Explaini 13.3 Internationalizing `awk' Programs ====================================== -`gawk' provides the following variables and functions for -internationalization: +`gawk' provides the following variables for internationalization: `TEXTDOMAIN' This variable indicates the application's text domain. For @@ -20272,6 +20302,8 @@ internationalization: for translation at runtime. String constants without a leading underscore are not translated. 
+ `gawk' provides the following functions for internationalization: + ``dcgettext(STRING' [`,' DOMAIN [`,' CATEGORY]]`)'' Return the translation of STRING in text domain DOMAIN for locale category CATEGORY. The default value for DOMAIN is the current @@ -20310,8 +20342,7 @@ internationalization: the null string (`""'), then `bindtextdomain()' returns the current binding for the given DOMAIN. - To use these facilities in your `awk' program, follow the steps -outlined in *note Explaining gettext::, like so: + To use these facilities in your `awk' program, follow these steps: 1. Set the variable `TEXTDOMAIN' to the text domain of your program. This is best done in a `BEGIN' rule (*note BEGIN/END::), or it can @@ -20533,7 +20564,7 @@ actually almost portable, requiring very little change: its value, leaving the original string constant as the result. * By defining "dummy" functions to replace `dcgettext()', - `dcngettext()' and `bindtextdomain()', the `awk' program can be + `dcngettext()', and `bindtextdomain()', the `awk' program can be made to run, but all the messages are output in the original language. For example: @@ -20668,9 +20699,9 @@ File: gawk.info, Node: Gawk I18N, Next: I18N Summary, Prev: I18N Example, Up `gawk' itself has been internationalized using the GNU `gettext' package. (GNU `gettext' is described in complete detail in *note (GNU -`gettext' utilities)Top:: gettext, GNU gettext tools.) As of this -writing, the latest version of GNU `gettext' is version 0.19.3 -(ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.3.tar.gz). +`gettext' utilities)Top:: gettext, GNU `gettext' utilities.) As of +this writing, the latest version of GNU `gettext' is version 0.19.4 +(ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.4.tar.gz). If a translation of `gawk''s messages exists, then `gawk' produces usage messages, warnings, and fatal errors in the local language. 
@@ -20682,7 +20713,7 @@ File: gawk.info, Node: I18N Summary, Prev: Gawk I18N, Up: Internationalizatio ============ * Internationalization means writing a program such that it can use - multiple languages without requiring source-code changes. + multiple languages without requiring source code changes. Localization means providing the data necessary for an internationalized program to work in a particular language. @@ -20696,10 +20727,10 @@ File: gawk.info, Node: I18N Summary, Prev: Gawk I18N, Up: Internationalizatio file, and the `.po' files are compiled into `.gmo' files for use at runtime. - * You can use position specifications with `sprintf()' and `printf' - to rearrange the placement of argument values in formatted strings - and output. This is useful for the translations of format control - strings. + * You can use positional specifications with `sprintf()' and + `printf' to rearrange the placement of argument values in formatted + strings and output. This is useful for the translation of format + control strings. * The internationalization features have been designed so that they can be easily worked around in a standard `awk'. @@ -20755,8 +20786,7 @@ File: gawk.info, Node: Debugging Concepts, Next: Debugging Terms, Up: Debuggi --------------------------- (If you have used debuggers in other languages, you may want to skip -ahead to the next section on the specific features of the `gawk' -debugger.) +ahead to *note Awk Debugging::.) Of course, a debugging program cannot remove bugs for you, because it has no way of knowing what you or your users consider a "bug" versus @@ -20843,11 +20873,11 @@ defines terms used throughout the rest of this major node: File: gawk.info, Node: Awk Debugging, Prev: Debugging Terms, Up: Debugging -14.1.3 Awk Debugging --------------------- +14.1.3 `awk' Debugging +---------------------- Debugging an `awk' program has some specific aspects that are not -shared with other programming languages. 
+shared with programs written in other languages. First of all, the fact that `awk' programs usually take input line by line from a file or files and operate on those lines using specific @@ -20865,8 +20895,8 @@ commands. File: gawk.info, Node: Sample Debugging Session, Next: List of Debugger Commands, Prev: Debugging, Up: Debugger -14.2 Sample Debugging Session -============================= +14.2 Sample `gawk' Debugging Session +==================================== In order to illustrate the use of `gawk' as a debugger, let's look at a sample debugging session. We will use the `awk' implementation of the @@ -20885,8 +20915,8 @@ File: gawk.info, Node: Debugger Invocation, Next: Finding The Bug, Up: Sample -------------------------------- Starting the debugger is almost exactly like running `gawk' normally, -except you have to pass an additional option `--debug', or the -corresponding short option `-D'. The file(s) containing the program +except you have to pass an additional option, `--debug', or the +corresponding short option, `-D'. The file(s) containing the program and any supporting code are given on the command line as arguments to one or more `-f' options. (`gawk' is not designed to debug command-line programs, only programs contained in files.) In our case, we invoke @@ -20896,7 +20926,7 @@ the debugger like this: where both `getopt.awk' and `uniq.awk' are in `$AWKPATH'. (Experienced users of GDB or similar debuggers should note that this syntax is -slightly different from what they are used to. With the `gawk' +slightly different from what you are used to. With the `gawk' debugger, you give the arguments for running the program in the command line to the debugger rather than as part of the `run' command at the debugger prompt.) The `-1' is an option to `uniq.awk'. 
@@ -21020,10 +21050,10 @@ typing `n' (for "next"): -| 66 if (fcount > 0) { This tells us that `gawk' is now ready to execute line 66, which -decides whether to give the lines the special "field skipping" treatment +decides whether to give the lines the special "field-skipping" treatment indicated by the `-1' command-line option. (Notice that we skipped -from where we were before at line 63 to here, because the condition in -line 63 `if (fcount == 0 && charcount == 0)' was false.) +from where we were before, at line 63, to here, because the condition +in line 63, `if (fcount == 0 && charcount == 0)', was false.) Continuing to step, we now get to the splitting of the current and last records: @@ -21081,15 +21111,15 @@ mentioned): Well, here we are at our error (sorry to spoil the suspense). What we had in mind was to join the fields starting from the second one to -make the virtual record to compare, and if the first field was numbered -zero, this would work. Let's look at what we've got: +make the virtual record to compare, and if the first field were +numbered zero, this would work. Let's look at what we've got: gawk> p cline clast -| cline = "gawk is a wonderful program!" -| clast = "awk is a wonderful program!" Hey, those look pretty familiar! They're just our original, -unaltered, input records. A little thinking (the human brain is still +unaltered input records. A little thinking (the human brain is still the best debugging tool), and we realize that we were off by one! We get out of the debugger: @@ -21126,11 +21156,11 @@ categories: * Miscellaneous Each of these are discussed in the following subsections. In the -following descriptions, commands which may be abbreviated show the +following descriptions, commands that may be abbreviated show the abbreviation on a second description line. A debugger command name may also be truncated if that partial name is unambiguous. 
The debugger has the built-in capability to automatically repeat the previous command -just by hitting <Enter>. This works for the commands `list', `next', +just by hitting `Enter'. This works for the commands `list', `next', `nexti', `step', `stepi', and `continue' executed without any argument. * Menu: @@ -21170,8 +21200,8 @@ The commands for controlling breakpoints are: Set a breakpoint at entry to (the first instruction of) function FUNCTION. - Each breakpoint is assigned a number which can be used to delete - it from the breakpoint list using the `delete' command. + Each breakpoint is assigned a number that can be used to delete it + from the breakpoint list using the `delete' command. With a breakpoint, you may also supply a condition. This is an `awk' expression (enclosed in double quotes) that the debugger @@ -21209,26 +21239,26 @@ The commands for controlling breakpoints are: `delete' [N1 N2 ...] [N-M] `d' [N1 N2 ...] [N-M] - Delete specified breakpoints or a range of breakpoints. Deletes - all defined breakpoints if no argument is supplied. + Delete specified breakpoints or a range of breakpoints. Delete all + defined breakpoints if no argument is supplied. `disable' [N1 N2 ... | N-M] Disable specified breakpoints or a range of breakpoints. Without - any argument, disables all breakpoints. + any argument, disable all breakpoints. `enable' [`del' | `once'] [N1 N2 ...] [N-M] `e' [`del' | `once'] [N1 N2 ...] [N-M] Enable specified breakpoints or a range of breakpoints. Without - any argument, enables all breakpoints. Optionally, you can - specify how to enable the breakpoint: + any argument, enable all breakpoints. Optionally, you can specify + how to enable the breakpoints: `del' - Enable the breakpoint(s) temporarily, then delete it when the - program stops at the breakpoint. + Enable the breakpoints temporarily, then delete each one when + the program stops at it. 
`once' - Enable the breakpoint(s) temporarily, then disable it when - the program stops at the breakpoint. + Enable the breakpoints temporarily, then disable each one when + the program stops at it. `ignore' N COUNT Ignore breakpoint number N the next COUNT times it is hit. @@ -21274,7 +21304,7 @@ execution of the program than we saw in our earlier example: `continue' [COUNT] `c' [COUNT] Resume program execution. If continued from a breakpoint and COUNT - is specified, ignores the breakpoint at that location the next + is specified, ignore the breakpoint at that location the next COUNT times before stopping. `finish' @@ -21309,10 +21339,10 @@ execution of the program than we saw in our earlier example: `step' [COUNT] `s' [COUNT] Continue execution until control reaches a different source line - in the current stack frame. `step' steps inside any function - called within the line. If the argument COUNT is supplied, steps - that many times before stopping, unless it encounters a breakpoint - or watchpoint. + in the current stack frame, stepping inside any function called + within the line. If the argument COUNT is supplied, steps that + many times before stopping, unless it encounters a breakpoint or + watchpoint. `stepi' [COUNT] `si' [COUNT] @@ -21393,13 +21423,13 @@ AWK STATEMENTS (`"'...`"'). You can also set special `awk' variables, such as `FS', `NF', - `NR', and son on. + `NR', and so on. `watch' VAR | `$'N [`"EXPRESSION"'] `w' VAR | `$'N [`"EXPRESSION"'] Add variable VAR (or field `$N') to the watch list. The debugger then stops whenever the value of the variable or field changes. - Each watched item is assigned a number which can be used to delete + Each watched item is assigned a number that can be used to delete it from the watch list using the `unwatch' command. With a watchpoint, you may also supply a condition. 
This is an @@ -21423,11 +21453,11 @@ File: gawk.info, Node: Execution Stack, Next: Debugger Info, Prev: Viewing An 14.3.4 Working with the Stack ----------------------------- -Whenever you run a program which contains any function calls, `gawk' +Whenever you run a program that contains any function calls, `gawk' maintains a stack of all of the function calls leading up to where the program is right now. You can see how you got to where you are, and also move around in the stack to see what the state of things was in the -functions which called the one you are in. The commands for doing this +functions that called the one you are in. The commands for doing this are: `backtrace' [COUNT] @@ -21447,8 +21477,8 @@ are: `frame' [N] `f' [N] Select and print stack frame N. Frame 0 is the currently - executing, or "innermost", frame (function call), frame 1 is the - frame that called the innermost one. The highest numbered frame is + executing, or "innermost", frame (function call); frame 1 is the + frame that called the innermost one. The highest-numbered frame is the one for the main program. The printed information consists of the frame number, function and argument names, source file, and the source line. @@ -21465,7 +21495,7 @@ File: gawk.info, Node: Debugger Info, Next: Miscellaneous Debugger Commands, Besides looking at the values of variables, there is often a need to get other sorts of information about the state of your program and of the -debugging environment itself. The `gawk' debugger has one command which +debugging environment itself. The `gawk' debugger has one command that provides this information, appropriately called `info'. `info' is used with one of a number of arguments that tell it exactly what you want to know: @@ -21522,11 +21552,12 @@ from a file. The commands are: option. 
The available options are: `history_size' - The maximum number of lines to keep in the history file + Set the maximum number of lines to keep in the history file `./.gawk_history'. The default is 100. `listsize' - The number of lines that `list' prints. The default is 15. + Specify the number of lines that `list' prints. The default + is 15. `outfile' Send `gawk' output to a file; debugger output still goes to @@ -21534,7 +21565,7 @@ from a file. The commands are: standard output. `prompt' - The debugger prompt. The default is `gawk> '. + Change the debugger prompt. The default is `gawk> '. `save_history' [`on' | `off'] Save command history to file `./.gawk_history'. The default @@ -21542,8 +21573,8 @@ from a file. The commands are: `save_options' [`on' | `off'] Save current options to file `./.gawkrc' upon exit. The - default is `on'. Options are read back in to the next - session upon startup. + default is `on'. Options are read back into the next session + upon startup. `trace' [`on' | `off'] Turn instruction tracing on or off. The default is `off'. @@ -21562,7 +21593,7 @@ from a file. The commands are: commands; however, the `gawk' debugger will not source the same file more than once in order to avoid infinite recursion. - In addition to, or instead of the `source' command, you can use + In addition to, or instead of, the `source' command, you can use the `-D FILE' or `--debug=FILE' command-line options to execute commands from a file non-interactively (*note Options::). @@ -21572,13 +21603,13 @@ File: gawk.info, Node: Miscellaneous Debugger Commands, Prev: Debugger Info, 14.3.6 Miscellaneous Commands ----------------------------- -There are a few more commands which do not fit into the previous +There are a few more commands that do not fit into the previous categories, as follows: `dump' [FILENAME] - Dump bytecode of the program to standard output or to the file + Dump byte code of the program to standard output or to the file named in FILENAME. 
This prints a representation of the internal - instructions which `gawk' executes to implement the `awk' commands + instructions that `gawk' executes to implement the `awk' commands in a program. This can be very enlightening, as the following partial dump of Davide Brini's obfuscated code (*note Signature Program::) demonstrates: @@ -21662,22 +21693,21 @@ categories, as follows: FILENAME. This command may change the current source file. FUNCTION - Print lines centered around beginning of the function + Print lines centered around the beginning of the function FUNCTION. This command may change the current source file. `quit' `q' Exit the debugger. Debugging is great fun, but sometimes we all have to tend to other obligations in life, and sometimes we find - the bug, and are free to go on to the next one! As we saw - earlier, if you are running a program, the debugger warns you if - you accidentally type `q' or `quit', to make sure you really want - to quit. + the bug and are free to go on to the next one! As we saw earlier, + if you are running a program, the debugger warns you when you type + `q' or `quit', to make sure you really want to quit. `trace' [`on' | `off'] - Turn on or off a continuous printing of instructions which are - about to be executed, along with printing the `awk' line which they - implement. The default is `off'. + Turn on or off continuous printing of the instructions that are + about to be executed, along with the `awk' lines they implement. + The default is `off'. 
It is to be hoped that most of the "opcodes" in these instructions are fairly self-explanatory, and using `stepi' and `nexti' while @@ -21690,7 +21720,7 @@ File: gawk.info, Node: Readline Support, Next: Limitations, Prev: List of Deb 14.4 Readline Support ===================== -If `gawk' is compiled with the `readline' library +If `gawk' is compiled with the GNU Readline library (http://cnswww.cns.cwru.edu/php/chet/readline/readline.html), you can take advantage of that library's command completion and history expansion features. The following types of completion are available: @@ -21720,7 +21750,7 @@ File: gawk.info, Node: Limitations, Next: Debugging Summary, Prev: Readline S We hope you find the `gawk' debugger useful and enjoyable to work with, but as with any program, especially in its early releases, it still has -some limitations. A few which are worth being aware of are: +some limitations. A few that it's worth being aware of are: * At this point, the debugger does not give a detailed explanation of what you did wrong when you type in something it doesn't like. @@ -21731,13 +21761,13 @@ some limitations. A few which are worth being aware of are: Commands:: (or if you are already familiar with `gawk' internals), you will realize that much of the internal manipulation of data in `gawk', as in many interpreters, is done on a stack. `Op_push', - `Op_pop', and the like, are the "bread and butter" of most `gawk' + `Op_pop', and the like are the "bread and butter" of most `gawk' code. Unfortunately, as of now, the `gawk' debugger does not allow you to examine the stack's contents. That is, the intermediate results of expression evaluation are on the stack, but cannot be - printed. Rather, only variables which are defined in the program + printed. Rather, only variables that are defined in the program can be printed. 
Of course, a workaround for this is to use more explicit variables at the debugging stage and then change back to obscure, perhaps more optimal code later. @@ -21749,12 +21779,12 @@ some limitations. A few which are worth being aware of are: * The `gawk' debugger is designed to be used by running a program (with all its parameters) on the command line, as described in *note Debugger Invocation::. There is no way (as of now) to - attach or "break in" to a running program. This seems reasonable - for a language which is used mainly for quickly executing, short + attach or "break into" a running program. This seems reasonable + for a language that is used mainly for quickly executing, short programs. - * The `gawk' debugger only accepts source supplied with the `-f' - option. + * The `gawk' debugger only accepts source code supplied with the + `-f' option. File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger @@ -21763,8 +21793,8 @@ File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger ============ * Programs rarely work correctly the first time. Finding bugs is - "debugging" and a program that helps you find bugs is a - "debugger". `gawk' has a built-in debugger that works very + called debugging, and a program that helps you find bugs is a + debugger. `gawk' has a built-in debugger that works very similarly to the GNU Debugger, GDB. * Debuggers let you step through your program one statement at a @@ -21780,8 +21810,8 @@ File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger breakpoints, execution, viewing and changing data, working with the stack, getting information, and other tasks. - * If the `readline' library is available when `gawk' is compiled, it - is used by the debugger to provide command-line history and + * If the GNU Readline library is available when `gawk' is compiled, + it is used by the debugger to provide command-line history and editing. 
@@ -22027,7 +22057,7 @@ so: $ gawk --version -| GNU Awk 4.1.2, API: 1.1 (GNU MPFR 3.1.0-p3, GNU MP 5.0.2) - -| Copyright (C) 1989, 1991-2014 Free Software Foundation. + -| Copyright (C) 1989, 1991-2015 Free Software Foundation. ... (You may see different version numbers than what's shown here. That's @@ -22566,7 +22596,7 @@ set: It's not that well known but it's not that obscure either. It's Euler's modification to Newton's method for calculating pi. Take a look at lines (23) - (25) here: - `http://mathworld.wolfram.com/PiFormulas.htm'. + `http://mathworld.wolfram.com/PiFormulas.html'. The algorithm I wrote simply expands the multiply by 2 and works from the innermost expression outwards. I used this to program HP @@ -28507,7 +28537,7 @@ Unix `awk' git clone git://github.com/onetrueawk/awk bwkawk - This command creates a copy of the Git (http://www.git-scm.com) + This command creates a copy of the Git (http://git-scm.com) repository in a directory named `bwkawk'. If you leave that argument off the `git' command line, the repository copy is created in a directory named `awk'. @@ -28554,7 +28584,7 @@ Unix `awk' To get `awka', go to `http://sourceforge.net/projects/awka'. The project seems to be frozen; no new code changes have been made - since approximately 2003. + since approximately 2001. `pawk' Nelson H.F. Beebe at the University of Utah has modified BWK `awk' @@ -28739,7 +28769,7 @@ released versions of `gawk'. changes, you will probably wish to work with the development version. To do so, you will need to access the `gawk' source code repository. The code is maintained using the Git distributed version control system -(http://git-scm.com/). You will need to install it if your system +(http://git-scm.com). You will need to install it if your system doesn't have it. Once you have done so, use the command: git clone git://git.savannah.gnu.org/gawk.git @@ -28794,7 +28824,7 @@ possible to include them: document describes how GNU software should be written. 
If you haven't read it, please do so, preferably _before_ starting to modify `gawk'. (The `GNU Coding Standards' are available from the - GNU Project's website (http://www.gnu.org/prep/standards_toc.html). + GNU Project's website (http://www.gnu.org/prep/standards/). Texinfo, Info, and DVI versions are also available.) 5. Use the `gawk' coding style. The C code for `gawk' follows the @@ -29676,6 +29706,21 @@ ANSI C++ programming languages. These standards often become international standards as well. See also "ISO." +Argument + An argument can be two different things. It can be an option or a + file name passed to a command while invoking it from the command + line, or it can be something passed to a "function" inside a + program, e.g. inside `awk'. + + In the latter case, an argument can be passed to a function in two + ways. Either it is given to the called function by value, i.e., a + copy of the value of the variable is made available to the called + function, but the original variable cannot be modified by the + function itself; or it is given by reference, i.e., a pointer to + the interested variable is passed to the function, which can then + directly modify it. In `awk' scalars are passed by value, and + arrays are passed by reference. See "Pass By Value/Reference." + Array A grouping of multiple values under the same name. Most languages just provide sequential arrays. `awk' provides associative arrays. @@ -29711,6 +29756,26 @@ Bash The GNU version of the standard shell (the Bourne-Again SHell). See also "Bourne Shell." +Binary + Base-two notation, where the digits are `0'-`1'. Since electronic + circuitry works "naturally" in base 2 (just think of Off/On), + everything inside a computer is calculated using base 2. Each digit + represents the presence (or absence) of a power of 2 and is called + a "bit". So, for example, the base-two number `10101' is the same + as decimal 21, ((1 x 16) + (1 x 4) + (1 x 1)). 
+ Since base-two numbers quickly become very long to read and write, + they are usually grouped by 3 (i.e., they are read as octal + numbers), or by 4 (i.e., they are read as hexadecimal numbers). + There is no direct way to insert base 2 numbers in a C program. + If the need arises, such numbers are usually inserted as octal or + hexadecimal numbers. The number of base-two digits that fit into + registers used for representing integer numbers in computers is a + rough indication of the computing power of the computer itself. + Most computers nowadays use 64 bits for representing integer + numbers in their registers, but 32-bit, 16-bit and 8-bit registers + have been widely used in the past. *Note Nondecimal-numbers::. + Bit Short for "Binary Digit." All values in computer memory ultimately reduce to binary digits: values that are either zero or @@ -29739,6 +29804,19 @@ Braces The characters `{' and `}'. Braces are used in `awk' for delimiting actions, compound statements, and function bodies. +Bracket Expression + Inside a "regular expression", an expression included in square + brackets, meant to designate a single character as belonging to a + specified character class. A bracket expression can contain a list + of one or more characters, like `[abc]', a range of characters, + like `[A-Z]', or a name, delimited by `:', that designates a known + set of characters, like `[:digit:]'. The form of bracket expression + enclosed between `:' is independent of the underlying + representation of the characters themselves, which could utilize + the ASCII, EBCDIC, or Unicode codesets, depending on the + architecture of the computer system, and on localization. See + also "Regular Expression." + Built-in Function The `awk' language provides built-in functions that perform various numerical, I/O-related, and string computations. Examples are @@ -29766,9 +29844,25 @@ C In general, `gawk' attempts to be as similar to the 1990 version of ISO C as makes sense.
+C Shell + The C Shell (`csh' or its improved version, `tcsh') is a Unix + shell that was created by Bill Joy in the late 1970s. The C shell + was differentiated from other shells by its interactive features + and overall style, which looks more like C. The C Shell is not + backward compatible with the Bourne Shell, so special attention is + required when converting scripts written for other Unix shells to + the C shell, especially with regard to the management of shell + variables. See also "Bourne Shell." + C++ A popular object-oriented programming language derived from C. +Character Class + See "Bracket Expression." + +Character List + See "Bracket Expression." + Character Set The set of numeric codes used by a computer system to represent the characters (letters, numbers, punctuation, etc.) of a particular @@ -29783,7 +29877,7 @@ CHEM A preprocessor for `pic' that reads descriptions of molecules and produces `pic' input for drawing them. It was written in `awk' by Brian Kernighan and Jon Bentley, and is available from - `http://netlib.sandia.gov/netlib/typesetting/chem.gz'. + `http://netlib.org/typesetting/chem'. Comparison Expression A relation that is either true or false, such as `a < b'. @@ -29796,10 +29890,21 @@ Compiler machine-executable object code. The object code is then executed directly by the computer. See also "Interpreter." +Complemented Bracket Expression + The negation of a "bracket expression". All that is _not_ + described by a given bracket expression. The symbol `^' precedes + the negated bracket expression. E.g.: `[^[:digit:]]' designates + whatever character is not a digit. `[^bad]' designates whatever + character is not one of the letters `b', `a', or `d'. See + "Bracket Expression." + Compound Statement A series of `awk' statements, enclosed in curly braces. Compound statements may be nested. (*Note Statements::.) +Computed Regexps + See "Dynamic Regular Expressions."
+ Concatenation Concatenating two strings means sticking them together, one after another, producing a new string. For example, the string `foo' @@ -29813,6 +29918,12 @@ Conditional Expression otherwise the value is EXPR3. In either case, only one of EXPR2 and EXPR3 is evaluated. (*Note Conditional Exp::.) +Control Statement + A control statement is an instruction to perform a given operation + or a set of operations inside an `awk' program, if a given + condition is true. Control statements are: `if', `for', `while', + and `do' (*note Statements::). + Cookie A peculiar goodie, token, saying or remembrance produced by or presented to a program. (With thanks to Professor Doug McIlroy.) @@ -29919,6 +30030,12 @@ Format are controlled by the format strings contained in the predefined variables `CONVFMT' and `OFMT'. (*Note Control Letters::.) +Fortran + Shorthand for FORmula TRANslator, one of the first programming + languages available for scientific calculations. It was created by + John Backus, and has been available since 1957. It is still in use + today. + Free Documentation License This document describes the terms under which this Info file is published and may be copied. (*Note GNU Free Documentation @@ -29934,9 +30051,16 @@ FSF See "Free Software Foundation." Function - A specialized group of statements used to encapsulate general or - program-specific tasks. `awk' has a number of built-in functions, - and also allows you to define your own. (*Note Functions::.) + A part of an `awk' program that can be invoked from every point of + the program, to perform a task. `awk' has several built-in + functions. Users can define their own functions in every part of + the program. Functions can be recursive, i.e., they may invoke + themselves. *Note Functions::. In `gawk' it is also possible to + have functions shared among different programs, and included where + required using the `@include' directive (*note Include Files::). 
+ In `gawk' the name of the function that should be invoked can be + generated at run time, i.e., dynamically. The `gawk' extension + API provides constructor functions (*note Constructor Functions::). `gawk' The GNU implementation of `awk'. @@ -30032,6 +30156,12 @@ Keyword `else', `exit', `for...in', `for', `function', `func', `if', `next', `nextfile', `switch', and `while'. +Korn Shell + The Korn Shell (`ksh') is a Unix shell which was developed by + David Korn at Bell Laboratories in the early 1980s. The Korn Shell + is backward-compatible with the Bourne shell and includes many + features of the C shell. See also "Bourne Shell." + Lesser General Public License This document describes the terms under which binary library archives or shared objects, and their source code may be @@ -30069,6 +30199,13 @@ Metacharacters Instead, they denote regular expression operations, such as repetition, grouping, or alternation. +Nesting + Nesting is where information is organized in layers, or where + objects contain other similar objects. In `gawk' the `@include' + directive can be nested. The "natural" nesting of arithmetic and + logical operations can be changed using parentheses (*note + Precedence::). + No-op An operation that does nothing. @@ -30088,6 +30225,11 @@ Octal are written in C using a leading `0', to indicate their base. Thus, `013' is 11 ((1 x 8) + 3). *Note Nondecimal-numbers::. +Output Record + A single chunk of data that is written out by `awk'. Usually, an + `awk' output record consists of one or more lines of text. *Note + Records::. + Pattern Patterns tell `awk' which input records are interesting to which rules. @@ -30103,6 +30245,9 @@ PEBKAC computer usage problems. (Problem Exists Between Keyboard And Chair.) +Plug-in + See "Extensions." + POSIX The name for a series of standards that specify a Portable Operating System interface. 
The "IX" denotes the Unix heritage of @@ -30126,6 +30271,9 @@ Range (of input lines) can specify ranges of input lines for `awk' to process or it can specify single lines. (*Note Pattern Overview::.) +Record + See "Input record" and "Output record." + Recursion When a function calls itself, either directly or indirectly. If this is clear, stop, and proceed to the next entry. Otherwise, @@ -30142,6 +30290,16 @@ Redirection using the `>', `>>', `|', and `|&' operators. (*Note Getline::, and *note Redirection::.) +Reference Counts + An internal mechanism in `gawk' to minimize the amount of memory + needed to store the value of string variables. If the value + assumed by a variable is used in more than one place, only one + copy of the value itself is kept, and the associated reference + count is increased when the same value is used by an additional + variable, and decreased when the related variable is no longer in + use. When the reference count goes to zero, the memory space used + to store the value of the variable is freed. + Regexp See "Regular Expression." @@ -30160,6 +30318,15 @@ Regular Expression Constant when you write the `awk' program and cannot be changed during its execution. (*Note Regexp Usage::.) +Regular Expression Operators + See "Metacharacters." + +Rounding + Rounding the result of an arithmetic operation can be tricky. + More than one way of rounding exists, and in `gawk' it is possible + to choose which method should be used in a program. *Note Setting + the rounding mode::. + Rule A segment of an `awk' program that specifies how to process single input records. A rule consists of a "pattern" and an "action". @@ -30221,6 +30388,11 @@ Special File handed directly to the underlying operating system--for example, `/dev/stderr'. (*Note Special Files::.) +Statement + An expression inside an `awk' program in the action part of a + pattern-action rule, or inside an `awk' function. 
A statement can + be a variable assignment, an array operation, a loop, etc. + Stream Editor A program that reads records from an input stream and processes them one or more at a time. This is in contrast with batch @@ -30263,10 +30435,15 @@ UTC reference time for day and date calculations. See also "Epoch" and "GMT." +Variable + A name for a value. In `awk', variables may be either scalars or + arrays. + Whitespace A sequence of space, TAB, or newline characters occurring inside an input record or a string. + File: gawk.info, Node: Copying, Next: GNU Free Documentation License, Prev: Glossary, Up: Top @@ -31499,7 +31676,7 @@ Index * ! (exclamation point), !~ operator <5>: Case-sensitivity. (line 26) * ! (exclamation point), !~ operator <6>: Computed Regexps. (line 6) * ! (exclamation point), !~ operator: Regexp Usage. (line 19) -* " (double quote), in regexp constants: Computed Regexps. (line 29) +* " (double quote), in regexp constants: Computed Regexps. (line 30) * " (double quote), in shell commands: Quoting. (line 54) * # (number sign), #! (executable scripts): Executable Scripts. (line 6) @@ -31528,7 +31705,7 @@ Index * * (asterisk), * operator, as regexp operator: Regexp Operators. (line 89) * * (asterisk), * operator, null strings, matching: String Functions. - (line 536) + (line 537) * * (asterisk), ** operator <1>: Precedence. (line 49) * * (asterisk), ** operator: Arithmetic Ops. (line 81) * * (asterisk), **= operator <1>: Precedence. (line 95) @@ -31587,7 +31764,7 @@ Index * --re-interval option: Options. (line 279) * --sandbox option: Options. (line 286) * --sandbox option, disabling system() function: I/O Functions. - (line 128) + (line 129) * --sandbox option, input redirection with getline: Getline. (line 19) * --sandbox option, output redirection with print, printf: Redirection. (line 6) @@ -31633,7 +31810,7 @@ Index * -W option: Options. (line 46) * . (period), regexp operator: Regexp Operators. (line 44) * .gmo files: Explaining gettext. 
(line 42) -* .gmo files, specifying directory of <1>: Programmer i18n. (line 47) +* .gmo files, specifying directory of <1>: Programmer i18n. (line 48) * .gmo files, specifying directory of: Explaining gettext. (line 54) * .mo files, converting from .po: I18N Example. (line 64) * .po files <1>: Translator i18n. (line 6) @@ -31734,7 +31911,7 @@ Index * \ (backslash), in escape sequences: Escape Sequences. (line 6) * \ (backslash), in escape sequences, POSIX and: Escape Sequences. (line 108) -* \ (backslash), in regexp constants: Computed Regexps. (line 29) +* \ (backslash), in regexp constants: Computed Regexps. (line 30) * \ (backslash), in shell commands: Quoting. (line 48) * \ (backslash), regexp operator: Regexp Operators. (line 18) * ^ (caret), ^ operator: Precedence. (line 49) @@ -31828,7 +32005,7 @@ Index * arrays: Arrays. (line 6) * arrays of arrays: Arrays of Arrays. (line 6) * arrays, an example of using: Array Example. (line 6) -* arrays, and IGNORECASE variable: Array Intro. (line 94) +* arrays, and IGNORECASE variable: Array Intro. (line 100) * arrays, as parameters to functions: Pass By Value/Reference. (line 44) * arrays, associative: Array Intro. (line 50) @@ -31855,14 +32032,14 @@ Index (line 6) * arrays, sorting, and IGNORECASE variable: Array Sorting Functions. (line 83) -* arrays, sparse: Array Intro. (line 72) +* arrays, sparse: Array Intro. (line 76) * arrays, subscripts, uninitialized variables as: Uninitialized Subscripts. (line 6) * arrays, unassigned elements: Reference to Elements. (line 18) * artificial intelligence, gawk and: Distribution contents. (line 52) -* ASCII <1>: Glossary. (line 133) +* ASCII <1>: Glossary. (line 197) * ASCII: Ordinal Functions. (line 45) * asort <1>: Array Sorting Functions. (line 6) @@ -31889,7 +32066,7 @@ Index * asterisk (*), * operator, as regexp operator: Regexp Operators. (line 89) * asterisk (*), * operator, null strings, matching: String Functions. 
- (line 536) + (line 537) * asterisk (*), ** operator <1>: Precedence. (line 49) * asterisk (*), ** operator: Arithmetic Ops. (line 81) * asterisk (*), **= operator <1>: Precedence. (line 95) @@ -32003,7 +32180,7 @@ Index * backslash (\), in escape sequences: Escape Sequences. (line 6) * backslash (\), in escape sequences, POSIX and: Escape Sequences. (line 108) -* backslash (\), in regexp constants: Computed Regexps. (line 29) +* backslash (\), in regexp constants: Computed Regexps. (line 30) * backslash (\), in shell commands: Quoting. (line 48) * backslash (\), regexp operator: Regexp Operators. (line 18) * backtrace debugger command: Execution Stack. (line 13) @@ -32033,13 +32210,13 @@ Index * BEGINFILE pattern: BEGINFILE/ENDFILE. (line 6) * BEGINFILE pattern, Boolean patterns and: Expression Patterns. (line 69) -* beginfile() user-defined function: Filetrans Function. (line 61) -* Bentley, Jon: Glossary. (line 143) +* beginfile() user-defined function: Filetrans Function. (line 62) +* Bentley, Jon: Glossary. (line 207) * Benzinger, Michael: Contributors. (line 97) * Berry, Karl <1>: Ranges and Locales. (line 74) * Berry, Karl: Acknowledgments. (line 33) * binary input/output: User-modified. (line 15) -* bindtextdomain <1>: Programmer i18n. (line 47) +* bindtextdomain <1>: Programmer i18n. (line 48) * bindtextdomain: I18N Functions. (line 12) * bindtextdomain() function (C library): Explaining gettext. (line 50) * bindtextdomain() function (gawk), portability and: I18N Portability. @@ -32096,7 +32273,7 @@ Index * Brennan, Michael: Foreword3. (line 84) * Brian Kernighan's awk <1>: I/O Functions. (line 43) * Brian Kernighan's awk <2>: Gory Details. (line 19) -* Brian Kernighan's awk <3>: String Functions. (line 492) +* Brian Kernighan's awk <3>: String Functions. (line 493) * Brian Kernighan's awk <4>: Delete. (line 51) * Brian Kernighan's awk <5>: Nextfile Statement. (line 47) * Brian Kernighan's awk <6>: Continue Statement. 
(line 44) @@ -32116,14 +32293,14 @@ Index * Brink, Jeroen: DOS Quoting. (line 10) * Broder, Alan J.: Contributors. (line 88) * Brown, Martin: Contributors. (line 82) -* BSD-based operating systems: Glossary. (line 611) +* BSD-based operating systems: Glossary. (line 753) * bt debugger command (alias for backtrace): Execution Stack. (line 13) * Buening, Andreas <1>: Bugs. (line 70) * Buening, Andreas <2>: Contributors. (line 92) * Buening, Andreas: Acknowledgments. (line 60) * buffering, input/output <1>: Two-way I/O. (line 52) -* buffering, input/output: I/O Functions. (line 140) -* buffering, interactive vs. noninteractive: I/O Functions. (line 75) +* buffering, input/output: I/O Functions. (line 141) +* buffering, interactive vs. noninteractive: I/O Functions. (line 76) * buffers, flushing: I/O Functions. (line 32) * buffers, operators for: GNU Regexp Operators. (line 48) @@ -32148,8 +32325,8 @@ Index * case keyword: Switch Statement. (line 6) * case sensitivity, and regexps: User-modified. (line 76) * case sensitivity, and string comparisons: User-modified. (line 76) -* case sensitivity, array indices and: Array Intro. (line 94) -* case sensitivity, converting case: String Functions. (line 522) +* case sensitivity, array indices and: Array Intro. (line 100) +* case sensitivity, converting case: String Functions. (line 523) * case sensitivity, example programs: Library Functions. (line 53) * case sensitivity, gawk: Case-sensitivity. (line 26) * case sensitivity, regexps and: Case-sensitivity. (line 6) @@ -32158,7 +32335,7 @@ Index (line 56) * character lists in regular expression: Bracket Expressions. (line 6) * character lists, See bracket expressions: Regexp Operators. (line 56) -* character sets (machine character encodings) <1>: Glossary. (line 133) +* character sets (machine character encodings) <1>: Glossary. (line 197) * character sets (machine character encodings): Ordinal Functions. 
(line 45) * character sets, See Also bracket expressions: Regexp Operators. @@ -32169,7 +32346,7 @@ Index * Chassell, Robert J.: Acknowledgments. (line 33) * chdir() extension function: Extension Sample File Functions. (line 12) -* chem utility: Glossary. (line 143) +* chem utility: Glossary. (line 207) * chr() extension function: Extension Sample Ord. (line 15) * chr() user-defined function: Ordinal Functions. (line 16) @@ -32227,7 +32404,7 @@ Index * common extensions, \x escape sequence: Escape Sequences. (line 61) * common extensions, BINMODE variable: PC Using. (line 33) * common extensions, delete to delete entire arrays: Delete. (line 39) -* common extensions, func keyword: Definition Syntax. (line 93) +* common extensions, func keyword: Definition Syntax. (line 98) * common extensions, length() applied to an array: String Functions. (line 201) * common extensions, RS as a regexp: gawk split records. (line 6) @@ -32246,7 +32423,7 @@ Index * compatibility mode (gawk), octal numbers: Nondecimal-numbers. (line 60) * compatibility mode (gawk), specifying: Options. (line 81) -* compiled programs <1>: Glossary. (line 155) +* compiled programs <1>: Glossary. (line 219) * compiled programs: Basic High Level. (line 15) * compiling gawk for Cygwin: Cygwin. (line 6) * compiling gawk for MS-DOS and MS-Windows: PC Compiling. (line 13) @@ -32278,9 +32455,9 @@ Index * control statements: Statements. (line 6) * controlling array scanning order: Controlling Scanning. (line 14) -* convert string to lower case: String Functions. (line 523) -* convert string to number: String Functions. (line 390) -* convert string to upper case: String Functions. (line 529) +* convert string to lower case: String Functions. (line 524) +* convert string to number: String Functions. (line 391) +* convert string to upper case: String Functions. (line 530) * converting integer array subscripts: Numeric Array Subscripts. (line 31) * converting, dates to timestamps: Time Functions. 
(line 76) @@ -32292,7 +32469,7 @@ Index * CONVFMT variable: Strings And Numbers. (line 29) * CONVFMT variable, and array subscripts: Numeric Array Subscripts. (line 6) -* cookie: Glossary. (line 177) +* cookie: Glossary. (line 258) * coprocesses <1>: Two-way I/O. (line 25) * coprocesses: Redirection. (line 96) * coprocesses, closing: Close Files And Pipes. @@ -32316,7 +32493,7 @@ Index * cut.awk program: Cut Program. (line 45) * d debugger command (alias for delete): Breakpoint Control. (line 64) * d.c., See dark corner: Conventions. (line 42) -* dark corner <1>: Glossary. (line 188) +* dark corner <1>: Glossary. (line 269) * dark corner: Conventions. (line 42) * dark corner, "0" is actually true: Truth Values. (line 24) * dark corner, /= operator vs. /=.../ regexp constant: Assignment Ops. @@ -32358,7 +32535,7 @@ Index (line 148) * dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps. (line 43) -* dark corner, split() function: String Functions. (line 361) +* dark corner, split() function: String Functions. (line 362) * dark corner, strings, storing: gawk split records. (line 83) * dark corner, value of ARGV[0]: Auto-set. (line 39) * data, fixed-width: Constant Size. (line 6) @@ -32373,11 +32550,11 @@ Index * Davies, Stephen <1>: Contributors. (line 74) * Davies, Stephen: Acknowledgments. (line 60) * Day, Robert P.J.: Acknowledgments. (line 78) -* dcgettext <1>: Programmer i18n. (line 19) +* dcgettext <1>: Programmer i18n. (line 20) * dcgettext: I18N Functions. (line 22) * dcgettext() function (gawk), portability and: I18N Portability. (line 33) -* dcngettext <1>: Programmer i18n. (line 36) +* dcngettext <1>: Programmer i18n. (line 37) * dcngettext: I18N Functions. (line 28) * dcngettext() function (gawk), portability and: I18N Portability. (line 33) @@ -32464,7 +32641,7 @@ Index * debugger commands, t (tbreak): Breakpoint Control. (line 90) * debugger commands, tbreak: Breakpoint Control. 
(line 90) * debugger commands, trace: Miscellaneous Debugger Commands. - (line 108) + (line 107) * debugger commands, u (until): Debugger Execution Control. (line 83) * debugger commands, undisplay: Viewing And Changing Data. @@ -32480,12 +32657,12 @@ Index (line 67) * debugger commands, where (backtrace): Execution Stack. (line 13) * debugger default list amount: Debugger Info. (line 69) -* debugger history file: Debugger Info. (line 80) +* debugger history file: Debugger Info. (line 81) * debugger history size: Debugger Info. (line 65) * debugger options: Debugger Info. (line 57) -* debugger prompt: Debugger Info. (line 77) +* debugger prompt: Debugger Info. (line 78) * debugger, how to start: Debugger Invocation. (line 6) -* debugger, read commands from a file: Debugger Info. (line 96) +* debugger, read commands from a file: Debugger Info. (line 97) * debugging awk programs: Debugger. (line 6) * debugging gawk, bug reports: Bugs. (line 9) * decimal point character, locale specific: Options. (line 270) @@ -32575,7 +32752,7 @@ Index (line 77) * differences in awk and gawk, SYMTAB variable: Auto-set. (line 283) * differences in awk and gawk, TEXTDOMAIN variable: User-modified. - (line 151) + (line 152) * differences in awk and gawk, trunc-mod operation: Arithmetic Ops. (line 66) * directories, command-line: Command-line directories. @@ -32601,7 +32778,7 @@ Index * dollar sign ($), incrementing fields and arrays: Increment Ops. (line 30) * dollar sign ($), regexp operator: Regexp Operators. (line 35) -* double quote ("), in regexp constants: Computed Regexps. (line 29) +* double quote ("), in regexp constants: Computed Regexps. (line 30) * double quote ("), in shell commands: Quoting. (line 54) * down debugger command: Execution Stack. (line 23) * Drepper, Ulrich: Acknowledgments. (line 52) @@ -32653,9 +32830,9 @@ Index * END pattern, print statement and: I/O And BEGIN/END. (line 16) * ENDFILE pattern: BEGINFILE/ENDFILE. 
(line 6) * ENDFILE pattern, Boolean patterns and: Expression Patterns. (line 69) -* endfile() user-defined function: Filetrans Function. (line 61) -* endgrent() function (C library): Group Functions. (line 211) -* endgrent() user-defined function: Group Functions. (line 214) +* endfile() user-defined function: Filetrans Function. (line 62) +* endgrent() function (C library): Group Functions. (line 212) +* endgrent() user-defined function: Group Functions. (line 215) * endpwent() function (C library): Passwd Functions. (line 207) * endpwent() user-defined function: Passwd Functions. (line 210) * English, Steve: Advanced Features. (line 6) @@ -32663,7 +32840,7 @@ Index * environment variables used by gawk: Environment Variables. (line 6) * environment variables, in ENVIRON array: Auto-set. (line 60) -* epoch, definition of: Glossary. (line 234) +* epoch, definition of: Glossary. (line 315) * equals sign (=), = operator: Assignment Ops. (line 6) * equals sign (=), == operator <1>: Precedence. (line 65) * equals sign (=), == operator: Comparison Operators. @@ -32710,7 +32887,7 @@ Index * exit the debugger: Miscellaneous Debugger Commands. (line 99) * exp: Numeric Functions. (line 33) -* expand utility: Very Simple. (line 72) +* expand utility: Very Simple. (line 73) * Expat XML parser library: gawkextlib. (line 33) * exponent: Numeric Functions. (line 33) * expressions: Expressions. (line 6) @@ -32749,7 +32926,7 @@ Index * extensions, common, BINMODE variable: PC Using. (line 33) * extensions, common, delete to delete entire arrays: Delete. (line 39) * extensions, common, fflush() function: I/O Functions. (line 43) -* extensions, common, func keyword: Definition Syntax. (line 93) +* extensions, common, func keyword: Definition Syntax. (line 98) * extensions, common, length() applied to an array: String Functions. (line 201) * extensions, common, RS as a regexp: gawk split records. 
(line 6) @@ -32816,7 +32993,7 @@ Index * FILENAME variable, getline, setting with: Getline Notes. (line 19) * filenames, assignments as: Ignoring Assigns. (line 6) * files, .gmo: Explaining gettext. (line 42) -* files, .gmo, specifying directory of <1>: Programmer i18n. (line 47) +* files, .gmo, specifying directory of <1>: Programmer i18n. (line 48) * files, .gmo, specifying directory of: Explaining gettext. (line 54) * files, .mo, converting from .po: I18N Example. (line 64) * files, .po <1>: Translator i18n. (line 6) @@ -32843,7 +33020,7 @@ Index * files, message object, converting from portable object files: I18N Example. (line 64) * files, message object, specifying directory of <1>: Programmer i18n. - (line 47) + (line 48) * files, message object, specifying directory of: Explaining gettext. (line 54) * files, multiple passes over: Other Arguments. (line 56) @@ -32895,7 +33072,7 @@ Index * format time string: Time Functions. (line 48) * formats, numeric output: OFMT. (line 6) * formatting output: Printf. (line 6) -* formatting strings: String Functions. (line 383) +* formatting strings: String Functions. (line 384) * forward slash (/) to enclose regular expressions: Regexp. (line 10) * forward slash (/), / operator: Precedence. (line 55) * forward slash (/), /= operator <1>: Precedence. (line 95) @@ -32909,10 +33086,10 @@ Index * frame debugger command: Execution Stack. (line 27) * Free Documentation License (FDL): GNU Free Documentation License. (line 7) -* Free Software Foundation (FSF) <1>: Glossary. (line 288) +* Free Software Foundation (FSF) <1>: Glossary. (line 375) * Free Software Foundation (FSF) <2>: Getting. (line 10) * Free Software Foundation (FSF): Manual History. (line 6) -* FreeBSD: Glossary. (line 611) +* FreeBSD: Glossary. (line 753) * FS variable <1>: User-modified. (line 50) * FS variable: Field Separators. (line 15) * FS variable, --field-separator option and: Options. 
(line 21) @@ -32926,7 +33103,7 @@ Index * FS, containing ^: Regexp Field Splitting. (line 59) * FS, in multiline records: Multiple Line. (line 41) -* FSF (Free Software Foundation) <1>: Glossary. (line 288) +* FSF (Free Software Foundation) <1>: Glossary. (line 375) * FSF (Free Software Foundation) <2>: Getting. (line 10) * FSF (Free Software Foundation): Manual History. (line 6) * fts() extension function: Extension Sample File Functions. @@ -32966,7 +33143,7 @@ Index * functions, library, user database, reading: Passwd Functions. (line 6) * functions, names of: Definition Syntax. (line 23) -* functions, recursive: Definition Syntax. (line 83) +* functions, recursive: Definition Syntax. (line 88) * functions, string-translation: I18N Functions. (line 6) * functions, undefined: Pass By Value/Reference. (line 68) @@ -32987,7 +33164,7 @@ Index * gawk, awk and: Preface. (line 21) * gawk, bitwise operations in: Bitwise Functions. (line 40) * gawk, break statement in: Break Statement. (line 51) -* gawk, character classes and: Bracket Expressions. (line 100) +* gawk, character classes and: Bracket Expressions. (line 101) * gawk, coding style in: Adding Code. (line 38) * gawk, command-line options, and regular expressions: GNU Regexp Operators. (line 70) @@ -33022,7 +33199,7 @@ Index * gawk, IGNORECASE variable in <1>: Array Sorting Functions. (line 83) * gawk, IGNORECASE variable in <2>: String Functions. (line 58) -* gawk, IGNORECASE variable in <3>: Array Intro. (line 94) +* gawk, IGNORECASE variable in <3>: Array Intro. (line 100) * gawk, IGNORECASE variable in <4>: User-modified. (line 76) * gawk, IGNORECASE variable in: Case-sensitivity. (line 26) * gawk, implementation issues: Notes. (line 6) @@ -33064,7 +33241,7 @@ Index * gawk, splitting fields and: Constant Size. (line 87) * gawk, string-translation functions: I18N Functions. (line 6) * gawk, SYMTAB array in: Auto-set. (line 283) -* gawk, TEXTDOMAIN variable in: User-modified. 
(line 151) +* gawk, TEXTDOMAIN variable in: User-modified. (line 152) * gawk, timestamps: Time Functions. (line 6) * gawk, uses for: Preface. (line 34) * gawk, versions of, information about, printing: Options. (line 300) @@ -33079,7 +33256,7 @@ Index * gawkpath_append shell function: Shell Startup Files. (line 19) * gawkpath_default shell function: Shell Startup Files. (line 12) * gawkpath_prepend shell function: Shell Startup Files. (line 15) -* General Public License (GPL): Glossary. (line 305) +* General Public License (GPL): Glossary. (line 399) * General Public License, See GPL: Manual History. (line 11) * generate time values: Time Functions. (line 25) * gensub <1>: String Functions. (line 90) @@ -33089,12 +33266,12 @@ Index * getaddrinfo() function (C library): TCP/IP Networking. (line 38) * getgrent() function (C library): Group Functions. (line 6) * getgrent() user-defined function: Group Functions. (line 6) -* getgrgid() function (C library): Group Functions. (line 182) -* getgrgid() user-defined function: Group Functions. (line 185) -* getgrnam() function (C library): Group Functions. (line 171) -* getgrnam() user-defined function: Group Functions. (line 176) -* getgruser() function (C library): Group Functions. (line 191) -* getgruser() function, user-defined: Group Functions. (line 194) +* getgrgid() function (C library): Group Functions. (line 183) +* getgrgid() user-defined function: Group Functions. (line 186) +* getgrnam() function (C library): Group Functions. (line 172) +* getgrnam() user-defined function: Group Functions. (line 177) +* getgruser() function (C library): Group Functions. (line 192) +* getgruser() function, user-defined: Group Functions. (line 195) * getline command: Reading Files. (line 20) * getline command, _gr_init() user-defined function: Group Functions. (line 83) @@ -33111,7 +33288,7 @@ Index * getline from a file: Getline/File. (line 6) * getline into a variable: Getline/Variable. 
(line 6) * getline statement, BEGINFILE/ENDFILE patterns and: BEGINFILE/ENDFILE. - (line 54) + (line 53) * getlocaltime() user-defined function: Getlocaltime Function. (line 16) * getopt() function (C library): Getopt Function. (line 15) @@ -33137,18 +33314,18 @@ Index * GNU awk, See gawk: Preface. (line 51) * GNU Free Documentation License: GNU Free Documentation License. (line 7) -* GNU General Public License: Glossary. (line 305) -* GNU Lesser General Public License: Glossary. (line 396) +* GNU General Public License: Glossary. (line 399) +* GNU Lesser General Public License: Glossary. (line 496) * GNU long options <1>: Options. (line 6) * GNU long options: Command Line. (line 13) * GNU long options, printing list of: Options. (line 154) -* GNU Project <1>: Glossary. (line 314) +* GNU Project <1>: Glossary. (line 408) * GNU Project: Manual History. (line 11) -* GNU/Linux <1>: Glossary. (line 611) +* GNU/Linux <1>: Glossary. (line 753) * GNU/Linux <2>: I18N Example. (line 55) * GNU/Linux: Manual History. (line 28) * Gordon, Assaf: Contributors. (line 105) -* GPL (General Public License) <1>: Glossary. (line 305) +* GPL (General Public License) <1>: Glossary. (line 399) * GPL (General Public License): Manual History. (line 11) * GPL (General Public License), printing: Options. (line 88) * grcat program: Group Functions. (line 16) @@ -33160,7 +33337,7 @@ Index * gsub <1>: String Functions. (line 140) * gsub: Using Constant Regexps. (line 43) -* gsub() function, arguments of: String Functions. (line 462) +* gsub() function, arguments of: String Functions. (line 463) * gsub() function, escape processing: Gory Details. (line 6) * h debugger command (alias for help): Miscellaneous Debugger Commands. (line 66) @@ -33187,7 +33364,7 @@ Index * hyphen (-), in bracket expressions: Bracket Expressions. (line 17) * i debugger command (alias for info): Debugger Info. (line 13) * id utility: Id Program. (line 6) -* id.awk program: Id Program. 
(line 30) +* id.awk program: Id Program. (line 31) * if statement: If Statement. (line 6) * if statement, actions, changing: Ranges. (line 25) * if statement, use of regexps in: Regexp Usage. (line 19) @@ -33195,7 +33372,7 @@ Index * ignore breakpoint: Breakpoint Control. (line 87) * ignore debugger command: Breakpoint Control. (line 87) * IGNORECASE variable: User-modified. (line 76) -* IGNORECASE variable, and array indices: Array Intro. (line 94) +* IGNORECASE variable, and array indices: Array Intro. (line 100) * IGNORECASE variable, and array sorting functions: Array Sorting Functions. (line 83) * IGNORECASE variable, in example programs: Library Functions. @@ -33255,7 +33432,7 @@ Index * insomnia, cure for: Alarm Program. (line 6) * installation, VMS: VMS Installation. (line 6) * installing gawk: Installation. (line 6) -* instruction tracing, in debugger: Debugger Info. (line 89) +* instruction tracing, in debugger: Debugger Info. (line 90) * int: Numeric Functions. (line 38) * INT signal (MS-Windows): Profiling. (line 213) * integer array indices: Numeric Array Subscripts. @@ -33263,37 +33440,37 @@ Index * integers, arbitrary precision: Arbitrary Precision Integers. (line 6) * integers, unsigned: Computer Arithmetic. (line 41) -* interacting with other programs: I/O Functions. (line 106) +* interacting with other programs: I/O Functions. (line 107) * internationalization <1>: I18N and L10N. (line 6) * internationalization: I18N Functions. (line 6) * internationalization, localization <1>: Internationalization. (line 13) -* internationalization, localization: User-modified. (line 151) +* internationalization, localization: User-modified. (line 152) * internationalization, localization, character classes: Bracket Expressions. - (line 100) + (line 101) * internationalization, localization, gawk and: Internationalization. (line 13) * internationalization, localization, locale categories: Explaining gettext. 
(line 81) * internationalization, localization, marked strings: Programmer i18n. - (line 14) + (line 13) * internationalization, localization, portability and: I18N Portability. (line 6) * internationalizing a program: Explaining gettext. (line 6) -* interpreted programs <1>: Glossary. (line 356) +* interpreted programs <1>: Glossary. (line 450) * interpreted programs: Basic High Level. (line 15) * interval expressions, regexp operator: Regexp Operators. (line 116) * inventory-shipped file: Sample Data Files. (line 32) -* invoke shell command: I/O Functions. (line 106) +* invoke shell command: I/O Functions. (line 107) * isarray: Type Functions. (line 11) -* ISO: Glossary. (line 367) -* ISO 8859-1: Glossary. (line 133) -* ISO Latin-1: Glossary. (line 133) +* ISO: Glossary. (line 461) +* ISO 8859-1: Glossary. (line 197) +* ISO Latin-1: Glossary. (line 197) * Jacobs, Andrew: Passwd Functions. (line 90) * Jaegermann, Michal <1>: Contributors. (line 45) * Jaegermann, Michal: Acknowledgments. (line 60) * Java implementation of awk: Other Versions. (line 117) -* Java programming language: Glossary. (line 379) +* Java programming language: Glossary. (line 473) * jawk: Other Versions. (line 117) * Jedi knights: Undocumented. (line 6) * Johansen, Chris: Signature Program. (line 25) @@ -33302,7 +33479,7 @@ Index * Kahrs, Ju"rgen: Acknowledgments. (line 60) * Kasal, Stepan: Acknowledgments. (line 60) * Kenobi, Obi-Wan: Undocumented. (line 6) -* Kernighan, Brian <1>: Glossary. (line 143) +* Kernighan, Brian <1>: Glossary. (line 207) * Kernighan, Brian <2>: Basic Data Typing. (line 54) * Kernighan, Brian <3>: Other Versions. (line 13) * Kernighan, Brian <4>: Contributors. (line 11) @@ -33343,8 +33520,8 @@ Index * length: String Functions. (line 171) * length of input record: String Functions. (line 178) * length of string: String Functions. (line 171) -* Lesser General Public License (LGPL): Glossary. (line 396) -* LGPL (Lesser General Public License): Glossary. 
(line 396) +* Lesser General Public License (LGPL): Glossary. (line 496) +* LGPL (Lesser General Public License): Glossary. (line 496) * libmawk: Other Versions. (line 125) * libraries of awk functions: Library Functions. (line 6) * libraries of awk functions, assertions: Assert Function. (line 6) @@ -33389,7 +33566,7 @@ Index * lint checking, undefined functions: Pass By Value/Reference. (line 85) * LINT variable: User-modified. (line 88) -* Linux <1>: Glossary. (line 611) +* Linux <1>: Glossary. (line 753) * Linux <2>: I18N Example. (line 55) * Linux: Manual History. (line 28) * list all global variables, in debugger: Debugger Info. (line 48) @@ -33444,20 +33621,20 @@ Index * matching, expressions, See comparison expressions: Typing and Comparison. (line 9) * matching, leftmost longest: Multiple Line. (line 26) -* matching, null strings: String Functions. (line 536) +* matching, null strings: String Functions. (line 537) * mawk utility <1>: Other Versions. (line 48) * mawk utility <2>: Nextfile Statement. (line 47) * mawk utility <3>: Concatenation. (line 36) * mawk utility <4>: Getline/Pipe. (line 62) * mawk utility: Escape Sequences. (line 120) * maximum precision supported by MPFR library: Auto-set. (line 235) -* McIlroy, Doug: Glossary. (line 177) +* McIlroy, Doug: Glossary. (line 258) * McPhee, Patrick: Contributors. (line 100) * message object files: Explaining gettext. (line 42) * message object files, converting from portable object files: I18N Example. (line 64) * message object files, specifying directory of <1>: Programmer i18n. - (line 47) + (line 48) * message object files, specifying directory of: Explaining gettext. (line 54) * messages from extensions: Printing Messages. (line 6) @@ -33479,7 +33656,7 @@ Index * names, functions: Definition Syntax. (line 23) * namespace issues: Library Names. (line 6) * namespace issues, functions: Definition Syntax. (line 23) -* NetBSD: Glossary. (line 611) +* NetBSD: Glossary. 
(line 753) * networks, programming: TCP/IP Networking. (line 6) * networks, support for: Special Network. (line 6) * newlines <1>: Boolean Ops. (line 69) @@ -33488,8 +33665,8 @@ Index * newlines, as field separators: Default Field Splitting. (line 6) * newlines, as record separators: awk split records. (line 12) -* newlines, in dynamic regexps: Computed Regexps. (line 59) -* newlines, in regexp constants: Computed Regexps. (line 69) +* newlines, in dynamic regexps: Computed Regexps. (line 60) +* newlines, in regexp constants: Computed Regexps. (line 70) * newlines, printing: Print Examples. (line 12) * newlines, separating statements in actions <1>: Statements. (line 10) * newlines, separating statements in actions: Action Overview. @@ -33535,7 +33712,7 @@ Index (line 43) * null strings, converting numbers to strings: Strings And Numbers. (line 21) -* null strings, matching: String Functions. (line 536) +* null strings, matching: String Functions. (line 537) * number as string of bits: Bitwise Functions. (line 110) * number of array elements: String Functions. (line 201) * number sign (#), #! (executable scripts): Executable Scripts. @@ -33564,10 +33741,10 @@ Index * OFMT variable <2>: Strings And Numbers. (line 57) * OFMT variable: OFMT. (line 15) * OFMT variable, POSIX awk and: OFMT. (line 27) -* OFS variable <1>: User-modified. (line 113) +* OFS variable <1>: User-modified. (line 114) * OFS variable <2>: Output Separators. (line 6) * OFS variable: Changing Fields. (line 64) -* OpenBSD: Glossary. (line 611) +* OpenBSD: Glossary. (line 753) * OpenSolaris: Other Versions. (line 100) * operating systems, BSD-based: Manual History. (line 28) * operating systems, PC, gawk on: PC Using. (line 6) @@ -33617,7 +33794,7 @@ Index (line 12) * ord() user-defined function: Ordinal Functions. (line 16) * order of evaluation, concatenation: Concatenation. (line 41) -* ORS variable <1>: User-modified. (line 118) +* ORS variable <1>: User-modified. 
(line 119) * ORS variable: Output Separators. (line 21) * output field separator, See OFS variable: Changing Fields. (line 64) * output record separator, See ORS variable: Output Separators. @@ -33693,7 +33870,7 @@ Index (line 65) * portability, deleting array elements: Delete. (line 56) * portability, example programs: Library Functions. (line 42) -* portability, functions, defining: Definition Syntax. (line 109) +* portability, functions, defining: Definition Syntax. (line 114) * portability, gawk: New Ports. (line 6) * portability, gettext library and: Explaining gettext. (line 11) * portability, internationalization and: I18N Portability. (line 6) @@ -33705,7 +33882,7 @@ Index * portability, operators: Increment Ops. (line 60) * portability, operators, not in POSIX awk: Precedence. (line 98) * portability, POSIXLY_CORRECT environment variable: Options. (line 359) -* portability, substr() function: String Functions. (line 512) +* portability, substr() function: String Functions. (line 513) * portable object files <1>: Translator i18n. (line 6) * portable object files: Explaining gettext. (line 37) * portable object files, converting to message object files: I18N Example. @@ -33738,7 +33915,7 @@ Index * POSIX awk, field separators and <1>: Full Line Fields. (line 16) * POSIX awk, field separators and: Fields. (line 6) * POSIX awk, FS variable and: User-modified. (line 60) -* POSIX awk, function keyword in: Definition Syntax. (line 93) +* POSIX awk, function keyword in: Definition Syntax. (line 98) * POSIX awk, functions and, gsub()/sub(): Gory Details. (line 90) * POSIX awk, functions and, length(): String Functions. (line 180) * POSIX awk, GNU long options and: Options. (line 15) @@ -33757,7 +33934,7 @@ Index * POSIX, gawk extensions not included in: POSIX/GNU. (line 6) * POSIX, programs, implementing in awk: Clones. (line 6) * POSIXLY_CORRECT environment variable: Options. (line 339) -* PREC variable: User-modified. (line 123) +* PREC variable: User-modified. 
(line 124) * precedence <1>: Precedence. (line 6) * precedence: Increment Ops. (line 60) * precedence, regexp operators: Regexp Operators. (line 156) @@ -33772,7 +33949,7 @@ Index * print statement, commas, omitting: Print Examples. (line 31) * print statement, I/O operators in: Precedence. (line 71) * print statement, line continuations and: Print Examples. (line 76) -* print statement, OFMT variable and: User-modified. (line 113) +* print statement, OFMT variable and: User-modified. (line 114) * print statement, See Also redirection, of output: Redirection. (line 17) * print statement, sprintf() function and: Round Function. (line 6) @@ -33831,7 +34008,7 @@ Index * programming conventions, functions, calling: Calling Built-in. (line 10) * programming conventions, functions, writing: Definition Syntax. - (line 65) + (line 70) * programming conventions, gawk extensions: Internal File Ops. (line 45) * programming conventions, private variable names: Library Names. @@ -33840,7 +34017,7 @@ Index * programming languages, Ada: Glossary. (line 11) * programming languages, data-driven vs. procedural: Getting Started. (line 12) -* programming languages, Java: Glossary. (line 379) +* programming languages, Java: Glossary. (line 473) * programming, basic steps: Basic High Level. (line 20) * programming, concepts: Basic Concepts. (line 6) * pwcat program: Passwd Functions. (line 23) @@ -33887,7 +34064,7 @@ Index * readfile() user-defined function: Readfile Function. (line 30) * reading input files: Reading Files. (line 6) * recipe for a programming language: History. (line 6) -* record separators <1>: User-modified. (line 132) +* record separators <1>: User-modified. (line 133) * record separators: awk split records. (line 6) * record separators, changing: awk split records. (line 85) * record separators, regular expressions as: awk split records. @@ -33900,8 +34077,8 @@ Index * records, splitting input into: Records. (line 6) * records, terminating: awk split records. 
(line 125) * records, treating files as: gawk split records. (line 93) -* recursive functions: Definition Syntax. (line 83) -* redirect gawk output, in debugger: Debugger Info. (line 72) +* recursive functions: Definition Syntax. (line 88) +* redirect gawk output, in debugger: Debugger Info. (line 73) * redirection of input: Getline/File. (line 6) * redirection of output: Redirection. (line 6) * reference counting, sorting arrays: Array Sorting Functions. @@ -33915,8 +34092,8 @@ Index * regexp constants, as patterns: Expression Patterns. (line 34) * regexp constants, in gawk: Using Constant Regexps. (line 28) -* regexp constants, slashes vs. quotes: Computed Regexps. (line 29) -* regexp constants, vs. string constants: Computed Regexps. (line 39) +* regexp constants, slashes vs. quotes: Computed Regexps. (line 30) +* regexp constants, vs. string constants: Computed Regexps. (line 40) * register extension: Registration Functions. (line 6) * regular expressions: Regexp. (line 6) @@ -33935,7 +34112,7 @@ Index (line 57) * regular expressions, dynamic: Computed Regexps. (line 6) * regular expressions, dynamic, with embedded newlines: Computed Regexps. - (line 59) + (line 60) * regular expressions, gawk, command-line options: GNU Regexp Operators. (line 70) * regular expressions, interval expressions and: Options. (line 279) @@ -33954,7 +34131,7 @@ Index * regular expressions, searching for: Egrep Program. (line 6) * relational operators, See comparison operators: Typing and Comparison. (line 9) -* replace in string: String Functions. (line 408) +* replace in string: String Functions. (line 409) * return debugger command: Debugger Execution Control. (line 54) * return statement, user-defined functions: Return Statement. (line 6) @@ -33965,7 +34142,7 @@ Index (line 11) * revtwoway extension: Extension Sample Rev2way. (line 12) -* rewind() user-defined function: Rewind Function. (line 16) +* rewind() user-defined function: Rewind Function. 
(line 15) * right angle bracket (>), > operator <1>: Precedence. (line 65) * right angle bracket (>), > operator: Comparison Operators. (line 11) @@ -33999,8 +34176,8 @@ Index * round to nearest integer: Numeric Functions. (line 38) * round() user-defined function: Round Function. (line 16) * rounding numbers: Round Function. (line 6) -* ROUNDMODE variable: User-modified. (line 127) -* RS variable <1>: User-modified. (line 132) +* ROUNDMODE variable: User-modified. (line 128) +* RS variable <1>: User-modified. (line 133) * RS variable: awk split records. (line 12) * RS variable, multiline records and: Multiple Line. (line 17) * rshift: Bitwise Functions. (line 53) @@ -34020,7 +34197,7 @@ Index * sample debugging session: Sample Debugging Session. (line 6) * sandbox mode: Options. (line 286) -* save debugger options: Debugger Info. (line 84) +* save debugger options: Debugger Info. (line 85) * scalar or array: Type Functions. (line 11) * scalar values: Basic Data Typing. (line 13) * scanning arrays: Scanning an Array. (line 6) @@ -34057,19 +34234,19 @@ Index * separators, field, FIELDWIDTHS variable and: User-modified. (line 37) * separators, field, FPAT variable and: User-modified. (line 43) * separators, field, POSIX and: Fields. (line 6) -* separators, for records <1>: User-modified. (line 132) +* separators, for records <1>: User-modified. (line 133) * separators, for records: awk split records. (line 6) * separators, for records, regular expressions as: awk split records. (line 125) * separators, for statements in actions: Action Overview. (line 19) -* separators, subscript: User-modified. (line 145) +* separators, subscript: User-modified. (line 146) * set breakpoint: Breakpoint Control. (line 11) * set debugger command: Viewing And Changing Data. (line 59) * set directory of message catalogs: I18N Functions. (line 12) * set watchpoint: Viewing And Changing Data. (line 67) -* shadowing of variable values: Definition Syntax. 
(line 71) +* shadowing of variable values: Definition Syntax. (line 76) * shell quoting, rules for: Quoting. (line 6) * shells, piping commands into: Redirection. (line 136) * shells, quoting: Using Shell Variables. @@ -34111,14 +34288,14 @@ Index (line 14) * sidebar, Changing NR and FNR: Auto-set. (line 326) * sidebar, Controlling Output Buffering with system(): I/O Functions. - (line 138) + (line 139) * sidebar, Escape Sequences for Metacharacters: Escape Sequences. (line 137) * sidebar, FS and IGNORECASE: Field Splitting Summary. (line 38) * sidebar, Interactive Versus Noninteractive Buffering: I/O Functions. - (line 73) -* sidebar, Matching the Null String: String Functions. (line 534) + (line 74) +* sidebar, Matching the Null String: String Functions. (line 535) * sidebar, Operator Evaluation Order: Increment Ops. (line 58) * sidebar, Piping into sh: Redirection. (line 134) * sidebar, Pre-POSIX awk Used OFMT for String Conversion: Strings And Numbers. @@ -34126,13 +34303,13 @@ Index * sidebar, Recipe for a Programming Language: History. (line 6) * sidebar, RS = "\0" Is Not Portable: gawk split records. (line 63) * sidebar, So Why Does gawk Have BEGINFILE and ENDFILE?: Filetrans Function. - (line 82) + (line 83) * sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops. (line 146) * sidebar, Understanding #!: Executable Scripts. (line 31) * sidebar, Understanding $0: Changing Fields. (line 134) * sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps. - (line 57) + (line 58) * sidebar, Using close()'s Return Value: Close Files And Pipes. (line 131) * SIGHUP signal, for dynamic profiling: Profiling. (line 210) @@ -34185,16 +34362,16 @@ Index * source code, QuikTrim Awk: Other Versions. (line 139) * source code, Solaris awk: Other Versions. (line 100) * source files, search path for: Programs Exercises. (line 70) -* sparse arrays: Array Intro. (line 72) +* sparse arrays: Array Intro. 
(line 76) * Spencer, Henry: Glossary. (line 16) * split: String Functions. (line 316) * split string into array: String Functions. (line 297) * split utility: Split Program. (line 6) * split() function, array elements, deleting: Delete. (line 61) * split.awk program: Split Program. (line 30) -* sprintf <1>: String Functions. (line 383) +* sprintf <1>: String Functions. (line 384) * sprintf: OFMT. (line 15) -* sprintf() function, OFMT variable and: User-modified. (line 113) +* sprintf() function, OFMT variable and: User-modified. (line 114) * sprintf() function, print/printf statements and: Round Function. (line 6) * sqrt: Numeric Functions. (line 92) @@ -34202,7 +34379,7 @@ Index * square root: Numeric Functions. (line 92) * srand: Numeric Functions. (line 96) * stack frame: Debugging Terms. (line 10) -* Stallman, Richard <1>: Glossary. (line 288) +* Stallman, Richard <1>: Glossary. (line 375) * Stallman, Richard <2>: Contributors. (line 23) * Stallman, Richard <3>: Acknowledgments. (line 18) * Stallman, Richard: Manual History. (line 6) @@ -34226,7 +34403,7 @@ Index * stream editors: Full Line Fields. (line 22) * strftime: Time Functions. (line 48) * string constants: Scalar Constants. (line 15) -* string constants, vs. regexp constants: Computed Regexps. (line 39) +* string constants, vs. regexp constants: Computed Regexps. (line 40) * string extraction (internationalization): String Extraction. (line 6) * string length: String Functions. (line 171) @@ -34238,25 +34415,25 @@ Index * strings splitting, example: String Functions. (line 335) * strings, converting <1>: Bitwise Functions. (line 110) * strings, converting: Strings And Numbers. (line 6) -* strings, converting letter case: String Functions. (line 522) +* strings, converting letter case: String Functions. (line 523) * strings, converting, numbers to: User-modified. (line 30) * strings, empty, See null strings: awk split records. (line 115) * strings, extracting: String Extraction. 
(line 6) -* strings, for localization: Programmer i18n. (line 14) +* strings, for localization: Programmer i18n. (line 13) * strings, length limitations: Scalar Constants. (line 20) * strings, merging arrays into: Join Function. (line 6) * strings, null: Regexp Field Splitting. (line 43) * strings, numeric: Variable Typing. (line 6) -* strtonum: String Functions. (line 390) +* strtonum: String Functions. (line 391) * strtonum() function (gawk), --non-decimal-data option and: Nondecimal Data. (line 35) -* sub <1>: String Functions. (line 408) +* sub <1>: String Functions. (line 409) * sub: Using Constant Regexps. (line 43) -* sub() function, arguments of: String Functions. (line 462) +* sub() function, arguments of: String Functions. (line 463) * sub() function, escape processing: Gory Details. (line 6) -* subscript separators: User-modified. (line 145) +* subscript separators: User-modified. (line 146) * subscripts in arrays, multidimensional: Multidimensional. (line 10) * subscripts in arrays, multidimensional, scanning: Multiscanning. (line 11) @@ -34264,19 +34441,19 @@ Index (line 6) * subscripts in arrays, uninitialized variables as: Uninitialized Subscripts. (line 6) -* SUBSEP variable: User-modified. (line 145) +* SUBSEP variable: User-modified. (line 146) * SUBSEP variable, and multidimensional arrays: Multidimensional. (line 16) * substitute in string: String Functions. (line 90) -* substr: String Functions. (line 481) -* substring: String Functions. (line 481) +* substr: String Functions. (line 482) +* substring: String Functions. (line 482) * Sumner, Andrew: Other Versions. (line 68) * supplementary groups of gawk process: Auto-set. (line 251) * switch statement: Switch Statement. (line 6) * SYMTAB array: Auto-set. (line 283) * syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148) -* system: I/O Functions. (line 106) +* system: I/O Functions. (line 107) * systime: Time Functions. 
(line 66) * t debugger command (alias for tbreak): Breakpoint Control. (line 90) * tbreak debugger command: Breakpoint Control. (line 90) @@ -34302,8 +34479,8 @@ Index (line 6) * text, printing: Print. (line 22) * text, printing, unduplicated lines of: Uniq Program. (line 6) -* TEXTDOMAIN variable <1>: Programmer i18n. (line 9) -* TEXTDOMAIN variable: User-modified. (line 151) +* TEXTDOMAIN variable <1>: Programmer i18n. (line 8) +* TEXTDOMAIN variable: User-modified. (line 152) * TEXTDOMAIN variable, BEGIN pattern and: Programmer i18n. (line 60) * TEXTDOMAIN variable, portability and: I18N Portability. (line 20) * textdomain() function (C library): Explaining gettext. (line 28) @@ -34326,11 +34503,11 @@ Index * timestamps, converting dates to: Time Functions. (line 76) * timestamps, formatted: Getlocaltime Function. (line 6) -* tolower: String Functions. (line 523) -* toupper: String Functions. (line 529) +* tolower: String Functions. (line 524) +* toupper: String Functions. (line 530) * tr utility: Translate Program. (line 6) * trace debugger command: Miscellaneous Debugger Commands. - (line 108) + (line 107) * traceback, display in debugger: Execution Stack. (line 13) * translate string: I18N Functions. (line 22) * translate.awk program: Translate Program. (line 55) @@ -34346,14 +34523,14 @@ Index (line 22) * troubleshooting, fatal errors, printf format strings: Format Modifiers. (line 158) -* troubleshooting, fflush() function: I/O Functions. (line 62) +* troubleshooting, fflush() function: I/O Functions. (line 63) * troubleshooting, function call syntax: Function Calls. (line 30) * troubleshooting, gawk: Compatibility Mode. (line 6) * troubleshooting, gawk, bug reports: Bugs. (line 9) * troubleshooting, gawk, fatal errors, function arguments: Calling Built-in. (line 16) * troubleshooting, getline function: File Checking. (line 25) -* troubleshooting, gsub()/sub() functions: String Functions. 
(line 472) +* troubleshooting, gsub()/sub() functions: String Functions. (line 473) * troubleshooting, match() function: String Functions. (line 292) * troubleshooting, print statement, omitting commas: Print Examples. (line 31) @@ -34361,10 +34538,10 @@ Index * troubleshooting, quotes with file names: Special FD. (line 62) * troubleshooting, readable data files: File Checking. (line 6) * troubleshooting, regexp constants vs. string constants: Computed Regexps. - (line 39) + (line 40) * troubleshooting, string concatenation: Concatenation. (line 26) -* troubleshooting, substr() function: String Functions. (line 499) -* troubleshooting, system() function: I/O Functions. (line 128) +* troubleshooting, substr() function: String Functions. (line 500) +* troubleshooting, system() function: I/O Functions. (line 129) * troubleshooting, typographical errors, global variables: Options. (line 98) * true, logical: Truth Values. (line 6) @@ -34387,14 +34564,14 @@ Index * undisplay debugger command: Viewing And Changing Data. (line 80) * undocumented features: Undocumented. (line 6) -* Unicode <1>: Glossary. (line 133) +* Unicode <1>: Glossary. (line 197) * Unicode <2>: Ranges and Locales. (line 61) * Unicode: Ordinal Functions. (line 45) * uninitialized variables, as array subscripts: Uninitialized Subscripts. (line 6) * uniq utility: Uniq Program. (line 6) * uniq.awk program: Uniq Program. (line 65) -* Unix: Glossary. (line 611) +* Unix: Glossary. (line 753) * Unix awk, backslashes in escape sequences: Escape Sequences. (line 120) * Unix awk, close() function and: Close Files And Pipes. @@ -34443,7 +34620,7 @@ Index * variables, predefined conveying information: Auto-set. (line 6) * variables, private: Library Names. (line 11) * variables, setting: Options. (line 32) -* variables, shadowing: Definition Syntax. (line 71) +* variables, shadowing: Definition Syntax. (line 76) * variables, types of: Assignment Ops. 
(line 40) * variables, types of, comparison expressions and: Typing and Comparison. (line 9) @@ -34544,560 +34721,560 @@ Index Tag Table: Node: Top1204 Node: Foreword342291 -Node: Foreword446733 -Node: Preface48255 -Ref: Preface-Footnote-151126 -Ref: Preface-Footnote-251233 -Ref: Preface-Footnote-351466 -Node: History51608 -Node: Names53954 -Ref: Names-Footnote-155048 -Node: This Manual55194 -Ref: This Manual-Footnote-161681 -Node: Conventions61781 -Node: Manual History64119 -Ref: Manual History-Footnote-167101 -Ref: Manual History-Footnote-267142 -Node: How To Contribute67216 -Node: Acknowledgments68345 -Node: Getting Started73150 -Node: Running gawk75583 -Node: One-shot76773 -Node: Read Terminal78021 -Node: Long80048 -Node: Executable Scripts81564 -Ref: Executable Scripts-Footnote-184353 -Node: Comments84456 -Node: Quoting86938 -Node: DOS Quoting92462 -Node: Sample Data Files93137 -Node: Very Simple95732 -Node: Two Rules100630 -Node: More Complex102516 -Node: Statements/Lines105378 -Ref: Statements/Lines-Footnote-1109833 -Node: Other Features110098 -Node: When111029 -Ref: When-Footnote-1112783 -Node: Intro Summary112848 -Node: Invoking Gawk113731 -Node: Command Line115245 -Node: Options116043 -Ref: Options-Footnote-1131847 -Ref: Options-Footnote-2132076 -Node: Other Arguments132101 -Node: Naming Standard Input135049 -Node: Environment Variables136142 -Node: AWKPATH Variable136700 -Ref: AWKPATH Variable-Footnote-1140113 -Ref: AWKPATH Variable-Footnote-2140158 -Node: AWKLIBPATH Variable140418 -Node: Other Environment Variables141674 -Node: Exit Status145162 -Node: Include Files145838 -Node: Loading Shared Libraries149435 -Node: Obsolete150862 -Node: Undocumented151559 -Node: Invoking Summary151826 -Node: Regexp153490 -Node: Regexp Usage154944 -Node: Escape Sequences156981 -Node: Regexp Operators163211 -Ref: Regexp Operators-Footnote-1170637 -Ref: Regexp Operators-Footnote-2170784 -Node: Bracket Expressions170882 -Ref: table-char-classes172897 -Node: Leftmost 
Longest175821 -Node: Computed Regexps177123 -Node: GNU Regexp Operators180520 -Node: Case-sensitivity184193 -Ref: Case-sensitivity-Footnote-1187078 -Ref: Case-sensitivity-Footnote-2187313 -Node: Regexp Summary187421 -Node: Reading Files188888 -Node: Records190982 -Node: awk split records191715 -Node: gawk split records196630 -Ref: gawk split records-Footnote-1201174 -Node: Fields201211 -Ref: Fields-Footnote-1203987 -Node: Nonconstant Fields204073 -Ref: Nonconstant Fields-Footnote-1206316 -Node: Changing Fields206520 -Node: Field Separators212449 -Node: Default Field Splitting215154 -Node: Regexp Field Splitting216271 -Node: Single Character Fields219621 -Node: Command Line Field Separator220680 -Node: Full Line Fields223892 -Ref: Full Line Fields-Footnote-1225409 -Ref: Full Line Fields-Footnote-2225455 -Node: Field Splitting Summary225556 -Node: Constant Size227630 -Node: Splitting By Content232219 -Ref: Splitting By Content-Footnote-1236213 -Node: Multiple Line236376 -Ref: Multiple Line-Footnote-1242262 -Node: Getline242441 -Node: Plain Getline244653 -Node: Getline/Variable247293 -Node: Getline/File248441 -Node: Getline/Variable/File249825 -Ref: Getline/Variable/File-Footnote-1251428 -Node: Getline/Pipe251515 -Node: Getline/Variable/Pipe254198 -Node: Getline/Coprocess255329 -Node: Getline/Variable/Coprocess256581 -Node: Getline Notes257320 -Node: Getline Summary260112 -Ref: table-getline-variants260524 -Node: Read Timeout261353 -Ref: Read Timeout-Footnote-1265178 -Node: Command-line directories265236 -Node: Input Summary266141 -Node: Input Exercises269442 -Node: Printing270170 -Node: Print272005 -Node: Print Examples273462 -Node: Output Separators276241 -Node: OFMT278259 -Node: Printf279613 -Node: Basic Printf280398 -Node: Control Letters281968 -Node: Format Modifiers285951 -Node: Printf Examples291960 -Node: Redirection294446 -Node: Special FD301287 -Ref: Special FD-Footnote-1304447 -Node: Special Files304521 -Node: Other Inherited Files305138 -Node: Special 
Network306138 -Node: Special Caveats307000 -Node: Close Files And Pipes307951 -Ref: Close Files And Pipes-Footnote-1315127 -Ref: Close Files And Pipes-Footnote-2315275 -Node: Nonfatal315425 -Node: Output Summary317348 -Node: Output Exercises318569 -Node: Expressions319249 -Node: Values320434 -Node: Constants321112 -Node: Scalar Constants321803 -Ref: Scalar Constants-Footnote-1322662 -Node: Nondecimal-numbers322912 -Node: Regexp Constants325930 -Node: Using Constant Regexps326455 -Node: Variables329598 -Node: Using Variables330253 -Node: Assignment Options332164 -Node: Conversion334039 -Node: Strings And Numbers334563 -Ref: Strings And Numbers-Footnote-1337628 -Node: Locale influences conversions337737 -Ref: table-locale-affects340484 -Node: All Operators341072 -Node: Arithmetic Ops341702 -Node: Concatenation344207 -Ref: Concatenation-Footnote-1347026 -Node: Assignment Ops347132 -Ref: table-assign-ops352111 -Node: Increment Ops353383 -Node: Truth Values and Conditions356821 -Node: Truth Values357906 -Node: Typing and Comparison358955 -Node: Variable Typing359765 -Node: Comparison Operators363418 -Ref: table-relational-ops363828 -Node: POSIX String Comparison367323 -Ref: POSIX String Comparison-Footnote-1368395 -Node: Boolean Ops368533 -Ref: Boolean Ops-Footnote-1373012 -Node: Conditional Exp373103 -Node: Function Calls374830 -Node: Precedence378710 -Node: Locales382371 -Node: Expressions Summary384003 -Node: Patterns and Actions386563 -Node: Pattern Overview387683 -Node: Regexp Patterns389362 -Node: Expression Patterns389905 -Node: Ranges393615 -Node: BEGIN/END396721 -Node: Using BEGIN/END397482 -Ref: Using BEGIN/END-Footnote-1400216 -Node: I/O And BEGIN/END400322 -Node: BEGINFILE/ENDFILE402636 -Node: Empty405537 -Node: Using Shell Variables405854 -Node: Action Overview408127 -Node: Statements410453 -Node: If Statement412301 -Node: While Statement413796 -Node: Do Statement415825 -Node: For Statement416969 -Node: Switch Statement420126 -Node: Break Statement422508 
-Node: Continue Statement424549 -Node: Next Statement426376 -Node: Nextfile Statement428757 -Node: Exit Statement431387 -Node: Built-in Variables433790 -Node: User-modified434923 -Ref: User-modified-Footnote-1442604 -Node: Auto-set442666 -Ref: Auto-set-Footnote-1456358 -Ref: Auto-set-Footnote-2456563 -Node: ARGC and ARGV456619 -Node: Pattern Action Summary460837 -Node: Arrays463264 -Node: Array Basics464593 -Node: Array Intro465437 -Ref: figure-array-elements467401 -Ref: Array Intro-Footnote-1469927 -Node: Reference to Elements470055 -Node: Assigning Elements472507 -Node: Array Example472998 -Node: Scanning an Array474756 -Node: Controlling Scanning477772 -Ref: Controlling Scanning-Footnote-1482968 -Node: Numeric Array Subscripts483284 -Node: Uninitialized Subscripts485469 -Node: Delete487086 -Ref: Delete-Footnote-1489829 -Node: Multidimensional489886 -Node: Multiscanning492983 -Node: Arrays of Arrays494572 -Node: Arrays Summary499331 -Node: Functions501423 -Node: Built-in502322 -Node: Calling Built-in503400 -Node: Numeric Functions505391 -Ref: Numeric Functions-Footnote-1510210 -Ref: Numeric Functions-Footnote-2510567 -Ref: Numeric Functions-Footnote-3510615 -Node: String Functions510887 -Ref: String Functions-Footnote-1534362 -Ref: String Functions-Footnote-2534491 -Ref: String Functions-Footnote-3534739 -Node: Gory Details534826 -Ref: table-sub-escapes536607 -Ref: table-sub-proposed538127 -Ref: table-posix-sub539491 -Ref: table-gensub-escapes541027 -Ref: Gory Details-Footnote-1541859 -Node: I/O Functions542010 -Ref: I/O Functions-Footnote-1549228 -Node: Time Functions549375 -Ref: Time Functions-Footnote-1559863 -Ref: Time Functions-Footnote-2559931 -Ref: Time Functions-Footnote-3560089 -Ref: Time Functions-Footnote-4560200 -Ref: Time Functions-Footnote-5560312 -Ref: Time Functions-Footnote-6560539 -Node: Bitwise Functions560805 -Ref: table-bitwise-ops561367 -Ref: Bitwise Functions-Footnote-1565676 -Node: Type Functions565845 -Node: I18N Functions566996 -Node: 
User-defined568641 -Node: Definition Syntax569446 -Ref: Definition Syntax-Footnote-1574853 -Node: Function Example574924 -Ref: Function Example-Footnote-1577843 -Node: Function Caveats577865 -Node: Calling A Function578383 -Node: Variable Scope579341 -Node: Pass By Value/Reference582329 -Node: Return Statement585824 -Node: Dynamic Typing588805 -Node: Indirect Calls589734 -Ref: Indirect Calls-Footnote-1601036 -Node: Functions Summary601164 -Node: Library Functions603866 -Ref: Library Functions-Footnote-1607475 -Ref: Library Functions-Footnote-2607618 -Node: Library Names607789 -Ref: Library Names-Footnote-1611243 -Ref: Library Names-Footnote-2611466 -Node: General Functions611552 -Node: Strtonum Function612655 -Node: Assert Function615677 -Node: Round Function619001 -Node: Cliff Random Function620542 -Node: Ordinal Functions621558 -Ref: Ordinal Functions-Footnote-1624621 -Ref: Ordinal Functions-Footnote-2624873 -Node: Join Function625084 -Ref: Join Function-Footnote-1626853 -Node: Getlocaltime Function627053 -Node: Readfile Function630797 -Node: Shell Quoting632767 -Node: Data File Management634168 -Node: Filetrans Function634800 -Node: Rewind Function638856 -Node: File Checking640243 -Ref: File Checking-Footnote-1641575 -Node: Empty Files641776 -Node: Ignoring Assigns643755 -Node: Getopt Function645306 -Ref: Getopt Function-Footnote-1656768 -Node: Passwd Functions656968 -Ref: Passwd Functions-Footnote-1665805 -Node: Group Functions665893 -Ref: Group Functions-Footnote-1673787 -Node: Walking Arrays674000 -Node: Library Functions Summary675603 -Node: Library Exercises677004 -Node: Sample Programs678284 -Node: Running Examples679054 -Node: Clones679782 -Node: Cut Program681006 -Node: Egrep Program690725 -Ref: Egrep Program-Footnote-1698223 -Node: Id Program698333 -Node: Split Program701978 -Ref: Split Program-Footnote-1705426 -Node: Tee Program705554 -Node: Uniq Program708343 -Node: Wc Program715762 -Ref: Wc Program-Footnote-1720012 -Node: Miscellaneous Programs720106 
-Node: Dupword Program721319 -Node: Alarm Program723350 -Node: Translate Program728154 -Ref: Translate Program-Footnote-1732719 -Node: Labels Program732989 -Ref: Labels Program-Footnote-1736340 -Node: Word Sorting736424 -Node: History Sorting740495 -Node: Extract Program742331 -Node: Simple Sed749856 -Node: Igawk Program752924 -Ref: Igawk Program-Footnote-1767248 -Ref: Igawk Program-Footnote-2767449 -Ref: Igawk Program-Footnote-3767571 -Node: Anagram Program767686 -Node: Signature Program770743 -Node: Programs Summary771990 -Node: Programs Exercises773183 -Ref: Programs Exercises-Footnote-1777314 -Node: Advanced Features777405 -Node: Nondecimal Data779353 -Node: Array Sorting780943 -Node: Controlling Array Traversal781640 -Ref: Controlling Array Traversal-Footnote-1789973 -Node: Array Sorting Functions790091 -Ref: Array Sorting Functions-Footnote-1793980 -Node: Two-way I/O794176 -Ref: Two-way I/O-Footnote-1799121 -Ref: Two-way I/O-Footnote-2799307 -Node: TCP/IP Networking799389 -Node: Profiling802262 -Node: Advanced Features Summary810539 -Node: Internationalization812472 -Node: I18N and L10N813952 -Node: Explaining gettext814638 -Ref: Explaining gettext-Footnote-1819663 -Ref: Explaining gettext-Footnote-2819847 -Node: Programmer i18n820012 -Ref: Programmer i18n-Footnote-1824878 -Node: Translator i18n824927 -Node: String Extraction825721 -Ref: String Extraction-Footnote-1826852 -Node: Printf Ordering826938 -Ref: Printf Ordering-Footnote-1829724 -Node: I18N Portability829788 -Ref: I18N Portability-Footnote-1832243 -Node: I18N Example832306 -Ref: I18N Example-Footnote-1835109 -Node: Gawk I18N835181 -Node: I18N Summary835819 -Node: Debugger837158 -Node: Debugging838180 -Node: Debugging Concepts838621 -Node: Debugging Terms840474 -Node: Awk Debugging843046 -Node: Sample Debugging Session843940 -Node: Debugger Invocation844460 -Node: Finding The Bug845844 -Node: List of Debugger Commands852319 -Node: Breakpoint Control853652 -Node: Debugger Execution Control857348 
-Node: Viewing And Changing Data860712 -Node: Execution Stack864090 -Node: Debugger Info865727 -Node: Miscellaneous Debugger Commands869744 -Node: Readline Support874773 -Node: Limitations875665 -Node: Debugging Summary877779 -Node: Arbitrary Precision Arithmetic878947 -Node: Computer Arithmetic880363 -Ref: table-numeric-ranges883961 -Ref: Computer Arithmetic-Footnote-1884820 -Node: Math Definitions884877 -Ref: table-ieee-formats888165 -Ref: Math Definitions-Footnote-1888769 -Node: MPFR features888874 -Node: FP Math Caution890545 -Ref: FP Math Caution-Footnote-1891595 -Node: Inexactness of computations891964 -Node: Inexact representation892923 -Node: Comparing FP Values894280 -Node: Errors accumulate895362 -Node: Getting Accuracy896795 -Node: Try To Round899457 -Node: Setting precision900356 -Ref: table-predefined-precision-strings901040 -Node: Setting the rounding mode902829 -Ref: table-gawk-rounding-modes903193 -Ref: Setting the rounding mode-Footnote-1906648 -Node: Arbitrary Precision Integers906827 -Ref: Arbitrary Precision Integers-Footnote-1911726 -Node: POSIX Floating Point Problems911875 -Ref: POSIX Floating Point Problems-Footnote-1915748 -Node: Floating point summary915786 -Node: Dynamic Extensions917980 -Node: Extension Intro919532 -Node: Plugin License920798 -Node: Extension Mechanism Outline921595 -Ref: figure-load-extension922023 -Ref: figure-register-new-function923503 -Ref: figure-call-new-function924507 -Node: Extension API Description926493 -Node: Extension API Functions Introduction927943 -Node: General Data Types932767 -Ref: General Data Types-Footnote-1938506 -Node: Memory Allocation Functions938805 -Ref: Memory Allocation Functions-Footnote-1941644 -Node: Constructor Functions941740 -Node: Registration Functions943474 -Node: Extension Functions944159 -Node: Exit Callback Functions946456 -Node: Extension Version String947704 -Node: Input Parsers948369 -Node: Output Wrappers958248 -Node: Two-way processors962763 -Node: Printing Messages964967 
-Ref: Printing Messages-Footnote-1966043 -Node: Updating `ERRNO'966195 -Node: Requesting Values966935 -Ref: table-value-types-returned967663 -Node: Accessing Parameters968620 -Node: Symbol Table Access969851 -Node: Symbol table by name970365 -Node: Symbol table by cookie972346 -Ref: Symbol table by cookie-Footnote-1976490 -Node: Cached values976553 -Ref: Cached values-Footnote-1980052 -Node: Array Manipulation980143 -Ref: Array Manipulation-Footnote-1981241 -Node: Array Data Types981278 -Ref: Array Data Types-Footnote-1983933 -Node: Array Functions984025 -Node: Flattening Arrays987879 -Node: Creating Arrays994771 -Node: Extension API Variables999542 -Node: Extension Versioning1000178 -Node: Extension API Informational Variables1002079 -Node: Extension API Boilerplate1003144 -Node: Finding Extensions1006953 -Node: Extension Example1007513 -Node: Internal File Description1008285 -Node: Internal File Ops1012352 -Ref: Internal File Ops-Footnote-11024022 -Node: Using Internal File Ops1024162 -Ref: Using Internal File Ops-Footnote-11026545 -Node: Extension Samples1026818 -Node: Extension Sample File Functions1028344 -Node: Extension Sample Fnmatch1035982 -Node: Extension Sample Fork1037473 -Node: Extension Sample Inplace1038688 -Node: Extension Sample Ord1040363 -Node: Extension Sample Readdir1041199 -Ref: table-readdir-file-types1042075 -Node: Extension Sample Revout1042886 -Node: Extension Sample Rev2way1043476 -Node: Extension Sample Read write array1044216 -Node: Extension Sample Readfile1046156 -Node: Extension Sample Time1047251 -Node: Extension Sample API Tests1048600 -Node: gawkextlib1049091 -Node: Extension summary1051749 -Node: Extension Exercises1055438 -Node: Language History1056160 -Node: V7/SVR3.11057816 -Node: SVR41059997 -Node: POSIX1061442 -Node: BTL1062831 -Node: POSIX/GNU1063565 -Node: Feature History1069446 -Node: Common Extensions1083240 -Node: Ranges and Locales1084564 -Ref: Ranges and Locales-Footnote-11089182 -Ref: Ranges and 
Locales-Footnote-21089209 -Ref: Ranges and Locales-Footnote-31089443 -Node: Contributors1089664 -Node: History summary1095205 -Node: Installation1096575 -Node: Gawk Distribution1097521 -Node: Getting1098005 -Node: Extracting1098828 -Node: Distribution contents1100463 -Node: Unix Installation1106528 -Node: Quick Installation1107211 -Node: Shell Startup Files1109622 -Node: Additional Configuration Options1110701 -Node: Configuration Philosophy1112440 -Node: Non-Unix Installation1114809 -Node: PC Installation1115267 -Node: PC Binary Installation1116586 -Node: PC Compiling1118434 -Ref: PC Compiling-Footnote-11121455 -Node: PC Testing1121564 -Node: PC Using1122740 -Node: Cygwin1126855 -Node: MSYS1127678 -Node: VMS Installation1128178 -Node: VMS Compilation1128970 -Ref: VMS Compilation-Footnote-11130192 -Node: VMS Dynamic Extensions1130250 -Node: VMS Installation Details1131934 -Node: VMS Running1134186 -Node: VMS GNV1137022 -Node: VMS Old Gawk1137756 -Node: Bugs1138226 -Node: Other Versions1142109 -Node: Installation summary1148537 -Node: Notes1149593 -Node: Compatibility Mode1150458 -Node: Additions1151240 -Node: Accessing The Source1152165 -Node: Adding Code1153601 -Node: New Ports1159766 -Node: Derived Files1164248 -Ref: Derived Files-Footnote-11169723 -Ref: Derived Files-Footnote-21169757 -Ref: Derived Files-Footnote-31170353 -Node: Future Extensions1170467 -Node: Implementation Limitations1171073 -Node: Extension Design1172321 -Node: Old Extension Problems1173475 -Ref: Old Extension Problems-Footnote-11174992 -Node: Extension New Mechanism Goals1175049 -Ref: Extension New Mechanism Goals-Footnote-11178409 -Node: Extension Other Design Decisions1178598 -Node: Extension Future Growth1180706 -Node: Old Extension Mechanism1181542 -Node: Notes summary1183304 -Node: Basic Concepts1184490 -Node: Basic High Level1185171 -Ref: figure-general-flow1185443 -Ref: figure-process-flow1186042 -Ref: Basic High Level-Footnote-11189271 -Node: Basic Data Typing1189456 -Node: 
Glossary1192784 -Node: Copying1217942 -Node: GNU Free Documentation License1255498 -Node: Index1280634 +Node: Foreword446735 +Node: Preface48266 +Ref: Preface-Footnote-151137 +Ref: Preface-Footnote-251244 +Ref: Preface-Footnote-351477 +Node: History51619 +Node: Names53970 +Ref: Names-Footnote-155063 +Node: This Manual55209 +Ref: This Manual-Footnote-161709 +Node: Conventions61809 +Node: Manual History64146 +Ref: Manual History-Footnote-167139 +Ref: Manual History-Footnote-267180 +Node: How To Contribute67254 +Node: Acknowledgments68383 +Node: Getting Started73200 +Node: Running gawk75639 +Node: One-shot76829 +Node: Read Terminal78093 +Node: Long80124 +Node: Executable Scripts81637 +Ref: Executable Scripts-Footnote-184426 +Node: Comments84529 +Node: Quoting87011 +Node: DOS Quoting92529 +Node: Sample Data Files93204 +Node: Very Simple95799 +Node: Two Rules100698 +Node: More Complex102584 +Node: Statements/Lines105446 +Ref: Statements/Lines-Footnote-1109901 +Node: Other Features110166 +Node: When111102 +Ref: When-Footnote-1112856 +Node: Intro Summary112921 +Node: Invoking Gawk113805 +Node: Command Line115319 +Node: Options116117 +Ref: Options-Footnote-1131912 +Ref: Options-Footnote-2132141 +Node: Other Arguments132166 +Node: Naming Standard Input135114 +Node: Environment Variables136207 +Node: AWKPATH Variable136765 +Ref: AWKPATH Variable-Footnote-1140172 +Ref: AWKPATH Variable-Footnote-2140217 +Node: AWKLIBPATH Variable140477 +Node: Other Environment Variables141733 +Node: Exit Status145251 +Node: Include Files145927 +Node: Loading Shared Libraries149516 +Node: Obsolete150943 +Node: Undocumented151635 +Node: Invoking Summary151902 +Node: Regexp153565 +Node: Regexp Usage155019 +Node: Escape Sequences157056 +Node: Regexp Operators163285 +Ref: Regexp Operators-Footnote-1170695 +Ref: Regexp Operators-Footnote-2170842 +Node: Bracket Expressions170940 +Ref: table-char-classes172955 +Node: Leftmost Longest175897 +Node: Computed Regexps177199 +Node: GNU Regexp 
Operators180628 +Node: Case-sensitivity184300 +Ref: Case-sensitivity-Footnote-1187185 +Ref: Case-sensitivity-Footnote-2187420 +Node: Regexp Summary187528 +Node: Reading Files188995 +Node: Records191088 +Node: awk split records191821 +Node: gawk split records196750 +Ref: gawk split records-Footnote-1201289 +Node: Fields201326 +Ref: Fields-Footnote-1204104 +Node: Nonconstant Fields204190 +Ref: Nonconstant Fields-Footnote-1206428 +Node: Changing Fields206631 +Node: Field Separators212562 +Node: Default Field Splitting215266 +Node: Regexp Field Splitting216383 +Node: Single Character Fields219733 +Node: Command Line Field Separator220792 +Node: Full Line Fields224009 +Ref: Full Line Fields-Footnote-1225530 +Ref: Full Line Fields-Footnote-2225576 +Node: Field Splitting Summary225677 +Node: Constant Size227751 +Node: Splitting By Content232334 +Ref: Splitting By Content-Footnote-1236299 +Node: Multiple Line236462 +Ref: Multiple Line-Footnote-1242343 +Node: Getline242522 +Node: Plain Getline244729 +Node: Getline/Variable247369 +Node: Getline/File248518 +Node: Getline/Variable/File249903 +Ref: Getline/Variable/File-Footnote-1251506 +Node: Getline/Pipe251593 +Node: Getline/Variable/Pipe254271 +Node: Getline/Coprocess255402 +Node: Getline/Variable/Coprocess256666 +Node: Getline Notes257405 +Node: Getline Summary260199 +Ref: table-getline-variants260611 +Node: Read Timeout261440 +Ref: Read Timeout-Footnote-1265277 +Node: Command-line directories265335 +Node: Input Summary266240 +Node: Input Exercises269625 +Node: Printing270353 +Node: Print272188 +Node: Print Examples273645 +Node: Output Separators276424 +Node: OFMT278442 +Node: Printf279797 +Node: Basic Printf280582 +Node: Control Letters282154 +Node: Format Modifiers286139 +Node: Printf Examples292149 +Node: Redirection294635 +Node: Special FD301473 +Ref: Special FD-Footnote-1304639 +Node: Special Files304713 +Node: Other Inherited Files305330 +Node: Special Network306330 +Node: Special Caveats307192 +Node: Close Files And 
Pipes308141 +Ref: Close Files And Pipes-Footnote-1315326 +Ref: Close Files And Pipes-Footnote-2315474 +Node: Nonfatal315624 +Node: Output Summary317547 +Node: Output Exercises318768 +Node: Expressions319448 +Node: Values320637 +Node: Constants321314 +Node: Scalar Constants322005 +Ref: Scalar Constants-Footnote-1322867 +Node: Nondecimal-numbers323117 +Node: Regexp Constants326127 +Node: Using Constant Regexps326653 +Node: Variables329816 +Node: Using Variables330473 +Node: Assignment Options332384 +Node: Conversion334259 +Node: Strings And Numbers334783 +Ref: Strings And Numbers-Footnote-1337848 +Node: Locale influences conversions337957 +Ref: table-locale-affects340703 +Node: All Operators341295 +Node: Arithmetic Ops341924 +Node: Concatenation344429 +Ref: Concatenation-Footnote-1347248 +Node: Assignment Ops347355 +Ref: table-assign-ops352334 +Node: Increment Ops353644 +Node: Truth Values and Conditions357075 +Node: Truth Values358158 +Node: Typing and Comparison359207 +Node: Variable Typing360023 +Node: Comparison Operators363690 +Ref: table-relational-ops364100 +Node: POSIX String Comparison367595 +Ref: POSIX String Comparison-Footnote-1368667 +Node: Boolean Ops368806 +Ref: Boolean Ops-Footnote-1373284 +Node: Conditional Exp373375 +Node: Function Calls375113 +Node: Precedence378993 +Node: Locales382653 +Node: Expressions Summary384285 +Node: Patterns and Actions386856 +Node: Pattern Overview387976 +Node: Regexp Patterns389655 +Node: Expression Patterns390198 +Node: Ranges393907 +Node: BEGIN/END397014 +Node: Using BEGIN/END397775 +Ref: Using BEGIN/END-Footnote-1400511 +Node: I/O And BEGIN/END400617 +Node: BEGINFILE/ENDFILE402932 +Node: Empty405829 +Node: Using Shell Variables406146 +Node: Action Overview408419 +Node: Statements410745 +Node: If Statement412593 +Node: While Statement414088 +Node: Do Statement416116 +Node: For Statement417264 +Node: Switch Statement420422 +Node: Break Statement422804 +Node: Continue Statement424845 +Node: Next Statement426672 +Node: 
Nextfile Statement429053 +Node: Exit Statement431681 +Node: Built-in Variables434092 +Node: User-modified435225 +Ref: User-modified-Footnote-1442928 +Node: Auto-set442990 +Ref: Auto-set-Footnote-1456699 +Ref: Auto-set-Footnote-2456904 +Node: ARGC and ARGV456960 +Node: Pattern Action Summary461178 +Node: Arrays463611 +Node: Array Basics464940 +Node: Array Intro465784 +Ref: figure-array-elements467718 +Ref: Array Intro-Footnote-1470338 +Node: Reference to Elements470466 +Node: Assigning Elements472928 +Node: Array Example473419 +Node: Scanning an Array475178 +Node: Controlling Scanning478198 +Ref: Controlling Scanning-Footnote-1483592 +Node: Numeric Array Subscripts483908 +Node: Uninitialized Subscripts486093 +Node: Delete487710 +Ref: Delete-Footnote-1490459 +Node: Multidimensional490516 +Node: Multiscanning493613 +Node: Arrays of Arrays495202 +Node: Arrays Summary499956 +Node: Functions502047 +Node: Built-in503086 +Node: Calling Built-in504164 +Node: Numeric Functions506159 +Ref: Numeric Functions-Footnote-1510977 +Ref: Numeric Functions-Footnote-2511334 +Ref: Numeric Functions-Footnote-3511382 +Node: String Functions511654 +Ref: String Functions-Footnote-1535155 +Ref: String Functions-Footnote-2535284 +Ref: String Functions-Footnote-3535532 +Node: Gory Details535619 +Ref: table-sub-escapes537400 +Ref: table-sub-proposed538915 +Ref: table-posix-sub540277 +Ref: table-gensub-escapes541814 +Ref: Gory Details-Footnote-1542647 +Node: I/O Functions542798 +Ref: I/O Functions-Footnote-1550034 +Node: Time Functions550181 +Ref: Time Functions-Footnote-1560690 +Ref: Time Functions-Footnote-2560758 +Ref: Time Functions-Footnote-3560916 +Ref: Time Functions-Footnote-4561027 +Ref: Time Functions-Footnote-5561139 +Ref: Time Functions-Footnote-6561366 +Node: Bitwise Functions561632 +Ref: table-bitwise-ops562194 +Ref: Bitwise Functions-Footnote-1566522 +Node: Type Functions566694 +Node: I18N Functions567846 +Node: User-defined569493 +Node: Definition Syntax570298 +Ref: Definition 
Syntax-Footnote-1575957 +Node: Function Example576028 +Ref: Function Example-Footnote-1578949 +Node: Function Caveats578971 +Node: Calling A Function579489 +Node: Variable Scope580447 +Node: Pass By Value/Reference583440 +Node: Return Statement586937 +Node: Dynamic Typing589916 +Node: Indirect Calls590845 +Ref: Indirect Calls-Footnote-1602151 +Node: Functions Summary602279 +Node: Library Functions604981 +Ref: Library Functions-Footnote-1608589 +Ref: Library Functions-Footnote-2608732 +Node: Library Names608903 +Ref: Library Names-Footnote-1612361 +Ref: Library Names-Footnote-2612584 +Node: General Functions612670 +Node: Strtonum Function613773 +Node: Assert Function616795 +Node: Round Function620119 +Node: Cliff Random Function621660 +Node: Ordinal Functions622676 +Ref: Ordinal Functions-Footnote-1625739 +Ref: Ordinal Functions-Footnote-2625991 +Node: Join Function626202 +Ref: Join Function-Footnote-1627972 +Node: Getlocaltime Function628172 +Node: Readfile Function631916 +Node: Shell Quoting633888 +Node: Data File Management635289 +Node: Filetrans Function635921 +Node: Rewind Function640017 +Node: File Checking641403 +Ref: File Checking-Footnote-1642736 +Node: Empty Files642937 +Node: Ignoring Assigns644916 +Node: Getopt Function646466 +Ref: Getopt Function-Footnote-1657930 +Node: Passwd Functions658130 +Ref: Passwd Functions-Footnote-1666970 +Node: Group Functions667058 +Ref: Group Functions-Footnote-1674955 +Node: Walking Arrays675160 +Node: Library Functions Summary676760 +Node: Library Exercises678164 +Node: Sample Programs679444 +Node: Running Examples680214 +Node: Clones680942 +Node: Cut Program682166 +Node: Egrep Program691886 +Ref: Egrep Program-Footnote-1699389 +Node: Id Program699499 +Node: Split Program703175 +Ref: Split Program-Footnote-1706629 +Node: Tee Program706757 +Node: Uniq Program709546 +Node: Wc Program716965 +Ref: Wc Program-Footnote-1721215 +Node: Miscellaneous Programs721309 +Node: Dupword Program722522 +Node: Alarm Program724553 +Node: 
Translate Program729358 +Ref: Translate Program-Footnote-1733921 +Node: Labels Program734191 +Ref: Labels Program-Footnote-1737542 +Node: Word Sorting737626 +Node: History Sorting741696 +Node: Extract Program743531 +Node: Simple Sed751055 +Node: Igawk Program754125 +Ref: Igawk Program-Footnote-1768451 +Ref: Igawk Program-Footnote-2768652 +Ref: Igawk Program-Footnote-3768774 +Node: Anagram Program768889 +Node: Signature Program771950 +Node: Programs Summary773197 +Node: Programs Exercises774418 +Ref: Programs Exercises-Footnote-1778549 +Node: Advanced Features778640 +Node: Nondecimal Data780622 +Node: Array Sorting782212 +Node: Controlling Array Traversal782912 +Ref: Controlling Array Traversal-Footnote-1791278 +Node: Array Sorting Functions791396 +Ref: Array Sorting Functions-Footnote-1795282 +Node: Two-way I/O795478 +Ref: Two-way I/O-Footnote-1800423 +Ref: Two-way I/O-Footnote-2800609 +Node: TCP/IP Networking800691 +Node: Profiling803563 +Node: Advanced Features Summary811834 +Node: Internationalization813767 +Node: I18N and L10N815247 +Node: Explaining gettext815933 +Ref: Explaining gettext-Footnote-1820958 +Ref: Explaining gettext-Footnote-2821142 +Node: Programmer i18n821307 +Ref: Programmer i18n-Footnote-1826183 +Node: Translator i18n826232 +Node: String Extraction827026 +Ref: String Extraction-Footnote-1828157 +Node: Printf Ordering828243 +Ref: Printf Ordering-Footnote-1831029 +Node: I18N Portability831093 +Ref: I18N Portability-Footnote-1833549 +Node: I18N Example833612 +Ref: I18N Example-Footnote-1836415 +Node: Gawk I18N836487 +Node: I18N Summary837131 +Node: Debugger838471 +Node: Debugging839493 +Node: Debugging Concepts839934 +Node: Debugging Terms841744 +Node: Awk Debugging844316 +Node: Sample Debugging Session845222 +Node: Debugger Invocation845756 +Node: Finding The Bug847141 +Node: List of Debugger Commands853620 +Node: Breakpoint Control854952 +Node: Debugger Execution Control858629 +Node: Viewing And Changing Data861988 +Node: Execution Stack865364 
+Node: Debugger Info866999 +Node: Miscellaneous Debugger Commands871044 +Node: Readline Support876045 +Node: Limitations876939 +Node: Debugging Summary879054 +Node: Arbitrary Precision Arithmetic880228 +Node: Computer Arithmetic881644 +Ref: table-numeric-ranges885242 +Ref: Computer Arithmetic-Footnote-1886101 +Node: Math Definitions886158 +Ref: table-ieee-formats889446 +Ref: Math Definitions-Footnote-1890050 +Node: MPFR features890155 +Node: FP Math Caution891826 +Ref: FP Math Caution-Footnote-1892876 +Node: Inexactness of computations893245 +Node: Inexact representation894204 +Node: Comparing FP Values895561 +Node: Errors accumulate896643 +Node: Getting Accuracy898076 +Node: Try To Round900738 +Node: Setting precision901637 +Ref: table-predefined-precision-strings902321 +Node: Setting the rounding mode904110 +Ref: table-gawk-rounding-modes904474 +Ref: Setting the rounding mode-Footnote-1907929 +Node: Arbitrary Precision Integers908108 +Ref: Arbitrary Precision Integers-Footnote-1913008 +Node: POSIX Floating Point Problems913157 +Ref: POSIX Floating Point Problems-Footnote-1917030 +Node: Floating point summary917068 +Node: Dynamic Extensions919262 +Node: Extension Intro920814 +Node: Plugin License922080 +Node: Extension Mechanism Outline922877 +Ref: figure-load-extension923305 +Ref: figure-register-new-function924785 +Ref: figure-call-new-function925789 +Node: Extension API Description927775 +Node: Extension API Functions Introduction929225 +Node: General Data Types934049 +Ref: General Data Types-Footnote-1939788 +Node: Memory Allocation Functions940087 +Ref: Memory Allocation Functions-Footnote-1942926 +Node: Constructor Functions943022 +Node: Registration Functions944756 +Node: Extension Functions945441 +Node: Exit Callback Functions947738 +Node: Extension Version String948986 +Node: Input Parsers949651 +Node: Output Wrappers959530 +Node: Two-way processors964045 +Node: Printing Messages966249 +Ref: Printing Messages-Footnote-1967325 +Node: Updating `ERRNO'967477 
+Node: Requesting Values968217 +Ref: table-value-types-returned968945 +Node: Accessing Parameters969902 +Node: Symbol Table Access971133 +Node: Symbol table by name971647 +Node: Symbol table by cookie973628 +Ref: Symbol table by cookie-Footnote-1977772 +Node: Cached values977835 +Ref: Cached values-Footnote-1981334 +Node: Array Manipulation981425 +Ref: Array Manipulation-Footnote-1982523 +Node: Array Data Types982560 +Ref: Array Data Types-Footnote-1985215 +Node: Array Functions985307 +Node: Flattening Arrays989161 +Node: Creating Arrays996053 +Node: Extension API Variables1000824 +Node: Extension Versioning1001460 +Node: Extension API Informational Variables1003361 +Node: Extension API Boilerplate1004426 +Node: Finding Extensions1008235 +Node: Extension Example1008795 +Node: Internal File Description1009567 +Node: Internal File Ops1013634 +Ref: Internal File Ops-Footnote-11025304 +Node: Using Internal File Ops1025444 +Ref: Using Internal File Ops-Footnote-11027827 +Node: Extension Samples1028100 +Node: Extension Sample File Functions1029626 +Node: Extension Sample Fnmatch1037264 +Node: Extension Sample Fork1038755 +Node: Extension Sample Inplace1039970 +Node: Extension Sample Ord1041645 +Node: Extension Sample Readdir1042481 +Ref: table-readdir-file-types1043357 +Node: Extension Sample Revout1044168 +Node: Extension Sample Rev2way1044758 +Node: Extension Sample Read write array1045498 +Node: Extension Sample Readfile1047438 +Node: Extension Sample Time1048533 +Node: Extension Sample API Tests1049882 +Node: gawkextlib1050373 +Node: Extension summary1053031 +Node: Extension Exercises1056720 +Node: Language History1057442 +Node: V7/SVR3.11059098 +Node: SVR41061279 +Node: POSIX1062724 +Node: BTL1064113 +Node: POSIX/GNU1064847 +Node: Feature History1070728 +Node: Common Extensions1084522 +Node: Ranges and Locales1085846 +Ref: Ranges and Locales-Footnote-11090464 +Ref: Ranges and Locales-Footnote-21090491 +Ref: Ranges and Locales-Footnote-31090725 +Node: 
Contributors1090946 +Node: History summary1096487 +Node: Installation1097857 +Node: Gawk Distribution1098803 +Node: Getting1099287 +Node: Extracting1100110 +Node: Distribution contents1101745 +Node: Unix Installation1107810 +Node: Quick Installation1108493 +Node: Shell Startup Files1110904 +Node: Additional Configuration Options1111983 +Node: Configuration Philosophy1113722 +Node: Non-Unix Installation1116091 +Node: PC Installation1116549 +Node: PC Binary Installation1117868 +Node: PC Compiling1119716 +Ref: PC Compiling-Footnote-11122737 +Node: PC Testing1122846 +Node: PC Using1124022 +Node: Cygwin1128137 +Node: MSYS1128960 +Node: VMS Installation1129460 +Node: VMS Compilation1130252 +Ref: VMS Compilation-Footnote-11131474 +Node: VMS Dynamic Extensions1131532 +Node: VMS Installation Details1133216 +Node: VMS Running1135468 +Node: VMS GNV1138304 +Node: VMS Old Gawk1139038 +Node: Bugs1139508 +Node: Other Versions1143391 +Node: Installation summary1149815 +Node: Notes1150871 +Node: Compatibility Mode1151736 +Node: Additions1152518 +Node: Accessing The Source1153443 +Node: Adding Code1154878 +Node: New Ports1161035 +Node: Derived Files1165517 +Ref: Derived Files-Footnote-11170992 +Ref: Derived Files-Footnote-21171026 +Ref: Derived Files-Footnote-31171622 +Node: Future Extensions1171736 +Node: Implementation Limitations1172342 +Node: Extension Design1173590 +Node: Old Extension Problems1174744 +Ref: Old Extension Problems-Footnote-11176261 +Node: Extension New Mechanism Goals1176318 +Ref: Extension New Mechanism Goals-Footnote-11179678 +Node: Extension Other Design Decisions1179867 +Node: Extension Future Growth1181975 +Node: Old Extension Mechanism1182811 +Node: Notes summary1184573 +Node: Basic Concepts1185759 +Node: Basic High Level1186440 +Ref: figure-general-flow1186712 +Ref: figure-process-flow1187311 +Ref: Basic High Level-Footnote-11190540 +Node: Basic Data Typing1190725 +Node: Glossary1194053 +Node: Copying1225982 +Node: GNU Free Documentation License1263538 
+Node: Index1288674 End Tag Table